Sep 30, 2010

Posting on ...

Lastly I've been posting links, status and thoughts on twitter, buzz and linkedin. It's an experiment, to try keep posting more often.

If you wish, you can follow me in twitter
http://twitter.com/gyakoo

google buzz
google profile

or linkedin
http://es.linkedin.com/in/gyakoo/

There are a lot of literature and articles about using script languages in games (lua, python, unrealscript, lisp, javascript, c# ...) or applications like 3dStudio Max, Maya, even FXComposer. In some projects I've bound the C++ code with Python or Lua (the most popular languages AFAIK), by using the language SDK or wrapping it around with boost::python or Luabind for advanced object oriented features.


To bind your application with those script languages, optimally and without memory leaks, is something which requires mid progamming skills about stack based virtual machine, reference counting objects, and sometimes a lot of C++ template stuff. Obviously, there're design issues about when, where and what is the script language used for. There's a great (and old) presentation by Tim Sweeney about their (Epic) thinking about how should be the next mainstream programming language. Slides talk about the importance to choice a well designed script language in next gen games.

Python and others languages are designed for generic purposes. Actually they're good enough for most of applications, but sometimes you need more specific features built-in in languages, like parallel and concurrency support or processes synchronization. Recently, languages like haskell, lisp or erlang are being considered because of their usefulness in concurrent environments, whether it is by functional programming or actor model paradigm.

We're working with a propietary MMO technology which uses its own script language. That language has specific keywords to synchronize client and server processes, to make remote procedure calls, to work with the server-centric data definition and objects' global states, to specify replication issues in fields and classes, and to merge classes in the way of component (behaviors) pattern, among other features (GUI, plugins, external functions, nodes ...). A language, in cases like that, which provides specific field application issues, increases productivity while technical risks are maintained low. It's a big advantage to count on it.

Several weeks ago, while I was working on my own project, at home, I decided to implement my own script language. That's a complex, titanic and unnecessary work, because is very difficult to program a better script language that existent ones, and I should to use them, but I kept thinking about it. Compiler subject has been always fascinating to me, and I started a project with some simple features in mind. Rapidly I figured out a lot of problems, specially those related with performance, but I've to say that I'm learning really useful things about compiler internals details, and to respect the work lua or python's developers have done.

At this point, I've implemented successfully a script language with an hybrid lua/python syntax and the following features:
- Stack based virtual machine
- Main language constructions (conditionals, loops, expressions, subroutines...)
- Object oriented
- Dynamic typing
- Native types: int, double, string, unicode, dictionary and list
- Binding with C++ functions
- Reference counting for objects

And currently I'm working on:
- Binding with C++ objects
- Concurrent paradigm
- Optimization
- Debugging issues

But the roadmap is very large. I'm applying extreme programming, with a little bit tech design, and doing refactoring every few steps. I'm very proud about simplicity of C++ code, actually I'm doing big efforts to achieve it, maintaining things as small and simple as possible. Also, there is not memory leaks, and binding with C++ is really simple, in fact is simpler than the others languages'. But that simplicity has a cost, performance. I'm not doing benchmarking, at least for the time being, but I suspect that optimization will be mandatory. Anyway I'm considering it while program, but without enter in paranoia mode.

I will post more technical details about how I did it in stages like lexer, parser, generating abstract syntax tree (AST), optimizing the tree, generating final stack VM code, code optimization and execution in virtual machine. Contexts/scopes are interesting to implement, and classes and objects too.

Below you've a code snippet:


Vector3 =
{
x = 0.0,
y = 0.0,
z = 0.0,

_init( _x, _y, _z )
{
this.x=_x;
this.y=_y;
this.z=_z;
}

add( o )
{
this.x = this.x + o.x;
this.y = this.y + o.y;
this.z = this.z + o.z;
}
};

suma(a,b)
{
return Vector3( a.x+b.x,a.y+b.y,a.z+b.z );
}

// -- Main code
v0 = Vector3(1,2,3);
v1 = Vector3(3,3,3);
c = suma(v0,v1);
v0.add( v1 );
if ( c.x == v0.x && c.y == v0.y && c.z == v0.z )
{
print( "Equals" );
}else
{
print( "Differents" );
}

As you can see, my approach is to use tables (dicts) like objects. This is used in Lua too, and other languages, but the difference here is I'm trying to do it in an easier way. The syntax is almost minimal, and you can invoke a table with the () operator, and when exists a constructor (I called it '_init') it is called to create a new object. 'print' is an external c++ function and you have others like 'std.execfile' or 'std.execstring', grouped.

I've created a powerful variant type which supports (via tagged union) reference objects, script code, native types, even C++ code, so, the C function 'print' is a variable in global context which is a CFunc, but you can pass it to a function, or you can overwrite it.


meta( func, param )
{
func( param );
}
doit( m )
{
m(print, 3);
}

doit( meta );


About the final stack based machine code, below you have the extensive code corresponding to the first snippet. Explaination in next posts:

[CODE]
[Func] $00001
load _x
push this
push x
dotlv
store
load _y
push this
push y
dotlv
store
load _z
push this
push z
dotlv
store


[Func] $00002
push this
push x
dot
push o
push x
dot
add
push this
push x
dotlv
store
push this
push y
dot
push o
push y
dot
add
push this
push y
dotlv
store
push this
push z
dot
push o
push z
dot
add
push this
push z
dotlv
store


[Func] suma
push a
push x
dot
push b
push x
dot
add
push a
push y
dot
push b
push y
dot
add
push a
push z
dot
push b
push z
dot
add
lst 3
call Vector3
ret


pushtable
load $00001
store _init
del $00001
load $00002
store add
del $00002
push 0.000000
store x
push 0.000000
store y
push 0.000000
store z
poptable
store Vector3
push 1
push 2
push 3
lst 3
call Vector3
store v0
push 3
push 3
push 3
lst 3
call Vector3
store v1
load v0
load v1
lst 2
call suma
store c
push v0
push add
dot
load v1
lst 1
call
push c
push x
dot
push v0
push x
dot
eq
push c
push y
dot
push v0
push y
dot
eq
and
push c
push z
dot
push v0
push z
dot
eq
and
jz 66
push Equals
lst 1
call print
jmp 69
push Differents
lst 1
call print

Sep 2, 2010

A long time ago...

Hi, I'm back, after about 1 year since I wrote the last post. A lot of things has happened in my life, personal and professional.

I left the company and moved to another city where I'm currently working on a massively multiplayer online game, mainly focused to learn the Spanish language, but it's almost a traditional MMORPG eventually. It's a big project with a considerable budget, to develop in several years. I'm really excited with it.

...and I got married last July, after ten years sharing my life with a great person, who understands all my freak passions :)

From now I will try post in this blog more regularly.

Exploiting of systems is a subject of which I'd like to know more about. I think it's good to understand how avoid future hacking, but really you neither find much information on web nor books related. I've found a book which maybe could help us to improve these skills. Topics in book include:

  • Program computers using C, assembly language, and shell scripts
  • Corrupt system memory to run arbitrary code using buffer overflows and format strings
  • Inspect processor registers and system memory with a debugger to gain a real understanding of what is happening
  • Outsmart common security measures like nonexecutable stacks and intrusion detection systems
  • Gain access to a remote server using port-binding or connect-back shellcode, and alter a server's logging behavior to hide your presence
  • Redirect network traffic, conceal open ports, and hijack TCP connections
  • Crack encrypted wireless traffic using the FMS attack, and speed up brute-force attacks using a password probability matrix


And related to that, in the video below, speakers Michael Stail and Felix Domke talk about The XBox 360 Security System and its Weaknesses. They first talk about all weaknesses in old Xbox I, and one by one they're going giving technical information about how they fixed them.

I'm really thinking that C vs C++ discussion will never end. I just read a Peter Seibel post about interviews he has written in his book Coders at Work, asking to fifteen top programmers and computer scientist around the world about what are their opinions in using C++ language.


Well, maybe you're going to be surprised since most of them think that C++ is a bad, too complex, useless language, and they prefer C when they need performance. That's what a couple of well considered programmers said.

Reading the post, I can see that they had that opinion mainly focusing in how the complex the C++ is. They said that sometimes programmers use only a subset of the language. They think that what you mostly need to do with an OO language, you best do it with some other like Java or Python, because they're better structured languages and things can be done in the same way by different programmers. They insist in the complexity of language and the different compiler implementations, which is a pain in compatibility issues.

Anyway, although I'm not by far as good programmer as they are (years of experience are talking here), I think they're old-school programmers, I mean, C++ has been changing along these years and already there exist another standard revision, as well as compilers do. In reference to the language complexity, all is about using it. If you're a Java experienced programmer and you want to program in C++ after a lot of years, it is obvious that C++ is very complex for you in the beginning. You'll get confused with OO artifacts (which is half-implemented in C++, it is true), or by the using of pointers and references.

C++ is a great language, as well as C and Java and others, depending on what you need to do. The google jam code contest is a good example about it. I've a good colleague that tried to resolve its exercises by using C++, but he said me that more rounds he passed more difficult the questions were, but difficult in the sense of he hadn't enough time to program the result algorithm, not by the algorithm's complexity itself. He was spending a lot of time with language (C++) to do simple tasks like word parsing, splitting a string ... I wonder if he had used python, I'm pretty sure he had obtained a better results.

A place where you can find a lot of C++ code is in the videogames industry, but anyway there is a common evolution there, the using of more high level scripting language, loosing a bit of performance but gaining flexibility and avoiding bugs. That evolution is focused also in tries to make an better script parallel language. And talking about parallelism, it is true that C++ might be a bit painful. That's because the lack of a memory model like Java has. But next language standard has done efforts in that field.

To summarize, my honest opinion about using C++ is that if you've got a couple of good programmers that understand well the language, and they're pragmatic, you could program a very good product (in the performance sense) by using C++, but even so there are parts of a system that is better program in another not-so-risky language. To make good use of the C++ you need good and experienced C++ programmers.

C++ is a dangerous language, like a knife is. But that has not made it so bad. All depend on who is using it and how he/she uses it. You don't let a child that plays with a knife, don't you?

A post from The Endeavour blog about a Linus Torval's response in a forum, makes me remember the ethernal dilemma about OO and Procedural methodologies flamewar, C++ vs C, STL vs C, Java vs C, well whatever vs C...

Linus:

C++ is a horrible language. It's made more horrible by the fact that a lot
of substandard programmers use it, to the point where it's much much
easier to generate total and utter crap with it. Quite frankly, even if
the choice of C were to do *nothing* but keep the C++ programmers out,
that in itself would be a huge reason to use C.
...
John:
...
Well, I’m nowhere near as talented a programmer as Linus Torvalds, but I totally disagree with him. If it’s easy to generate crap in a relatively high-level and type-safe language like C++, then it must be child’s play to generate crap in C. It’s not fair to compare world-class C programmers like Torvalds and his peers to average C++ programmers. Either compare the best with the best or compare the average with the average.
...


John's impressions are clever and looks like more polite, against Linus's outspoken attitude but completely honest, but anyway after read a lot of controversial comments and what Stroustup said about his C++, finally I think that why do we have to chose one of them? Are always things black or white? I neither agree nor disagree with extreme positions ... people like discussions!

What do you think about agile software development? Is it suitable for all kind of software? Maybe it isn't, but, are you sure that you can't apply principles of agile programming to some parts of your non-agile development process? Well, like almost everything in the life, it isn't a black or white response, but a gray one.


Whether you have the chance to apply it or not, learn more about this topic is very useful to improve your programmer or analyst toolboox, and always you can extract only the interesting part to integrate it in your daily process, I think. At least that was what we did for the last year, creating a big infrastructure from the scratch, with a quite few resources and always with short schedules for intermediate deliverables. Follow is almost like a project postmortem, mostly at management level, I hope you find it useful, that is this post's target.

The target
Currently, the result of our work is helping to the SQA department to do their work faster and more reliable, and the tools we've created are being used in the production's department to trace and debug the applications they create. The statistics we generate and store by executing automated testcases, are very important for the management process. We are consolidating the basis of how the different modules of production's department are shipping and integrating theirs products (at least we're trying to). Also, we've created an intranet where all the company can communicate their issues to SQA department, with a normalized procedure to do that.

It's an unfinished work, but for the things we got successfully, I'll try to expose here how we did it and what problems we found. How did we resolve them? What did we learn?

Team and work
Let's see, as you'll see in next paragraphs we needed to use an agile process, but the first and most important thing we didn't want was to spend a lot of time in the programmer training in terms like iteration, continuous process, pig/chicken roles, scrum master, features documents ... I assimilated all those concepts and then integrated them into our work pipeline, without give bored talks about agile to the team. We wanted to work and ship deliverables in a rapid pace, because we had short milestones and in the meantime we were developing tools, libraries, cluster, web, database and their integration, we wanted to improve certain numbers for the management. Well, really only a number. That number was the hardest part in the entire project. It was showing how many automated test cases we had, since it was used in the calculation of rate of return to justify our team work. Yes, everything must be viewed as a money problem in the end. But we can't increase that number without create a minimal technical infrastructure! It's a cyclical dependency!

So a work division was neccesary, due the limited number of developer we had. The division was accomplished in two ways: making sub-teams, one for tools, the other one for that magic number. And also, in certain moments, developers split their labor in two parts, programming some hours, and helping with the number the rest of the day. As you already know, developers are the most valuable asset in the company, so obviously we had to permute them to avoid their demotivation, and as result the entire team have a good cross-knowledge about the project. It is important. Nobody, individually speaking, is most important than other, only the global team and the inter-relationships are essentials.

Taking the control over the project
As you can read, targets were high enough being our resources so limited. We needed a process to handle that. More specifically an agile development process. I think here is where the project manager (in other cases the lead programmer doing management functions), have to do their work. By having in mind targets and times, as well as resources (how many programmers?) and their development (mixing junior and senior developers), you have to get the things running and giving us results.

Another of the things I wanted was to have the technical control about everything. It is an impossible issue, but you know that things can diverge. People have tendency to do the things the best they can, and although it isn't a bad action, when you're a lead developer you need to achieve a mix between fast and quality. Also as I said before, intermediate results were needed. So, we were updating a prioritized list of features every we wanted a new sub-product, feature, tools or procedure. Returning to the technical control, I meant that I always was getting feedback from developers. Helping them to solve any problem that can block them, and giving them advices about more technical parts that I have the experience about. Maybe it sounds a bit hard for people, but it isn't at all. You have to balance again between training, personal improvement and slight control, leaving to developers do their work, with their own ideas and code style, but continously retargeting them to the global team direction.

Anyway, I insist it can be viewed as a dictatorial technical leadership, but it is not at all. This it is because we had a lot of brainstorming meetings, and every decision and action was previously agreed with the sqa director and with the team as far as possible. In fact, I think you cannot make a really good decision entirely with your own. You need some feedback from your team and colleagues. Maybe your original decision will go as it, but by talking with people you can make some changes or to prepare some things for the future, and actually, they will be who develop that feature, so at least you need bear them in mind. What do you think? :)

Continuous meetings
To correctly apply agile principles, and to receive feedback continuously, we had daily meeting with developers working on a task. But I was clear here. Meetings was driven by something, with a clear target and no more than fifteen minutes long. These were tracking meetings, where the team talked about their progress and problems found, and I repeated targets, priorities and tasks for incoming labor day.

Exceptions were the brainstorming meetings, where we met for a period of an hour or more, talking about how to implement a feature for a new sub project, different approaches to a solution, how a specific technique could affect to other parts of architecture, ... Those meetings were very productive, and everyone can contribute with their own thoughts. Finally I gathered all that information and a solution came to us at the same time the team felt useful and made up relationships between them.

The place where we met was basically in the same place where we work. Trying not to disturb any co worker, we met around a white board where we made a note of everything we talked about. Again everyone can write down on the white board, and those notes were translated into tasks, a class diagram update, a feature, or a change in a function prototype in the emailing library.

But things aren't so easy as you think. People must to pay attention to meeting's driver, you need to avoid parallel discussions, and try to let only the topics around the main goal of the meeting, because fifteen minutes perhaps aren't enough, but it's what we have. Important things first.

Communication and continuous refinements
To perform a continuous refinement process obviously you need what I said before, a continuous feedback, and in one hand you have a continuous meetings with team and in other hand you've the control over what the developer is doing every moment. That's a lot of continuous stuff! So we reach a conclusion here: communication.

The success of any project, or at least a key factor, from my point of view, is the communication among team. Let me explain it. You always should to boost the communication between developers, and the simplest thing you could do is when a developer ask you for a doubt about a technical aspect or whathever, you already know the entire project and what is doing everyone in every moment, so you could forward him to talk with his team mate, depending on doubt he asked you. That is useful and in a short time you can see how people are talking amongst them when is needed. You're building a team. Doesn't matter if you have a team with a couple of gurus or über programmers if there not exists communication because they're almost autistics, or even worse, all of them are trying to impose their solutions. The synergy is the most important part in your team, and it is difficult to achieve: the know-how. I had luck to work with great and honest people there.

Once you have a communicative team, there are a lot of daily work to lead their efforts to build a product. As I said, in progress tracking meetings you can view how your list of tasks (or gantt chart) is being completed, letting you update priorities. But the real problem comes with unexpected events. What does happen when your director changes priorities? or even what if your director gets a developer out of the project? Well, there is no a perfect solution for these misfortunes. Here I have to say that sometimes some sub-projects gone frozen for a time while the entire team was changing the global targets, and in other moments you end up with only two developers when you had five. At least we had a agile team, which responses to changes very quickly, and all the developers understood this. Also at any moment, due the cross knowledge, a junior script writer can sit down to fix in c# a issue related with the debugger tool we were creating, for example, as well as a c++ senior developer can help with cluster maintenance at adminitration level, setting interfaces because he already knows how to do it. This also is a good thing to enrich the people.

But even so, there is a big work to break down complex requeriments into simpler tasks, to achieve rapid results in a week. That is what I did with the building of Studio Application, a complex .net desktop application which is used to generate automated testcases, integrating a lot of processes, techniques and languages. The first step I did was to analyze the requeriments, making a high level list of features needed. After that, a blocks diagram was created, following a class diagram. Once an overall classes and sequences diagrams were created, it was easier to distribute tasks to people, bearing in mind a milestone for a week. In a week we gone through the results and we had a minimal functional application. The remaining work was refinements, week after week, until we reach a quality product. But a good design is needed because you have to create code structure in order that it is extensible, for the next refinement weeks. It is always a painful work when code is not prepared for extensibility, anyway. And a grateful one when things run ok.

There is another minor problem with continuous refinement: features or tasks starvation. Maybe you're updating priorities in your project plan, and there are features that they're not as important than others, or maybe they're less motivating than others. You should to set priorities with that in mind, because among features, you'll find bugs and other issues that need to be fixed in order to unblock certain other developments, and they can be pushed down in the list.


Conclusions
To adopt an agile software development process should be a natural process. It is good all the documentation and theory about agile methodologies out there, but if you force the team to adopt that cool methodology you read a month ago in a magazine, without taking in mind that it's almost a way of react before a changing environment, maybe it will cost more than you expected.

Because the team cannot change suddenly, in fact, the agile process is focused to train the team to do that! It is important go floating with the work flow of your team and slightly retarget their motivations, doing something similar that we did.

Anyway, all what I've written here is mostly the beautiful parts of our work. Once you have seen all process from another point of view, when (important) things have finished and when you get results, it's very easy to see what you failed and what was a success, but anyway it depends on the company, project and people you have. In our case we learned a lot of interesting ways to do the things, but also ways to don't do them. Again, this isn't a magic recipe or the correct way to do the things, and what worked ok to us, not necessarily is suitable to another company. But it was only a postmortem!

Sorry for the long post.

Oct 9, 2009

Hacker's Delight

Since I read an article from EntBlog about this book, I pushed it in my pending books queue. And ... finally a good colleague at Optenet give it to me like a present. I'm really excited. Thank you dude.

Sure I'll learn a lot of interesting things.

Today a person asked me for books to learn to "program videogames". Well, despite that this question is more difficult to answer than you initially can think, I knew what he actually meant and I asked him (intentionally):


Me: What do you mean with "program videogames"? - He: Well, you know, to draw 3D meshes with textures and so, these things...

Nooo, graphics again! There is no problem, but, how many times I've heard this wrong association?

I said him: Why don't you begin with other things different from graphics? And his response was: Bah, that another stuff (non-graphics) is easy, at least I can understand it without problems, but I've never done graphics programming.

Wow!, Overall engine architecture, entities system, networking or even AI? Gameplay and entities behavior is really exciting. Sound system or script binding. And a parallel engine architecture, with tasks scheduling is also cool. Another good stuff is the filters pipeline, more related with graphics if you want. These are complicated ones, but are videogames too. And there are a lot of more things I can't talk you about. Also, XNA or Ogre3D are good start points. Even you can download my simple c++ space invaders game and modify it.

He: Yes, yes, but, what about graphics? Draw models. Textures!

Ok then, definitely he wants draw primitives :)


This conversation makes me remember other anecdote, whith a good co-worker who wanted program a fluid simulation, if I can help him. I asked him: what kind of simulation, a water shader effect, a particle based simulation or simply something like the old plasma demo effect?- He: No no, I want make a good fluid simulation, and I think I can, in fact I've already done something.

Me: Really? You're a crack!, Have you ever been working with CFD or Navier-Stokes equations?. I don't know much about it, so I can't help you, tell me more, I said.

He: Well, I don't know that CFD or equations. I don't need them, I think I'm going to deduce them.

I finally said: Ah ok, why don't, good luck, and don't forget to show me any result you get :)

Good people, best regards to them.

Complexity Order I mean. Algorithmic is an useful tool which every programmer needs to know. As you already know, you could find/do a very very fast routine to compare two strings, by using new SIMD instructions built in the SS4, as the PCMPEQQ for example, and reordering instructions to minimize cache fails and take advantage of out of order micro execution. But if you're applying a direct algorithm to traverse words or a direct structure to store them, there is nothing you can do when you get an asynthotic input. It is the beauty about algorithmic, an university's subject that all the software engineers should know, and apply it in their daily technology decisions (whenever they can), because the system's behavior with large input matters and can makes the difference.

I remember, when I was studying at UCA, we did practices about algorithmic and complexity order theory by measuring different algorithms times to the same problem. We sampled the time that an algorithm takes for different input sizes, so you're sampling the complexity order curve/function and then by comparing all the algorithms-times, you can see how they behave asynthotically, in a graphic manner.

I've been using this technique since then, when I need to test different approaches to same problem, or when I need to empirically demonstrate a data structure, for example. In another post in this blog I talked about a special radix tree to store IP numbers, demonstrating you'll get a constant complexity order O(k) when search for an IP, independently you store all internet IPs (another thing is the complexity order in space). This means that you'll perform same number of instructions to search any number.

So you have, O(1) < O(logn) < O(n) < O(n·logn) < O(n^2) < O(n^a) < O(a^n) < O(n!), being O(1) the best you can get in time and space. Related with this you have the NP complete problems.

Sometimes when trying to get a best time complexity, probably you'll get worse the spatial one, and vice versa. Another important note about measuring times is the time precision. Although you use a high performance counter, you cannot measure how much time takes two std vector's inserts, because current microprocessors are very fast. You should use a big N input, repeated M times, because memory never is enough (you can't insert 1·10^12 words in a vector, at least you shouldn't), and then divide the time to compute the average time. Selection of N and M is important. Also, you need to know that another daemons or processes are continuously interrupting the algorithm, maybe the compiler can generate an optimize version of your algorithm (only improving the multiplicative constant). All these variables can corrupt the times you get, so the conclusion is do not use the times as it, but the graphical comparisons between algorithms' times, or in other words, you need to extract all the multiplicative constants and think only in terms of sampled curves' behavior.

In the code below you see a snippet to measure a std list insert varying N.


unsigned long step = (MAX-MIN)/N_POINTS;
// Varying our input N
for ( unsigned long N = MIN; N < MAX; N+=step )
{
double time = 0.0;
// Repeating M times
for ( unsigned long m = 0; m < M; ++m )
{
Clock cl;
cl.Start();
// Algorithm test comes here.
// Trying to insert N numbers in a stl list
std::list<int> lis;
for ( unsigned long i = 0; i < N; ++i )
{
lis.push_back( i );
}
time += cl.Stop();
}

printf ( "%lu\t%0.10lf\n", N, time/(N*M) );
}


This code will print an output like follow:

99999 0.0000001016
6759999 0.0000001023
13419999 0.0000001042
20079999 0.0000001021
26739999 0.0000001028
33399999 0.0000001037
40059999 0.0000001016
46719999 0.0000001051
53379999 0.0000001017
60039999 0.0000001020
66699999 0.0000001012
73359999 0.0000001021
80019999 0.0000001005
86679999 0.0000001011
93339999 0.000000101


This data can be obtained for several algorithms. All those data then can be graphically represented using a plot program, like gnuplot for example.

Follow are curves for insert algorithms in <vector>, <list> and <map> Before you look through the image below, it would be interesting you know what are the complexity order for insertions in these containers. In this page you can see all O for stl containers. O(1) for vector and list. And * for map. Well, the time complexity requirements imposed by the standard for each map operation suggest a balanced binary search tree mostly. Let's see the results:



Wow!, it is interesting that the vector (red line) and the list (blue line) are constant ( O(1) ). In other words, doesn't matter how N increase, always you'll get a constant time. And the map (green line) follows a logarithmic curve ( O(logn) ). Obviously a linear interpolation between measuring points are used.

As you can view, vector has O(1) in push_back operations. Wait a minute! When a vector increases beyond its capacity, a memory resize is needed, and a copy for all elements is performed. So, why is it O(1)? It's a O(n) just because asynthotically you'll need to make N copies... Well, there exists a thing called amortized analysis to deal with it. In fact, in the wikipage about amortized analysis, the example used to describe it, is the array push back operation.

Also you can get from graphic that vector is faster than list when inserting elements, due to multiplicative constant, being smaller in vector structure. But remember, with this implementation and in the CPU I used. This is can be because in list you need to make pointers operations when a new node is added, in vector you only need access to last available memory slot to put the element. Also in list you need to create every new node (memory alloc operation).

Another comparison graphic below, measuring find operations in the three same containers.


But, what happened to green line (map's find operation)? This is a common issue you'll get when comparing very different measuring scales. Map's searches are so fast, comparing with the vector or list ones. So, you cannot see the green line because is very near to X axis, in other words, with low times. You need to change scale, but anyway you can see here that vector and list find operations are linear ( O(n) ), with different constants (again vector has lower constant). To view if map searching operation behaves like a logarithmic one, we represent independently the third curve.


And finally, it behaves like a logarithmic curve!.

Do you think you're a Duct Tape Programmer? It's more difficult than people think. Another Spolsky's great article.

And the duct-tape programmer is not afraid to say, “multiple inheritance sucks. Stop it. Just stop.”


...you’re not here to write code; you’re here to ship products.


One principle duct tape programmers understand well is that any kind of coding technique that’s even slightly complicated is going to doom your project. Duct tape programmers tend to avoid C++, templates, multiple inheritance, multithreading, COM, CORBA, and a host of other technologies that are all totally reasonable, when you think long and hard about them, but are, honestly, just a little bit too hard for the human brain.

Oct 1, 2009

Journal Hook

Following last Windows focused post, I come to talk you about hooks. Well, I'm not an expert about hooks, but basically, a Windows' hook is a procedure that you can register in the operating system to intercept events before they reach an application. More information in msdn.



Well, in our try to automatize systems, we need to create a mechanism to intercept all mouse and keyboard messages generated by user in desktop. So, we can record something similar to macros in Windows, and later playback them. For example, we need to automate the Skype user interaction, or the Outlook application, and then play the macro.

It's a very interesting thing because, WM_MOUSEMOVE is a message too ;), and we could save it for example to perform some QA automated tests in a game, moving the avatar around scenario, capturing mouse moving and keyboard strokes. There are a lot of applications about hooks, and there are more hooks than only Journaling which is only a kind of hook.

Here you have a very simple application that creates this hook and records all messages. Just a note: Be careful when you're executing from Visual Studio IDE, because if you set a breakpoint in code, the hook cannot be established correctly. Maybe it could be generated by VS thread (attached to process) which avoid messages to hook, I don't know, but it is a very rare bug, difficult to find, and can drives a team's member crazy. In fact, done it.

The idea behind, you intercept all messages and serializes them (lparam, wparam, time, message number, even the hwnd but have no sense) and when you want play them you've to feed to journal playback hook with those messages. In the code below, (all) the messages are stored in a plain txt file, but its performance can be pretty bad due to the big number of messages intercepted. You can buffer and then flush them. Also you can change this, for example, sending them by network or something. Let your imagination flies.

Finally, it is not a new thing, you can google for "winmacro" and you'll find an open software (with source code) which does this journaling stuff, but with an own interface and more control. Anyway, in my code below you only have a record feature. Playback in another post...

CPP file here.


#include <windows.h>
#include <stdio.h>

// -- wow, two global objects!
static HHOOK gHookHandle = 0;
static FILE* gFile = NULL;

LRESULT CALLBACK JournalRecordProc(int code ,WPARAM wparam,LPARAM lparam)
{
if ( code == HC_ACTION )
{
EVENTMSG* mesg = (EVENTMSG *)lparam;
// -- Serializes entire message
fprintf_s( gFile, "%d %d %d %d %d\n", mesg->message, mesg->paramL, mesg->paramH, mesg->time, (UINT)mesg->hwnd );
// -- Exiting when PAUSE key is pressed
if ( mesg->message == WM_KEYDOWN && GetAsyncKeyState(VK_PAUSE) )
{
const LRESULT result = CallNextHookEx( gHookHandle, code, wparam, lparam);
UnhookWindowsHookEx( gHookHandle );
gHookHandle = 0;
PostQuitMessage(0);
return result;
}
}
return CallNextHookEx( gHookHandle, code, wparam, lparam );
}


// WinMain
// --------
int WINAPI WinMain( HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow )
{
// -- Record hook and file to store messages
if ( (gHookHandle = SetWindowsHookEx(WH_JOURNALRECORD, JournalRecordProc, hInstance, 0) ) == NULL )
return 1;
if ( (gFile = fopen( "msgs.txt", "wt" ) ) == NULL )
return 1;
fprintf_s( gFile, "# Message ParamL ParamH Time HWND\n" );

// -- Main message dispatching
MSG msg = {0};
GetMessage(&msg, 0, NULL, NULL );
while ( msg.message != WM_QUIT )
{
TranslateMessage( &msg );
DispatchMessage( &msg );
GetMessage(&msg, 0, NULL, NULL );
}

// -- Closing
fclose( gFile );
return 0;
}

My company just bought it. A must-to-have book.

Me, pretty happy

Sep 26, 2009

Colin McRae: Dirt

This so addictive game really impressed me by its playability. The graphics aren't bad, and you have a lot of gaming modes, vehicles and tracks. When you're playing for a while, you get an inmersive experience, and is cool how you're running at 160km/h, listening to your co-driver giving you track information, while the car is skidding near to the trees.


And the menus are incredible. As you can see in the video below, menus are in real 3D space, with special effects and movement. An off voice is telling you tips and tricks, depending on your selected race configuration, and you can review all kind of statistics about your profile progress while the race is loading.

And finally, the best of all, I bought an original copy only for 9.95€.





Yesterday, a coworker passed me a pdf extracted form the {cuv} magazine, Sep09, where Bjarne Stroustup talks briefly about the incoming c++ standard revision, c++0x. Last revision was c++98.

At the beginning of article, he says:

‘What will C++ be once we have the facilities offered by the upcoming
standard?’ This is not an innocent ‘just philosophical’ question. Consider
how many programmers have been harmed by believing that ‘C++ is an
object-oriented language
’ and ‘religiously’ built all code into huge class
hierarchies, missing out on simpler, more modular approaches and on
generic programming. Of course C++ also supports OOP, but a simple
label – especially if supported by doctrinaire teaching – can do harm by
overly narrowing a programmer’s view of what can be considered
reasonable code.


Is interesting what the main creator of c++ thinks about using it in complex Object Oriented systems.

You can download the pdf here.