Now that the blog is up and working, I would like to start by reviewing all those "old videos" and explaining how the algorithm worked at the time, how good it was, and which things needed to be changed.
So today we will comment on the first video. It is the most important one to understand in order to get the whole idea of the algorithm, so read carefully and comment on any aspect you feel is not clear in this post; I will do my best to help.
So start by watching the video:
As you can see, we have a simple case: a track with a kart on it. The kart is also quite simple: it is always accelerating (it has no brakes) and you just have a joystick for left/right turning (so this system is said to have only one degree of freedom). The AI must decide where it will "push" the joystick on every frame; that is our goal.
1) Options = Possible decisions
The first thing you need in order to build the AI is a list of the "options" or "decisions" you are going to ponder. In this simple case, we will consider only two possible options/decisions: push the driving wheel left by adding 5º (+5º), or push it right by subtracting 5º (-5º). In the video, the futures of each decision are drawn as blue lines (left) or red lines (right).
2) Create random futures for each decision
The second thing to do is to "imagine" a bunch of, let's say, 100 futures for each possible initial decision.
For instance, you consider the decision "pushing +5º", then you simulate one frame (in my case, a frame is 0.1 seconds long) with this "push" applied, so the simulation code will tell you where the kart will be after 0.1 s of pushing the driving wheel +5º. It will be a little ahead and to the left of its current position, and surely rotated a little counterclockwise.
From this "lefty" position, you continue simulating until you get to time+10s (those 10 seconds are a parameter: how far into the future you want to simulate), but on all subsequent frames, instead of "pushing +5º", you choose the "push" randomly in the range -5º to +5º. This makes the kart drive randomly, and it corresponds to the blue lines in the video.
You repeat this 100 times, so you end up with 100 blue lines, representing 100 possible futures that start by turning the kart +5º to the left. Those 100 blue lines form a "blue flame" in front of the kart, as you can see in the video.
Now you repeat with the second initial decision, pushing -5º to the right. In the same way, we get 100 more futures, all of them painted in red in the video.
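To make step 2 more concrete, here is a minimal sketch in Python. It is not the code from the video: the kart model (`step`), the constant `SPEED`, and all function names are assumptions made up for illustration. In particular, this toy kart moves at constant speed, ignores the fences, and simply adds the "push" to its heading on every frame. Only the 0.1 s frame, the 10 s horizon, the ±5º range and the 100 futures come from the text above.

```python
import math
import random

FRAME_DT = 0.1   # seconds per simulated frame (from the post)
HORIZON = 10.0   # seconds simulated into the future (from the post)
MAX_PUSH = 5.0   # degrees, largest joystick push (from the post)
SPEED = 50.0     # pixels per second; assumed, the post gives no value


def step(state, push_deg):
    """Advance a toy kart by one frame. state is (x, y, heading_deg)."""
    x, y, heading = state
    heading += push_deg  # assumed: the push rotates the kart directly
    rad = math.radians(heading)
    return (x + SPEED * FRAME_DT * math.cos(rad),
            y + SPEED * FRAME_DT * math.sin(rad),
            heading)


def random_future(state, first_push, rng):
    """First frame uses first_push, every later frame a random push."""
    state = step(state, first_push)
    frames = round(HORIZON / FRAME_DT)
    for _ in range(frames - 1):
        state = step(state, rng.uniform(-MAX_PUSH, MAX_PUSH))
    return state  # only the final state is kept


def futures_for(state, first_push, n=100, seed=0):
    """Imagine n random futures that all start with the same decision."""
    rng = random.Random(seed)
    return [random_future(state, first_push, rng) for _ in range(n)]


start = (0.0, 0.0, 0.0)
blue = futures_for(start, +MAX_PUSH)          # futures that start turning left
red = futures_for(start, -MAX_PUSH, seed=1)   # futures that start turning right
```

Each entry of `blue` and `red` is the end state of one imagined future, which is all that the next step needs.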
3) Counting different futures
With all those futures found for one of the two possible decisions, we now need to discard the similar ones in order to get the list of "different" futures that taking this decision could bring us.
To do this, you need to know which of the kart's parameters are going to be considered "important" when comparing ending positions. I decided to use only the position of the kart (PosX, PosY) and to leave out the angle. For two futures to be considered "similar", their positions, rounded to a given precision, must be equal.
So, if future #1 ended up at position (234.4, 187.0) (here I use just pixel coordinates) and I am using a precision of 5, it means this future "roughly" ends at (235, 185), and this rounded position is the one to compare with other futures in order to get a list of different ones.
So, after discarding the duplicated futures, maybe you end up with 35 different futures for decision 1 (turning left, +5º) and 15 for the other decision (turning right, -5º).
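Counting the different futures can be sketched in a few lines of Python. The function names `snap` and `count_different` are invented for this example, and I am assuming that "rounded to a given precision" means rounding each coordinate to the nearest multiple of the precision; futures whose rounded end positions match are counted only once.

```python
import math


def snap(value, precision):
    """Round value to the nearest multiple of precision (halves round up)."""
    return math.floor(value / precision + 0.5) * precision


def count_different(endpoints, precision=5):
    """Count futures whose rounded (x, y) end positions are distinct."""
    cells = {(snap(x, precision), snap(y, precision)) for x, y in endpoints}
    return len(cells)


# The first two futures end in the same rounded cell, the third one does not.
endpoints = [(234.4, 187.0), (236.0, 186.0), (100.0, 100.0)]
print(count_different(endpoints))  # 2
```

Using a set of rounded positions means the duplicates disappear on their own, so there is no need to compare every future against every other one.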
4) Decide
Well, everything is ready: the AI only needs to decide by averaging the two possible decisions (+5 and -5), each weighted by its number of different futures divided by the total number of different futures found, so that the weights sum to 1.
Decision = +5 * (35/(35+15)) - 5 * (15/(35+15)) = 5*35/50 - 5*15/50 = 5*0.7 - 5*0.3 = 5*0.4 = 2
So the "intelligent" decision, in this case, is to turn +2º to the left!
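The same arithmetic as a small Python sketch, in case it helps. The function name `weighted_decision` is made up for this example, and returning 0 when no future at all was found is my own assumption, since that case is not covered in this post.

```python
def weighted_decision(counts):
    """counts maps each push (in degrees) to its number of different futures."""
    total = sum(counts.values())
    if total == 0:
        return 0.0  # assumed fallback: no futures found, keep the wheel centred
    # Each push is weighted by its share of all the different futures.
    return sum(push * n / total for push, n in counts.items())


decision = weighted_decision({+5.0: 35, -5.0: 15})
print(decision)  # 2.0
```

Written this way, nothing changes if you later want more than two options: you only add more entries to the dictionary.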
5) Loop on it
Well, the work is done. Now you just simulate the kart after applying the decision, and the kart moves on screen. You are then ready to go back to 2) and start over from this new starting position.
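The outer loop itself is tiny. Here is a sketch of it in Python where the two interesting parts are passed in as functions: `decide` stands for steps 2) to 4), and `simulate` stands for the real kart simulation. Both stand-ins below (`toy_decide`, `toy_simulate`) are placeholders I made up so the example runs on its own; they are not the real thing.

```python
import math


def drive(state, decide, simulate, frames):
    """Repeat: decide a push from the current state, apply it, move on."""
    path = [state]
    for _ in range(frames):
        push = decide(state)            # steps 2) to 4) would happen here
        state = simulate(state, push)   # the kart really moves one frame
        path.append(state)
    return path


def toy_decide(state):
    """Placeholder: always returns the +2 degrees of the worked example."""
    return 2.0


def toy_simulate(state, push_deg):
    """Placeholder kart: add the push to the heading, move 5 pixels ahead."""
    x, y, heading = state
    heading += push_deg
    rad = math.radians(heading)
    return (x + 5.0 * math.cos(rad), y + 5.0 * math.sin(rad), heading)


path = drive((0.0, 0.0, 0.0), toy_decide, toy_simulate, frames=10)
print(len(path))  # 11
```

Keeping the loop separate from the decision code makes it easy to swap in a different way of deciding without touching the rest.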
That's all the algorithm is doing: counting the different red and blue dots and heading left if there are more blue dots than red ones.
-A future that crashes into the fences was supposed to be a "bad" future, and its final point is not even drawn; only non-crashing futures are considered valid. It was a really bad decision on my part; we will come back to this in future posts.
-Each different future found counts as 1. A longer future (one that takes the kart far away from its initial position) counts as much as another future where the kart