## Friday, 21 March 2014

### No more suicides

The last video was quite impressive, a rocket flying inside a cave at full speed, but for me it was really disappointing: it showed quite clearly a suicidal tendency in the AI.

As you can see, the rocket leaves the track 2 or 3 times as it gets too fast to stop before crashing, but if you look closely, it happens that the crash was NOT impossible to avoid, not at all; somehow the rocket just put its hands down, stopped fighting, gave up and let itself crash without even trying to escape. Why?

As I commented in the last post, I considered using a negative scoring tendency to "keep alive" the player: if you die before 2 seconds while imagining a future, then give it a negative score so the player will actively avoid it. It was a really desperate attempt, as the score is internally an "entropy gain", and allowing negative values is like allowing entropy to decrease with time... it is a physical abomination, and I am really happy it didn't work out in my tests (in the end I didn't use negative scoring itself, but something just as ugly).

That was not the problem, nor was negative scoring necessary or even healthy for the AI, because the real problem was... a severe case of poor imagination: an overly narrow imagination gives the intelligence a clear tendency to commit suicide... again, surely there is a deep psychological lesson in this!

When I start imagining a new future, the first thing is deciding which free parameter (degree of freedom) I will change, then choosing a value from a set of possible decisions. In the case of accelerating/braking a kart, for instance, I choose one value from the set (-25, -5, +5, +25), but later, in the rest of the future's steps, when the AI has to randomly decide how to move the accelerator, I decided -shame on me- to narrow the choices down to a random value from -5 to +5: I didn't allow the AI to imagine "rude" futures, just smooth ones.
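The flawed scheme could be sketched roughly like this; the function names and structure are my own illustration, not the actual project code:

```python
import random

# First decision of each imagined future: full set of accelerator changes.
FIRST_MOVE_CHOICES = [-25, -5, +5, +25]

def sample_first_move():
    """Pick the accelerator change that opens a new imagined future."""
    return random.choice(FIRST_MOVE_CHOICES)

def sample_following_move():
    """The flawed part: every later step was narrowed down to the
    'smooth' range -5..+5, so extreme moves could never be sustained."""
    return random.uniform(-5, +5)
```

So a future could *start* with a full brake of -25, but could never *continue* it.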

So now you have a kart running at full speed; it suddenly detects a wall in front of it and starts thinking about its options. When it comes to analysing the option to "full brake" (accelerator -25), in all the futures it imagines after this first move the AI is not allowed to continue "full braking" at -25, only at -5, which is not enough to stop the kart in time. Since it can't imagine keeping its foot on the brake until it fully stops, the kart is unable to find a solution; nothing it can imagine will save its life... so it does nothing, just waits for the unavoidable.
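A tiny back-of-the-envelope simulation makes the failure obvious. The numbers here are made up for illustration, not taken from the actual simulation:

```python
def distance_until_stop(speed, brake_per_step, dt=1.0):
    """Crude discrete sketch: apply a constant brake each step and
    accumulate the distance covered until the kart stops."""
    distance = 0.0
    while speed > 0:
        distance += speed * dt
        speed = max(0.0, speed - abs(brake_per_step))
    return distance

# Kart at speed 100 with a wall, say, 300 units ahead:
full_brake   = distance_until_stop(100, 25)  # sustained full braking
smooth_brake = distance_until_stop(100, 5)   # the only braking the AI could imagine
```

With these numbers, sustained full braking stops the kart well before the wall, while braking at -5 per step overshoots it several times over: exactly the futures the narrowed imagination could never see.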

Solution: use the full spectrum of possible decisions, from -25 to +25, instead of -5 to +5. When it comes to taking this random decision, use a Gaussian (normal) probability function with a mean of zero and a standard deviation of 5 to get the random value. It will range from -25 to +25 because it is hard limited, but it will tend to concentrate most values in the range -5 to +5.
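A minimal sketch of the fix, again with names of my own choosing:

```python
import random

def sample_move(std_dev=5.0, limit=25.0):
    """Draw an accelerator change from a Gaussian with mean 0 and the
    given std. deviation, hard-limited to the physical range [-limit, +limit]."""
    value = random.gauss(0.0, std_dev)
    return max(-limit, min(limit, value))
```

With a standard deviation of 5, roughly two thirds of the draws still land in the smooth -5 to +5 band, but the extreme moves -a sustained full brake among them- are now reachable in every step of every imagined future.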

After this change, the "spectrum" of red and blue futures that a player draws in front of it visually spreads and gets quite a bit wider, as you can see in the video: the player is considering some more "extreme" drives than before.

Here you have a set of rockets playing around with this little change on. Now they never give up and let themselves crash, giving a much more solid behaviour: now I would dare to lend my own brand new rocket to one of those AIs for a walk with no fear... well, only if the gas was free as in the simulation!

The funny part is that this option was already present in the code, but switched off. I made some tests with the idea far back in the beginning, but maybe the AI was not ready for it yet and it worked really awfully, so it was just switched off.

Maybe -only maybe- the base intelligence is finished... if that proves true and it works out as stable and optimised as it seems, I could move on to more interesting uses for the AI (team thinking is on my list).