It was then time to try a brand new creature and compare it with the familiar kart. I decided to code a classic rocket that travels the circuit as if it were a vertical cave, with a downward gravity force that applies only to rockets, and see how the AI deals with it.
I have to admit I made it way too powerful and, given that the AI will try to maximize entropy, it is sort of natural that it drives the rocket as fast as it can. Notice how much it likes to spin as a way to hover around a place before it decides where to go: the quicker it rotates, the more future options it has, as it can leave the spin in any needed direction in a short time. That is why spinning is a nice option in its eyes.
So here you have a rocket and a kart trying to get sweet drops from the circuit. Look carefully at the rocket as it enters the narrowest parts of the circuit; it is amazing how well the AI manages such a difficult-to-master kind of ship:
Having two kinds of vehicles so different from each other traveling the same circuit gives you a better view of the AI: you find some strange behaviours in the AI that, on a kart, for instance, don't show up so clearly.
Watching the rocket fly off the track because it went for a drop so blindly that it didn't notice that, after picking up the drop, crashing into the circuit fences was inevitable, I wonder if there is a magical set of goals a creature needs to have in order to behave in a consistent manner.
In this example, both players have the same set of goals: you get scored for racing more meters and for getting sweet drops. It would sound reasonable if crashing were not so bad, but crashing the vehicle is actually scored as zero: the AI is not scared of dying at all, and maybe it should be.
If I could add a new goal to both, a goal about surviving at any cost, I think the rocket would be quite a bit more conservative at times and avoid some of the suicidal rides that usually end outside the track. But how?
In the current implementation, negative scoring is not allowed. The raced distance is always positive, and so is the score for drops collected, and so on. Score penalties are implemented with coefficients in the range 0-1 that are multiplied together to get a score reduction coefficient, which is then applied to the sum of the positive scores (distance raced, drops collected, etc.).
When you crash into another player, for instance, you can code it so both players' scores are reduced by a coefficient that depends on the energy of the crash, so bigger crashes decrease that future's score quite a bit, making the option less appealing to the player.
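As a rough illustration of this scheme, here is a minimal Python sketch. It is not the actual code of the simulation: the function names, the linear crash mapping and the max_energy value are all assumptions made up for the example.

```python
from math import prod


def future_score(positive_scores, penalty_coefficients):
    """Sum the positive goals, then scale the sum by the product of the 0-1 penalties."""
    for c in penalty_coefficients:
        if not 0.0 <= c <= 1.0:
            raise ValueError("penalty coefficients must lie in [0, 1]")
    # With no penalties the product is 1, so the sum is left untouched.
    return sum(positive_scores) * prod(penalty_coefficients)


def crash_coefficient(crash_energy, max_energy=100.0):
    """Hypothetical mapping: zero energy gives 1.0 (no penalty), max_energy or more gives 0.0."""
    return max(0.0, 1.0 - crash_energy / max_energy)


# A future with 120 m raced and 30 points of drops, spoiled by a medium crash.
score = future_score([120.0, 30.0], [crash_coefficient(50.0)])
```

Because every coefficient stays between 0 and 1, the worst a future can do under this scheme is score exactly zero, which is the limitation discussed next.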
But being afraid of breaking the toy must somehow count as a real negative score. If going left makes you die in a second, then scoring the left option with a zero is not enough: in emergency cases it should score negative, meaning that going right will have its own score increased by this negative left score converted to positive.
So that is my next goal: being able to deal with negative scoring and then adding a goal, "try not to die in the next N seconds", that will score negative any future that involves dying in less than N seconds.
This new "goal" will trigger when, at some step of a future, the "dead" property of the player changes from "false" to "true". At this "dying" moment, the "goal" will retrieve from the future its "time raced" or time length (the time elapsed from the future's start until now) and compare it with the N seconds you want to stay alive.
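A minimal sketch of how that trigger could be detected, assuming a future is just a sequence of steps that each carry the elapsed time and the player's "dead" flag. The Step class and the time_of_death name are hypothetical, invented for this example:

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Step:
    time: float  # seconds elapsed since the future started
    dead: bool   # the player's "dead" property at this step


def time_of_death(steps: Iterable[Step]) -> Optional[float]:
    """Return the elapsed time at which "dead" first flips from False to True.

    Assumes the player is alive when the future starts.
    Returns None if the player survives the whole future.
    """
    was_dead = False
    for step in steps:
        if step.dead and not was_dead:
            return step.time
        was_dead = step.dead
    return None


future = [Step(0.1, False), Step(0.2, False), Step(0.3, True), Step(0.4, True)]
t = time_of_death(future)  # the "time raced" T to compare against N
```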
If T = "time raced" is smaller than N, I will use T/N as a parameter (that will range from 0 to 1) and convert it to a negative score in such a way that, for T=N (T/N=1), the score must be 0 but, as T drops to 0, the score must tend to minus infinity or, to be conservative, to some floor score like -5000 or -10000.
Log(x) is a good candidate, as Log(1)=0 as we wanted and Log(0.0001) is a large negative number. We just need to avoid asking for Log(0), so my best candidate so far is:
Score = Log((0.001+T)/N)
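To get a feeling for the numbers, here is a small Python sketch of this candidate. A few things in it are my own assumptions and not part of the formula: Log is taken as the natural logarithm, the floor parameter implements the "floor score" mentioned above, and the result is clamped at 0 because the 0.001 offset would otherwise make the score slightly positive when T is almost N.

```python
import math


def keep_alive_score(t, n, eps=0.001, floor=-5000.0):
    """Score = Log((eps + T) / N) for futures that die before N seconds, else 0."""
    if t >= n:
        return 0.0  # survived the whole N-second window: no penalty
    raw = math.log((eps + t) / n)  # natural log (assumed base)
    # Clamp into [floor, 0]: never positive, never below the floor.
    return min(0.0, max(floor, raw))


# Dying at once, halfway, and just before N = 2 seconds.
samples = [keep_alive_score(t, 2.0) for t in (0.0, 1.0, 1.9)]
```

With a 0.001 offset and N = 2, dying instantly scores about -7.6 (the natural log of 0.0005), so a floor like -5000 would never be reached; getting penalties of that size would need a much smaller offset or an extra scaling factor.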
Maybe tomorrow I can show you how good or bad this idea was in a new video and, if it works as expected, maybe it will be added to all the players as part of their "standard AI": a goal for raced distance, and a goal for trying to keep yourself alive for at least 2 or 3 seconds.