Negative goals are quite natural for us: if the kart lowers its health by 10%, you can think of it as a mere "reduction" applied to the positive goals -distance raced squared in this case- or as something purely negative: a -10 in the final score.
If we try to get the same results as in the previous video but using some sort of negative goal, we end up with something odd: the fear is fine in some really dangerous situations, where it helps you avoid them effectively, but too much fear, a big negative score arising at some moment, will make the "common sense" freeze. You have added a "phobia" to something.
So I would suggest not using them if you can live without them; instead, use combinations of positive and reduction goals. I suspect negative goals are always avoidable, but that is a personal intuition more than a golden rule to strictly follow.
Anyhow, here you have a video where this "fear" clearly appears. There are more similar videos, and all have in common that I was playing around with some sort of negative goal.
In this particular case, I was trying to avoid crashing after adding bouncing to the simulation. I still didn't have a way to measure the energy of the impacts, so I could not yet use health to avoid it.
My idea was to use "how long did you survive in this future" to get it: if you are simulating 4 seconds into the future and a kart crashes at the final moment, 4 seconds away from the starting point, it scores as zero; if it crashes at second 2, it scores as -1; and if it crashes at second 1, it scores as -10, and so on. The exact formula you use is not important, just something negative that gets really large -but negative- when the crash occurs closer to the starting moment. Score = Log(time_lived/time_simulated) can do the job, as long as you avoid Log(0) somehow.
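Just to make that concrete, here is a minimal sketch of such a scoring function in Python; the function name and the eps guard are my own, not taken from the actual simulator:

```python
import math

def crash_score(time_lived, time_simulated, eps=1e-3):
    """Negative score for a simulated future, based on how long the kart survived.

    Surviving the whole horizon scores 0; the earlier the crash, the more
    negative the score. The exact shape is not important; this is just the
    Log(time_lived/time_simulated) variant mentioned in the text, with a
    small eps to avoid Log(0).
    """
    return math.log(max(time_lived, eps) / time_simulated)
```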
This trick is used in the second video below. In this first simulation I will show you a better variation: when the kart runs off-track, I simulate an engine cut-off, and the kart keeps running by inertia and then stops under high friction. The distance raced after the engine cut-off scores negatively. It was an easy way to use the impact energy, and it worked well.
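A hedged sketch of that variation, assuming a constant friction deceleration after the cut-off (the names and the kinematics simplification are mine, not the actual simulator's):

```python
def cutoff_penalty(speed_at_cutoff, friction_decel=8.0):
    """Negative score for running off-track: the distance coasted after the
    engine cut-off, used as a rough proxy for the impact energy.

    Under a constant friction deceleration, distance = v^2 / (2 * a).
    """
    coast_distance = speed_at_cutoff ** 2 / (2.0 * friction_decel)
    return -coast_distance
```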
And these are the resulting behaviours: the white kart doesn't score dying negatively at all; it is fearless, and sometimes that plays against it and it leaves the track. The yellow one has a "low fear" coefficient, and it turns out to be the most reliable of the three. Finally, the grey kart feels this fear doubled, which makes it refuse to pass some narrow paths, even when it can perfectly fit.
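For concreteness, this is roughly how such a fear coefficient could weight the negative term against the positive goal; the coefficient values below are only illustrative, chosen to mirror the white, yellow and grey karts, and are not the ones used in the videos:

```python
def future_score(distance_raced, negative_score, fear=1.0):
    """Total score of one simulated future.

    fear = 0.0 -> the fearless white kart (the negative term is ignored)
    fear = 1.0 -> the yellow kart's "low fear"
    fear = 2.0 -> the grey kart, which feels the fear doubled
    """
    # negative_score is already <= 0, so a larger fear coefficient makes
    # risky futures look worse and worse.
    return distance_raced ** 2 + fear * negative_score
```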
But using negative goals is not evil; it can serve a purpose: filling the holes in the intelligence and forcing it to avoid risk. This is what made the yellow one the best of the three after all.
In this second video we fine-tune the fear level that works best (but using the "time lived was too short" fear commented on before): a white kart with no fear competes against four other karts, each with more and more fear of crashing. Yellow is the bravest of them, followed by the orange, red and black karts. The black kart is a little too much of a "coward" and tends to avoid dangerous paths at the cost of speed.
Surprisingly -or maybe not- there is a "sweet spot" around the fear level we gave the orange one. Being fearless drives the white one into trouble at some tight turns, while the darker karts are so afraid of colliding that they sometimes almost stop before deciding.
So adding a little fear into the combination of goals shown in the last post could make a more reliable combo after all... I will need to run more tests on this and find some "right way" to combine the three kinds of goals (right now it only works well with positive goals plus a second kind of goal, not the three at the same time).
Anyway, finding such a way of combining the three kinds of goals may not be the right path. The intelligence can be improved to the point where it doesn't need negative goals anymore, and even the reduction goals can be forgotten.
We would then have built a true "metric" over the phase space, a more sophisticated one than those shown here, and the "common sense" would shine without the limiting fears of both kinds, negative and reduction.
But that will be in a following post...
Nice series of posts! Looking forward to the next post... :)
Thanks again Gonfo!
I am now writing quite fast; after a period of not sleeping well at night because all those ideas were popping into my mind at odd hours, most of the concepts have now calmed down, so I can "spell them out" like magic and get a little rest.
The next one will also be interesting: adding layer after layer of "common sense" in a simple way so basic intelligence evolves into a more sophisticated intelligence, capable of doing more useful things.
It is almost written, but it will need a lot of revision before I feel comfortable with the result.
Ok
Take your time, man.
I'll tell you, my mind is also flying with your ideas, and my sleeping hours have decreased these last nights, hahaha ;)
How does this combine with fuzzy inference systems, and how does the computational load behave?
Whether the simulation is "deterministic" or "fuzzy" does not affect the algorithm, since it just asks for 100 "possible" futures to be simulated, and it doesn't care whether the future you give it is "certain" or merely "probable".
Let's say it doesn't expect that asking the simulator the same thing twice will always give the same answer; that is completely outside of its "interest".
In terms of computational load, basically the only thing that counts is what it costs you to simulate your system 100 times, 5 seconds ahead; beyond that, the CPU needed to use all of this and give you the "intelligent" choice is really very small.
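A rough sketch of where that cost goes, assuming a generic simulate/score interface (every name here is illustrative, not the actual implementation):

```python
def choose_action(state, actions, simulate, score, n_futures=100, horizon=5.0):
    """Pick the action whose sampled futures score best in total.

    Almost all of the CPU time is spent inside simulate(); the bookkeeping
    around it is negligible, and it does not matter whether simulate() is
    deterministic or stochastic.
    """
    best_action, best_value = None, float("-inf")
    per_action = max(1, n_futures // len(actions))
    for action in actions:
        total = 0.0
        for _ in range(per_action):
            future = simulate(state, action, horizon)  # the expensive part
            total += score(future)
        if total > best_value:
            best_action, best_value = action, total
    return best_action
```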