When not all futures are equiprobable, you have to switch to another, more complex way of calculating entropy.

When you have N microstates, each with a different probability of happening, call it P(i), the instantaneous or "classical" entropy is:

S = -Sum over all possible microstates (P(i)*Ln(P(i)))
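To make the formula concrete, here is a minimal sketch in Python. It assumes the usual convention with a leading minus sign, so the result is never negative; the function name `gibbs_entropy` and the sample probabilities are made up for illustration and are not taken from the simulation.

```python
import math

def gibbs_entropy(probs):
    """Return -sum(p * ln(p)) over the given microstate probabilities."""
    # Terms with p == 0 are skipped: p*ln(p) tends to 0 as p tends to 0.
    return -sum(p * math.log(p) for p in probs if p > 0)

uniform = [0.25] * 4            # four equiprobable microstates
skewed = [0.7, 0.1, 0.1, 0.1]   # same four microstates, unequal odds

# For equiprobable microstates the sum collapses to Ln(N), up to rounding.
print(gibbs_entropy(uniform), math.log(4))
# Unequal probabilities give a smaller entropy than Ln(N).
print(gibbs_entropy(skewed))
```

When every microstate is equally likely, the result matches Ln(N), which is why the simpler count-based score is enough in the equiprobable case.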

If this were about gas molecules, the Ln(P(i)) part would correspond to the momentum of the molecules, so translated to our case it corresponds to m*v, the momentum of the kart. As the mass m is a constant, we can drop it, and only v, the kart's velocity, remains.

But we are talking about futures, so we need to integrate v (the kart's velocity) over the path the kart followed. As we use constant time deltas to construct the futures, on each step v is proportional to the raced distance, call it x.

Integrating x over a path gives you r^2/2, with r = length of the path, so using r^2 seems to me like the perfect way to mimic the real entropy in the AI model (it may be wrong; I am not sure I really understand all the inner workings of entropy at this level).
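As a quick sanity check of the r^2/2 result, the integral of x from 0 to r can be approximated numerically. This is only a sketch: `integrate_x` and the step count are hypothetical names and values chosen for the example, not part of the kart code.

```python
def integrate_x(r, steps=1000):
    """Approximate the integral of x from 0 to r with a midpoint sum."""
    dx = r / steps
    # Each slice contributes (midpoint of the slice) * (width of the slice).
    return sum((i + 0.5) * dx * dx for i in range(steps))

r = 3.0
# Both values should agree up to floating-point rounding.
print(integrate_x(r), r**2 / 2)
```

The constant factor 1/2 does not change which option scores highest, so dropping it and using plain r^2 gives the same ranking.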

So let's compare and see!

Yellow kart: uses level 3 with score = Ln(N).

Grey kart: uses level 4 with score = Sum(r) (r = raced distance).

Black kart: uses level 5 with score = Sum(r^2).
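The three scores can be compared side by side with a small sketch. Everything here is an assumption made for illustration: the function names, the idea that each option is described by a list of raced distances (one per simulated future), and the sample numbers. The real simulation builds its futures in its own way.

```python
import math

def score_level3(distances):
    """Level 3: Ln(N), where N is the number of futures found."""
    return math.log(len(distances))

def score_level4(distances):
    """Level 4: sum of the raced distance r of each future."""
    return sum(distances)

def score_level5(distances):
    """Level 5: sum of r^2 for each future."""
    return sum(r * r for r in distances)

# Hypothetical raced distances for the futures of two steering options.
option_a = [1.0, 1.0, 4.0]   # two short futures and one long one
option_b = [2.0, 2.0, 2.0]   # three medium futures, same total distance

for name, score in (("level 3", score_level3),
                    ("level 4", score_level4),
                    ("level 5", score_level5)):
    print(name, score(option_a), score(option_b))
```

With these made-up numbers, levels 3 and 4 cannot tell the two options apart, while level 5 prefers the option that contains the long future, because squaring weights long futures more heavily.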

In my modest opinion, level 5 on the black kart properly reflects the original paper's formulae for future entropy, and does a really great job as a driver.

Can't we do any better? Is it a perfect AI? No, it is a nice and general algorithm, but there are still some aspects of it that can be tweaked for the better; not many, but some.
