Wednesday, 13 June 2018

Graph entropy 5: Relations

After some introductory posts (that you should had read first, starting here) we face the main task of defining the entropy of a graph, something looking like this:


We will start by dividing the graph into a collection of "Relations", a minimal graph where a pair of nodes A and B are connected by an edge representing the conditional probability of both events, P(A|B):


Nodes represent events: B could be "choose a random car from a parking lot" and B "the car has a gas engine inside".

Nodes will hold two forms of entropy: a 'raw' entropy in the form of a H2 entropy, and a 'processed' entropy H as a H3 entropy:

When a new node is created, raw entropy is set to 1 -so new inputs will just multiply the raw entropy as they come in- and H is then 1+log(1) = 1, so we will just initialise both to 1.


Edges represent the relation between A and B, the conditional probability of happening A given that B already happened.
This conditional probability defines an "entropy bit" of (2-pp) that will modify any entropy passing by the edge:

Edges have three internal variables that initialise to 1 as before:

  • P = P(A|B) is the conditional probability between A and B.
  • S = Input = External entropy H of the origin node A.
  • 𝛟' = Last value sent to end node B. 

When P or S change (A send a new input value S, or P is recalculated and modified externally), the edge perform those steps:

  • 𝛟 = Input value S times (2-pp).
  • Send Output = 𝛟/𝛟' to end node B.
  • Update 𝛟' = 𝛟.
That was all, we have defined how to calculate the entropy of a graph using two king of cellular automatons (the node and the edge) connected so they can send information down to the root node.

Being this entropy defined as a swarm of cellular automaton that autonomously send information to other automatons as required will allows us to make fast recalculation of the entropy as a probability changes or a node is added or taken from the graph, but also allows for a massive use of parallelization and distribution of big systems.

Now that we know the method, we can revisit the 'cars by engine example' and compute the graph entropy:

Before taking the structure of the graph into account, we calculated the entropy of the final distribution to be 1.6297, now that the internal structure is added, it raised up to 1.77, just a little higher than before, as expected.

In the next post I will be dealing with the infamous 4th Shannon-Khinchin axiom of separability, just to make sure it is comparable with the standard Gibbs entropy and other kind of generalised entropies.

No comments:

Post a comment