Friday, 3 August 2018

Roadmap to AGI

Artificial General Intelligence (AGI) is the holy grail of artificial intelligence and has been my personal goal since 2013, when this blog started. I seriously plan to build an AGI from scratch, with the help of my good friend Guillem Duran, and here is how I plan to do it: a plausible and doable roadmap to build an efficient AGI.

Please keep in mind that we both work on it in our spare time so, even if the roadmap is practically finished in its theoretical aspects, coding it is hard and time-consuming (we don't have access to any extra computing power beyond our personal laptops). At the current pace, don't expect anything spectacular in the near future.

That said, the thing is doable within a few years given some extra resources, so let's start now!

AGI structure

A general intelligence, be it artificial or not, is composed of only three modules, each with its own purpose, that can do their jobs both autonomously and in cooperation with the other modules.

It is only when they work together that we can call it "intelligence" in the same sense that we consider ourselves intelligent. Maybe their internal dynamics, algorithms and physical substrate are not the same, nor even close, but the idea of the three subsystems and their roles is the same in both cases; they are just solved with different implementations.

In this initial post I just enumerate the modules, the state of their development, and their basic functions. In the next posts I will go deeper into the details of each one. Interactions between modules will be covered later, once the different modules have been properly introduced.

Module #1: Learning

Definition: This module uses learning to build a simulator of the agent's world from the raw sensory inputs.

Basic function: The module processes the raw sensory inputs, learns to build a representation of the world (as an embedding of those inputs) and then uses it to predict the next state of the world, that is, the representation the agent will probably perceive in the next moment.
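To make the two jobs of this module concrete, here is a minimal sketch of the interface it could expose to the planner. Everything in it is an assumption made for illustration: the class name ToyWorldModel, the block-averaging "encoder" and the lookup-table "predictor" are stand-ins for what would really be a trained neural network.

```python
from typing import Dict, List, Tuple

Embedding = Tuple[float, ...]


class ToyWorldModel:
    """Hypothetical stand-in for module #1: encode raw inputs, predict the next embedding."""

    def __init__(self, block: int = 2) -> None:
        self.block = block
        # Remembered transitions: (embedding, action) -> next embedding.
        self.transitions: Dict[Tuple[Embedding, int], Embedding] = {}

    def encode(self, raw: List[float]) -> Embedding:
        # Compress the raw input by averaging consecutive blocks.
        # A real model would learn this embedding instead of hard-coding it.
        chunks = [raw[i:i + self.block] for i in range(0, len(raw), self.block)]
        return tuple(sum(chunk) / len(chunk) for chunk in chunks)

    def observe(self, raw: List[float], action: int, next_raw: List[float]) -> None:
        # "Training" here is just storing what followed each (state, action) pair.
        self.transitions[(self.encode(raw), action)] = self.encode(next_raw)

    def predict(self, embedding: Embedding, action: int) -> Embedding:
        # Unseen pairs fall back to "nothing changes", a deliberately naive guess.
        return self.transitions.get((embedding, action), embedding)


model = ToyWorldModel()
model.observe([1.0, 3.0, 5.0, 7.0], action=0, next_raw=[3.0, 5.0, 7.0, 9.0])
now = model.encode([1.0, 3.0, 5.0, 7.0])
print(now, "->", model.predict(now, action=0))
```

The important part is the shape of the interface: the planner never touches raw inputs, it only asks for an embedding and for the embedding that would follow a given action.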

Development: It is a standard deep learning task, so this part is easy... once you find the right layer topology and enough GPU time to train it some hundreds of times until it does its job.

Module #2: Planning

Definition: This module scans the future outcomes (or consequences) of taking different actions and decides which one the agent will actually take in order to behave intelligently.

Basic function: This module is a modified version of the current FMC algorithm shown in our github repo. It reads the current state of the world from module #1 as a representation, uses that module's predictive skills to build a number of paths the world could follow, forms a tree with them following the FMC rules, and finally chooses the action with the most leaf nodes attached.
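The last step, picking the action with the most leaves, can be sketched in a few lines. This is not the FMC code from the repo, and the cloning rule below is a deliberately crude assumption (dead walkers are replaced by copies of random survivors); the function name choose_action and the step and is_dead callbacks are also hypothetical, standing in for module #1's predictions.

```python
import random
from collections import Counter
from typing import Callable, Sequence, TypeVar

S = TypeVar("S")  # a state, as represented by module #1
A = TypeVar("A")  # an action


def choose_action(
    state: S,
    actions: Sequence[A],
    step: Callable[[S, A], S],
    is_dead: Callable[[S], bool],
    n_walkers: int = 32,
    horizon: int = 10,
    seed: int = 0,
) -> A:
    rng = random.Random(seed)
    # Each walker is (current state, first action taken from the root).
    # Root actions are assigned round-robin so every action gets tried.
    walkers = [
        (step(state, actions[i % len(actions)]), actions[i % len(actions)])
        for i in range(n_walkers)
    ]
    for _ in range(horizon - 1):
        alive = [w for w in walkers if not is_dead(w[0])]
        if not alive:
            break
        # Simplified cloning (an assumption, not the real FMC rule): dead walkers
        # become copies of random survivors and inherit their root action.
        walkers = [w if not is_dead(w[0]) else rng.choice(alive) for w in walkers]
        # Every walker then extends its path with one more random step.
        walkers = [(step(s, rng.choice(actions)), first) for s, first in walkers]
    # Surviving leaves vote for the root action they descend from.
    leaves = [w for w in walkers if not is_dead(w[0])] or walkers
    votes = Counter(first for _, first in leaves)
    return votes.most_common(1)[0][0]


# Toy world: positions on a line, anything below zero is fatal.
best = choose_action(0, (-1, +1), step=lambda s, a: s + a, is_dead=lambda s: s < 0)
print(best)
```

In this toy world stepping left from the start is immediately fatal, so every branch that began with -1 gets replaced by clones of branches that began with +1, and all the leaves end up hanging from +1.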

Development: This module is already done and working like a charm. It really outperforms any other planning algorithm out there (it beat every SoTA algorithm we could find, about 11 of them, on all of the 50 Atari games used to test them in the literature, with 360 times fewer samples on average), works perfectly fine on continuous spaces, and is based on a first-principles theory of intelligence (of mine ;).

Module #3: Consciousness

Definition: This module selects the relative importance of the different goals available to the intelligence in order to build the reward function used by module #2.

Basic function: Basically, it changes the "personality" of the agent in real time, making it more interested in some kinds of things and less in others, in order to maximize some property of the tree built by module #2 during its internal decision process. The effect is that the goals to follow are selected automatically, depending on the most probable evolutions of the world, so that the agent has an enjoyable and highly rewarding future in most of them.
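One simple way to picture a "personality" is as a set of weights over the available goals, with the reward handed to module #2 being the weighted sum. The sketch below assumes exactly that; the linear form, the function name make_reward and the goal names are illustrative choices, not something the model prescribes.

```python
from typing import Callable, Dict

GoalValues = Dict[str, float]


def make_reward(weights: Dict[str, float]) -> Callable[[GoalValues], float]:
    """Build a reward function from goal weights (a hypothetical 'personality')."""

    def reward(goal_values: GoalValues) -> float:
        # Goals missing from the state simply contribute nothing.
        return sum(w * goal_values.get(name, 0.0) for name, w in weights.items())

    return reward


state = {"food": 1.0, "safety": 2.0}
cautious = make_reward({"food": 0.25, "safety": 0.75})
hungry = make_reward({"food": 0.75, "safety": 0.25})
print(cautious(state), hungry(state))
```

Changing the personality then means swapping the weights and handing the planner a new reward function, while the goals themselves stay the same.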

Development: This is not as complicated as it sounds. It is actually about using FMC a second time on top of the first but, instead of deciding on the next action to take, you now decide on the next "personality change" of the agent, using the same idea with a deeply different form of entropy: the entropy of the whole tree as a graph, or its "graph entropy" (as opposed to the entropy of only the final leaf nodes, as in standard FMC). It is still waiting for a "coding slot", and will be for some time!
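The difference between the two entropies can be illustrated with a toy tree. Note that this post does not give a formula for "graph entropy", so the version below (Shannon entropy of the visit counts over all nodes, instead of over the leaves only) is just one possible reading, used here as an assumption; the tree layout and visit counts are made up.

```python
import math
from typing import Dict, List


def shannon_entropy(counts: List[int]) -> float:
    """Shannon entropy, in bits, of the distribution given by raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def leaf_entropy(tree: Dict[str, List[str]], visits: Dict[str, int]) -> float:
    # Only nodes without children (the final leaves) enter the distribution.
    leaves = [node for node in visits if not tree.get(node)]
    return shannon_entropy([visits[node] for node in leaves])


def whole_tree_entropy(visits: Dict[str, int]) -> float:
    # Hypothetical "graph entropy": every node of the tree enters the distribution.
    return shannon_entropy(list(visits.values()))


# Made-up tree: root -> a, b; a -> a1, a2. Values are walker visit counts.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": []}
visits = {"root": 4, "a": 2, "b": 2, "a1": 1, "a2": 1}
print(leaf_entropy(tree, visits), whole_tree_entropy(visits))
```

The leaf version only sees where the paths ended, while the whole-tree version also reacts to how the paths got there, which is the kind of extra information a second FMC layer could use.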


There are some important pieces of the model left out and lots of implementation details worth mentioning, but this is basically all it takes to build an AGI: a sensory part that deals with inputs and builds a useful simulation of the world, an intelligent planning part that uses the simulation to scan the future and decide, and a final part that defines and modifies the reward function to follow.

In its current form, it is doable in a few years, but the part that has to learn the world dynamics, the ANN, is the weakest part and has the most difficult task, so it would be the bottleneck of the AGI.

A link to the next post will soon be added here.


  1. Thanks for posting. I have a question: shouldn't an artificial intelligence have a memory in order to make the most effective decisions, for which consciousness is responsible?

    1. Yes, you would need "memory" so as not to repeat the same search over and over in the planning module, but in this scheme it is considered part of the "learning" module, so that the probability of visiting a state gets lower once you have visited it many times.

      A different way to think about this learning module is to consider it just a memory module of what went well (to repeat it more often) and what went badly (to avoid repeating it), so that in the planning module these memories form a prior telling you how likely it should be to go in one direction or the other. Over time, this memory will make you look as if you knew what you were doing, so you have "learned" to move around your environment.

      After all, to properly store the states where things went well or badly, you need to compress all your sensory input data into some more compact internal representation, so memory needs to "understand" your state and convert it into an internal state with reduced dimensionality, not so different from a standard "learning" scheme.

      Learning is basically an efficient use of memory.
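
The point made in this reply, that memory makes an often-visited state less likely to be visited again, can be sketched as a prior over candidate states. The 1 / (1 + visits) weighting and the name novelty_prior are assumptions chosen for simplicity, not the rule used in the actual model.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def novelty_prior(candidates: Sequence[Hashable], visits: Counter) -> List[float]:
    """Probability of moving to each candidate state, lower for familiar states."""
    # Hypothetical weighting: each past visit makes a state less attractive.
    weights = [1.0 / (1 + visits[state]) for state in candidates]
    total = sum(weights)
    return [w / total for w in weights]


memory = Counter({"A": 3})  # state "A" was already visited three times
print(novelty_prior(["A", "B"], memory))
```

A memory of what went well or badly would work the same way, only with weights that grow for good outcomes and shrink for bad ones.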