 Intelligence is not Artificial

(These are excerpts from my book "Intelligence is not Artificial")

### The Search for an Artificial General Intelligence

There were many side tracks that didn't become as popular as expert systems and neural networks.

At the famous conference on A.I. of 1956 there was a third proposal for A.I. research. The Boston-based mathematician Ray Solomonoff presented "An Inductive Inference Machine" for machine learning, which sketched a universal general-purpose learner. Induction is the kind of learning that allows us to apply what we learned in one case to other cases. His method used Bayesian reasoning, i.e. it introduced probabilities in machine learning.

The problem of induction is to discover the theory (or, if you prefer, the cause) that explains the facts. If a number of people are sick and show the same symptoms, we want to find out what is the disease. We look for explanations to what happens in our world. In other words, induction answers the primordial question: "Why?" If we find the cause, we should be able to predict the future.

There are usually many possible causes, each of which explains the facts: those are hypotheses. In fact, there are an infinite number of them, although some are extremely unlikely (e.g., i could be writing this sentence because an evil force has hijacked my brain and is dictating me what to write).

The "Epicurean principle" is to consider all hypotheses that explain the facts. Mathematically speaking, facts are data and each hypothesis is an algorithm that consumes data and produces data: if it produces the data that we observe, then the algorithm represents a valid hypothesis, i.e. it could be the explanation that we are looking for.

Algorithms can be represented by Turing machines. Therefore the problem of induction is to find the best Turing machine that is compatible with our data. If there are many, we need a method to pick one. Solomonoff chose to use "Occam's razor", a medieval principle that basically picks the simplest explanation, i.e. the simplest algorithm, i.e. the simplest Turing machine, i.e. the least complex one.

This requires a definition of "complexity" in Turing machines, which are sequences of zeroes and ones. Solomonoff ("A Formal Theory of Inductive Inference", 1964) and the Russian mathematician Andrei Kolmogorov ("Three Approaches to the Quantitative Definition of Information", 1965) proposed a measure of an algorithm's complexity: the shortest possible description of it; or, equivalently, the least number of bits necessary to write its Turing machine. In today's jargon: the shortest program that computes it. For example, the complexity of "pi" is the ratio between a circumference and its diameter. Occam's razor can now be formulated mathematically, for example: a hypothesis that is one bit shorter is twice as likely to be the true hypothesis.

That is how Solomonoff ranked hypotheses, using Turing Machines to represent hypotheses and "algorithmic information theory" to quantify their complexity. Each hypothesis/ algorithm/ program can be assigned a "prior probability" proportional to its simplicity. Simpler hypotheses/ programs are assigned greater prior probability.

Bayes' theorem is then used to promote programs whose outputs match the facts/ data. This way the system learns to correctly predict anything using the minimum amount of data.

Solomonoff's inductive inference program is a universal predictor, and it's the best that one can build because all possible solutions are contained in Solomonoff's giant factory of programs, including anything that you can think.

Unfortunately, Solomonoff induction cannot be computed because it requires infinite calculations. Then it became a matter of finding approximations to Solomonoff's perfect algorithm. First came the Ukrainian mathematician Leonid Levin, who had studied under Kolmogorov, with a tractable but still impractical approximation ("The Complexity of Finite Objects and the Development of Concepts of Information and Randomness by Means of the Theory of Algorithms", 1970). Incidentally, Levin later proved the existence of "NP-complete" problems, one of the great theorems of 20th century mathematics ("Universal Sequential Search Problems", 1972). Solomonoff himself worked on a "scientist's assistant " capable of solving two general classes of problems ("The Application of Algorithmic Probability to Problems in Artificial Intelligence," 1986).