The Nature of Consciousness

Piero Scaruffi

(Copyright © 2013 Piero Scaruffi)

These are excerpts and elaborations from my book "The Nature of Consciousness"

Energy-based Models

In 1982 the US physicist John Hopfield ("Neural Networks And Physical Systems With Emergent Collective Computational Abilities") revived the field by proving the second milestone theorem of neural networks. He developed a model inspired by "spin glass" materials, which resembles a one-layer neural network in which: weights are symmetric; the learning rule is "Hebbian" (the rule, originally proposed by the Canadian psychologist Donald Hebb, that the connection between two neurons strengthens the more often they are active together); neurons are binary; and each neuron is connected to every other neuron. As they learn, Hopfield's nets develop configurations that are dynamically stable (or "ultrastable"). Their dynamics are dominated by a tendency towards a large number of locally stable states, or "attractors": every memory is a local "minimum" of an energy function similar to potential energy. Hopfield's argument, based on Physics, proved that, despite Minsky's critique, neural networks are feasible.
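The energy picture can be made concrete in a few lines of code. The following is a minimal sketch, not Hopfield's original formulation; the function names and the six-neuron pattern are invented for illustration. Weights are symmetric and Hebbian, neurons are binary (+1/-1), and asynchronous updates descend the energy function until the state settles into a nearby attractor, recovering a stored memory from a corrupted cue.

```python
import numpy as np

def train(patterns):
    # Hebbian, symmetric weights: w_ij = sum over patterns of x_i * x_j,
    # with no self-connections (zero diagonal).
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)
    return W

def energy(W, x):
    # Each stored memory is a local minimum of this energy function.
    return -0.5 * x @ W @ x

def recall(W, x, steps=10):
    # Asynchronous binary updates lower the energy until the state
    # settles into a dynamically stable attractor.
    x = x.copy()
    for _ in range(steps):
        for i in range(len(x)):
            x[i] = 1 if W[i] @ x >= 0 else -1
    return x

memory = np.array([[1, -1, 1, -1, 1, -1]])
W = train(memory)
noisy = np.array([1, -1, 1, -1, -1, -1])  # stored pattern with one bit flipped
print(recall(W, noisy))                   # → [ 1 -1  1 -1  1 -1]
```

Note that the stored pattern has lower energy than its corrupted version, which is exactly why the update dynamics flow back towards the memory.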

Hopfield's key intuition was to note the similarity with statistical mechanics. Statistical mechanics translates the laws of Thermodynamics into statistical properties of large sets of particles. The fundamental tool of statistical mechanics (and soon of this new generation of neural networks) is the Boltzmann distribution (actually discovered by Josiah Willard Gibbs in 1901), a formula for the probability that a physical system is in a given state.
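The formula itself is short. A sketch, with arbitrary illustrative energies and units chosen so that Boltzmann's constant is 1: the probability of a state with energy E at temperature T is proportional to exp(-E/T), normalized by the sum over all states (the "partition function").

```python
import math

def boltzmann(energies, T):
    # P(state i) = exp(-E_i / T) / Z, where Z normalizes over all states.
    weights = [math.exp(-E / T) for E in energies]
    Z = sum(weights)  # partition function
    return [w / Z for w in weights]

energies = [0.0, 1.0, 2.0]
print(boltzmann(energies, T=1.0))    # low-energy states dominate
print(boltzmann(energies, T=100.0))  # high temperature: nearly uniform
```

The temperature parameter is what the Boltzmann machine, discussed next, gradually lowers during annealing: at high T all states are roughly equally likely, while at low T the system concentrates on the lowest-energy states.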

Research on neural networks picked up again. In 1982 Kunihiko Fukushima built the "Neocognitron", based on a model of the visual system. 

Building on Hopfield's ideas, the British computer scientist Geoffrey Hinton and Terrence Sejnowski ("Massively Parallel Architectures for A.I.", 1983) developed an algorithm for the "Boltzmann machine", which applies simulated annealing to a Hopfield-style network.  In that machine, Hopfield's deterministic update rule is replaced with the rule of annealing in metallurgy (start the system at a very high "temperature" and then gradually drop the temperature towards zero), which several mathematicians were proposing as a general-purpose optimization method.  In this model, therefore, units update their state based on a stochastic decision rule.  The Boltzmann machine turned out to be even more stable than Hopfield's net, as in principle it always ends in a global minimum (the lowest energy state).
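Simulated annealing can be sketched on a toy optimization problem; the cost function, cooling schedule, and parameter values below are invented for illustration, not taken from the Boltzmann machine literature. The key move is stochastic: a worse state is accepted with probability exp(-dE/T), so at high temperature the system explores freely and can escape local minima, while as T falls it freezes into a low-energy state.

```python
import math
import random

def anneal(f, x0, T0=10.0, cooling=0.995, steps=5000, seed=0):
    # Start hot, cool gradually; accept uphill moves with
    # probability exp(-dE / T) (the Boltzmann factor).
    rng = random.Random(seed)
    x, T = x0, T0
    for _ in range(steps):
        candidate = x + rng.uniform(-1, 1)
        dE = f(candidate) - f(x)
        if dE < 0 or rng.random() < math.exp(-dE / T):
            x = candidate
        T *= cooling  # lower the temperature towards zero
    return x

# A bumpy cost function: local minima, with the global minimum near x ≈ 2.
f = lambda x: (x - 2) ** 2 + math.sin(5 * x)
print(anneal(f, x0=-8.0))
```

A purely greedy descent from x = -8 could get trapped in one of the intermediate dips; the high-temperature phase lets the search jump over those barriers before the schedule freezes it.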

Probabilistic reasoning had been introduced into Artificial Intelligence by the Israeli computer scientist Judea Pearl with his “Bayesian networks” (1985).
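The core idea of a Bayesian network can be shown with a toy example of my own (the rain/sprinkler variables and all the numbers are invented for illustration): the network factorizes a joint distribution into local conditional probabilities, one per node given its parents in the graph.

```python
# Toy Bayesian network: Rain -> Sprinkler, and both -> WetGrass.
# Joint factorization: P(R, S, W) = P(R) * P(S | R) * P(W | R, S)

P_rain = {True: 0.2, False: 0.8}
P_sprinkler = {True: {True: 0.01, False: 0.99},   # P(S | R): rarely on if raining
               False: {True: 0.4, False: 0.6}}
P_wet = {(True, True): 0.99, (True, False): 0.8,  # P(W=True | R, S)
         (False, True): 0.9, (False, False): 0.0}

def joint(r, s, w):
    # Multiply the local conditional probabilities along the graph.
    pw = P_wet[(r, s)]
    return P_rain[r] * P_sprinkler[r][s] * (pw if w else 1 - pw)

# Marginal probability that the grass is wet, by summing out R and S:
p_wet = sum(joint(r, s, True) for r in (True, False) for s in (True, False))
print(round(p_wet, 4))  # → 0.4484
```

The payoff of the factorization is economy: three small local tables replace one table over all 2^3 joint assignments, and the saving grows exponentially with the number of variables.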

In 1974 Paul Werbos proposed a "backpropagation" algorithm for neural networks. In 1986 David Rumelhart, Geoffrey Hinton and Ronald Williams rediscovered Werbos' backpropagation algorithm ("Learning Representations by Back-propagating Errors", 1986), a "gradient-descent" algorithm that quickly became the most popular learning rule.

The generalized "Delta Rule" was basically an adaptation of the Widrow-Hoff error-correction rule to the case of multi-layered networks, propagating the error backwards from the output layer to the input layer. This was also the definitive answer to Minsky's critique, because multi-layered networks trained this way can learn nonlinearly-separable functions, such as the XOR, that single-layer perceptrons cannot.  Rumelhart and Hinton focused on gradient-descent learning procedures: each connection computes the derivative, with respect to its strength, of a global measure of the error in the performance of the network, and then adjusts its strength in the direction that decreases the error.  In other words, the network adjusts itself to counter the error it made.  Tuning a network to perform a specific task is a matter of stepwise approximation.
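The procedure can be sketched on the XOR problem itself. This is a minimal illustration, not the 1986 paper's code; the architecture (one hidden layer of four sigmoid units), learning rate, and iteration count are my own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: the classic task a single-layer perceptron cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)   # input -> hidden
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)   # hidden -> output
sigmoid = lambda z: 1 / (1 + np.exp(-z))
lr = 1.0

for _ in range(10000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the error derivative from the output
    # layer back to the input layer, then step against the gradient.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(np.round(out, 2))
```

Each pass nudges every weight in the direction that reduces the global error, so the outputs approach the XOR targets by stepwise approximation, exactly the tuning process described above.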

The problem with these methods was that they were cumbersome (if not plain impossible) when applied to deeply-layered neural networks, precisely the ones needed to mimic what the brain does.
