The Nature of Consciousness

Piero Scaruffi

(Copyright © 2006 Piero Scaruffi)

Cognition: A General Property of Matter

(These are excerpts from, or extensions to, the material published in my book "The Nature of Consciousness")

Cognition

Cognition is the set of faculties that allow the mind to process stimuli from the external world and to determine action in the external world. They comprise perception, learning, memory, reasoning and so forth. Basically, we perceive something, we store it in memory, we retrieve related information, we process the whole, we learn something, we store it in memory, we use it to decide what to do next. All of these functions make up cognition.

Is all of cognition conscious? Is there something that we remember, learn or process without being aware of it? Probably. At least, the level of awareness may vary wildly. Sometimes we study a poem until we can remember all the words in the exact order: that requires a lot of awareness. Sometimes we simply store an event without paying too much attention to it. Consciousness is like another dimension. One can be engaged in this or that cognitive task (first dimension) and one can be aware of it at different levels of intensity (second dimension). It is, therefore, likely that cognitive faculties and consciousness are independent processes.

Since it processes inputs and yields outputs, cognition has the invaluable advantage that it lends itself to modeling and testing endeavours, in a more scientific fashion than studies on consciousness.

Language too is a cognitive process. Given its importance for humans, it deserves a separate treatment, but it is likely that language's fundamental mechanisms are closely related to the mechanisms that support the other faculties.

Mediation

Over the last few decades, psychologists have been deeply influenced by the architecture of the computer. When it appeared, it was immediately apparent that the computer was capable of performing sophisticated tasks that went beyond mere Arithmetic, although they were performed by a complex layering of arithmetic sub-tasks. The fact that the computer architecture was able to achieve so much with so little led to the belief that the human mind could also be reduced to a rational architecture of interacting modules and sequential processes of computation.

In the second half of the 19th century, the German physiologist and physicist Hermann Helmholtz pioneered modern thinking about cognition when he advanced his theory that perception and action were mediated by a (relatively slow) process in the brain. The "reaction time" of a human being is high because neural conduction is slow. His studies emphasized that the stimulus must first be delivered to the brain and the idea of action must first be delivered to the limbs before anything can occur. Helmholtz thought that humans have no innate knowledge, that all our knowledge comes from experience. Perceptions are derived from unconscious inference on sense data: our senses send signals to the brain, which are interpreted by the brain and then turned by the brain into knowledge. Perceptions are mere hypotheses on the world, which may well be wrong (as proven by optical illusions). Perceptions are hypotheses based on our knowledge. Knowledge is acquired from perceptions. This paradigm became the "classical" paradigm of cognition.

Representation

The British psychologist Kenneth Craik speculated in 1943 that the human mind may be a particular type of machine which is capable of building internal models of the world (representing knowledge) and of processing them to produce action (making inferences). Craik's improvement over Descartes' automaton (limited to mechanical reactions to external stimuli) was considerable because it involved the ideas of an "internal representation" and a "symbolic processing" of such representation. Descartes' automaton had no need for knowledge and inference. Craik's automaton needs knowledge and inference. It is the inferential processing of knowledge that yields intelligence.

Symbol Processing

Craik's ideas predated the theory of knowledge-based systems, which were born after the USA economist Herbert Simon and the USA mathematician Allen Newell developed their theory of physical symbol systems. Both the computer and the mind belong to the category of physical symbol systems, systems that process symbols to achieve a goal. A physical symbol system is quite simple: the complexity of its behavior is due to the complexity of the environment it has to cope with.

It was their belief that no complex system can survive unless it is organized as a hierarchy of subsystems. The entire universe must be hierarchical, otherwise it would not exist.

Production

Soon, the most abused model of cognitive psychology became one in which a memory system containing knowledge is operated upon by an inference engine; the results are added to the knowledge base and the cycle resumes indefinitely. For example, I may infer from my knowledge that it is going to rain and therefore add to my knowledge base that there is a need for umbrellas.

In this fashion, knowledge is continuously created, and pieces of it represent solutions to problems. Every new piece of knowledge, whether acquired from the external world or inferred from the existing knowledge, may trigger any number of inferential processes, which can proceed in parallel. Since knowledge is mainly represented via "production rules" (rules that state that something becomes true when something else has become true), these systems are referred to as "production systems". A production rule is, ultimately, a formula of classical Logic.
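
The cycle described above can be sketched in a few lines of Python. This is only a minimal illustration of a production system, not any specific historical implementation: the function name forward_chain, the rule format and the umbrella rules are assumptions made for the example. The sketch also stops when no rule can add anything new, whereas the cycle described above resumes whenever new knowledge arrives.

```python
def forward_chain(facts, rules):
    """Apply production rules until no rule adds a new fact.

    facts: iterable of strings (the knowledge base).
    rules: list of (premises, conclusion) pairs; a rule fires when
           all of its premises are already in the knowledge base.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)  # the inferred fact joins the knowledge base
                changed = True         # a new fact may trigger further rules
    return known


# Hypothetical rules for the umbrella example.
RULES = [
    (("dark clouds", "falling pressure"), "rain is coming"),
    (("rain is coming",), "need umbrella"),
]

knowledge = forward_chain({"dark clouds", "falling pressure"}, RULES)
print(sorted(knowledge))
```

Note that the second rule can fire only because the first one added "rain is coming" to the knowledge base: inferred knowledge triggers further inference.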

John Anderson's ACT (1976) was a cognitive architecture capable of dealing with both declarative knowledge (represented by propositional networks) and procedural knowledge (represented by production rules). Declarative knowledge ("knowing that") can be consulted, whereas procedural knowledge ("knowing how") must be enacted in order to be used.

The relationship between the two types of knowledge is twofold. On one hand, the production system acts as the interpreter of the propositional network to determine action. On the other hand, knowledge is continuously compiled into more and more complex procedural chunks through an incremental process of transformation of declarative knowledge into procedural knowledge. Complex cognitive skills can develop from a simple architecture, as new production rules are continuously learned.

Anderson, therefore, thought of a cognitive system as having two long-term memories: a "declarative" memory (that remembers experience) and a "procedural" memory (that remembers rules learned from experience).

Anderson also developed a probabilistic method to explain how categories are built and how prototypes are chosen. Anderson's model maximizes the "inferential potential" of categories (i.e., their "usefulness"): the more a category helps predict the features of an object, the more the existence of that category makes sense. For each new object, Anderson's model computes the probability that the object belongs to one of the known categories and the probability that it belongs to a new category: if the latter is greater than the former, a new category is created.
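
The decision rule can be illustrated with a small sketch. It loosely follows the general shape of a probabilistic categorization model, but the details are assumptions made for the example and are not taken from the text: the "coupling" parameter (a prior tendency of objects to share a category), the smoothed frequency estimate and the function name assign are all hypothetical.

```python
def assign(obj, categories, coupling=0.5, n_values=2):
    """Place obj in the most probable category, creating one if needed.

    obj: tuple of discrete feature values.
    categories: list of lists of previously seen objects (mutated in place).
    coupling: assumed prior tendency of two objects to share a category.
    n_values: assumed number of possible values per feature.
    Returns the index of the chosen category.
    """
    n = sum(len(members) for members in categories)
    norm = (1 - coupling) + coupling * n

    def likelihood(members):
        # How well the category predicts each feature of obj
        # (smoothed frequency of obj's value among the members).
        p = 1.0
        for i, value in enumerate(obj):
            matches = sum(1 for m in members if m[i] == value)
            p *= (matches + 1) / (len(members) + n_values)
        return p

    scores = [coupling * len(m) / norm * likelihood(m) for m in categories]
    new_score = (1 - coupling) / norm * likelihood([])
    if not scores or new_score > max(scores):
        categories.append([obj])          # a new category makes more sense
        return len(categories) - 1
    best = scores.index(max(scores))
    categories[best].append(obj)
    return best


categories = []
labels = [assign(o, categories) for o in [(1, 1, 1), (1, 1, 0), (0, 0, 0)]]
print(labels)
```

With these assumed parameters the second object joins the category of the first (it shares two features out of three), whereas the third object, which shares almost nothing with them, founds a new category.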

Later editions of the architecture organize knowledge in three levels: a knowledge level (information acquired from the environment plus innate principles of inference), an algorithmic level (internal deductions, inductions and compilations) and an implementation level (setting parameters for the encoding of specific pieces of information).

Newell, with the help of John Laird and Paul Rosenbloom, proposed a similar architecture, SOAR (1987), based on three powerful concepts. The "universal weak method" is an organizational framework whereby knowledge determines the inferential methods employed to solve the problem, i.e. knowledge controls the behavior of the rational agent. "Universal sub-goaling" is a process whereby goals can be created automatically to deal with the difficulties that the rational agent encounters during problem solving. A model of practice is developed based on the concept of "chunking", which is meant to produce the "power law of practice" that characterizes the improvements in human performance during practice at a given skill: the more you practice, the better you get at it. Within SOAR, each task has a goal hierarchy. When a goal is successfully completed, a chunk that represents the results of the task is created. In the next instance of that task, the system will not need to fully process it because the corresponding chunk already contains the instructions to achieve its goal. The process of chunking proceeds bottom-up in the goal hierarchy. The process will eventually lead to a chunk for the top-level goal for every situation that it can encounter.
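
The effect of chunking on a goal hierarchy can be approximated with a toy sketch. It is not SOAR: here a "chunk" is simply a cached result (plain memoization), the class Solver and the tea-making goals are hypothetical, and the sketch makes no attempt to reproduce the power law of practice.

```python
class Solver:
    """Toy goal solver that stores a chunk for each completed goal."""

    def __init__(self, hierarchy):
        self.hierarchy = hierarchy  # goal -> list of subgoals (none for primitives)
        self.chunks = {}            # goal -> stored result
        self.steps = 0              # units of deliberate processing spent so far

    def solve(self, goal):
        if goal in self.chunks:     # a chunk exists: no further processing needed
            return self.chunks[goal]
        self.steps += 1
        parts = [self.solve(sub) for sub in self.hierarchy.get(goal, [])]
        result = goal if not parts else f"{goal}({', '.join(parts)})"
        self.chunks[goal] = result  # subgoals are chunked before their parent
        return result


# Hypothetical goal hierarchy.
GOALS = {
    "make tea": ["boil water", "steep leaves"],
    "boil water": ["fill kettle", "heat kettle"],
}

solver = Solver(GOALS)
solver.solve("make tea")
first = solver.steps            # cost of the first attempt
solver.solve("make tea")
second = solver.steps - first   # cost of the second attempt
print(first, second)
```

The first attempt processes all five goals; the second one costs nothing because the top-level chunk already exists. The chunks are created bottom-up: the chunk for "fill kettle" is stored first and the one for "make tea" last.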

These production systems are architectures advanced by the proponents of the symbolic processing approach in order to explain how the mind goes about acting, solving problems and learning how to solve new problems.

Mental Modules

Vision is one of the most important and complex cognitive faculties.

In 1982 the British psychologist David Marr delivered a landmark study on vision and, along the way, devised an influential cognitive architecture. Marr concluded that our vision system must employ innate information to decipher the ambiguous signals that it perceives from the world.

Processing of perceptual data must be performed by "modules", each specialized in some function, which are controlled by a central module. In a fashion similar to the linguist Noam Chomsky and the philosopher Jerry Fodor, Marr assumes that the brain must contain semantic representations that are innate and universal (i.e., of biological nature) in the form of modules that are automatically activated. The processing of such representations is purely syntactical. Marr, Chomsky and Fodor advanced the same theory of the mind, albeit from three different perspectives: they all believe that the mind can be decomposed into modules, and they all believe that syntactical processing can account for what the mind does.

Specifically, Marr explained the cognitive faculty of vision as a process in several steps. First, the physical stimulus from the world is received (in the form of physical energy) by transducers, that transform it into a symbol (in the form of a neural code) and pass it on to the input modules. Then these modules extract information and send it to the central module in charge of higher cognitive tasks.

Each module corresponds to neural subsystems in the brain. The central module exhibits the property of being "isotropic" (able to build hypotheses based on any available knowledge) and "Quinian" (the degree of confirmation assigned to a hypothesis is conditioned by the entire system of beliefs).

The visual system is thus decomposed into a number of independent subsystems. They provide a representation of the visual scene at three different levels of abstraction: the "primal sketch", which is a symbolic representation drawn from the meaningful features of the image (anything causing sudden discontinuities in light intensity, such as boundaries, contours, shading, textures); a two-and-a-half dimensional sketch, which is a representation centered on the visual system of the observer (e.g., describing the surrounding surfaces and their properties, mainly distances and orientation) and computed by a set of modules specialized in parameters of motion, shape, color, etc.; and finally the tri-dimensional representation, which is centered on the object and is computed according to some rules (Shimon Ullman's "correspondence rules").

This final representation is what is used for memory purposes. Not what the retina picked up, but what the brain computed.

Dimensions of Cognition

The obvious criticism against production systems is that they don't "look like" our brain.

David Marr claimed that a scientist can choose any of three levels of analysis: the computational level (which mathematical function the system must compute, i.e. an account of human competence), the algorithmic level (which algorithm must be used, i.e. an account of human performance) or the physical level (which mechanism must implement the algorithm). Different sciences correspond to different levels: Cognitive Science studies the mind at the computational level, whereas Neurology and Eye Medicine study it at the physical level.

Newell refined that vision by dividing cognition into several levels. The symbol level (or program level) represents and manipulates the world in the form of symbols. The knowledge level is built on top of the symbol level and is the level of rational agents: an agent is defined by a body of knowledge, some goals to achieve and some actions that it can perform. An agent's behavior is determined by the "principle of rationality": the agent performs those actions that, on the basis of the knowledge it has, will bring it closer to the goals. General intelligent behavior requires symbol-level systems and knowledge-level systems.

Newell then broadened his division of cognitive levels by including physical and biological states.

The whole hierarchy can be divided into four bands: neural, cognitive, rational and social.

The cognitive band can be divided based on response times: at the memory level the response time (the time required to retrieve the referent of a symbol) is about 10 milliseconds; at the decision level it is 100 milliseconds (the time required to manipulate knowledge); at the compositional level it is one second (the time required to build actions); at the execution level it is 10 seconds (the time required to perform the action).

At the rational band the system appears as a goal-driven organism, capable of processing knowledge and of exhibiting adaptive behavior.

Mental Models

The British psychologist Philip Johnson-Laird has questioned both the plausibility and the adequacy of a cognitive model based on production rules. A mind that only used production rules, i.e. Logic, would behave in a fundamentally different way from ours. People often make mistakes with deductive inference because it is not a natural way of thinking. The natural way is to construct mental models of the premises: a model of discourse has a structure that corresponds directly to the structure of the state of affairs that the discourse describes. For the same reason children are able to acquire inferential capabilities before they have any inferential notions: children solve problems by building mental models that are more and more complex, not by applying the rules of classical Logic (that they have not learned yet).

In his view, the mind represents and processes models of the world. The mind solves problems without any need to use logical reasoning. A sentence is a procedure to build, modify, extend a mental model. The mental model created by a discourse exhibits a structure that corresponds directly to the structure of the world described by the discourse. To perform an inference on a problem the mind needs to build the situation described by its premises. Such a mental model simplifies reality and allows the mind to find an "adequate" (not necessarily "exact") solution.

Johnson-Laird's theory admits three types of representation: "propositions" (which represent the world through sequences of symbols), "mental models" (which are structurally analogous to the world) and "images" (which are perceptive correlates of models). Images are ways to approach models. They represent the perceivable features of the corresponding objects of the real world. Models, images and propositions are functionally and structurally different. Linguistic expressions are first transformed into propositional representations. The semantics of the mental language then creates correspondences between propositional representations and mental models, i.e. propositional representations are interpreted in mental models.

But the key to understanding how the mind works is in the mental models.

The French linguist Gilles Fauconnier advocates a similar vision in his theory of "mental spaces". Mental spaces proliferate as we think or talk. The mappings that link mental spaces, especially analogical mappings, play a central role in building our mental life. In particular, "conceptual blending" is a cognitive process which can be detected in many different cognitive, cultural and social activities. By merging different inputs, it creates a blended mental space that lends itself to what we call "creative" thinking. Therefore, Fauconnier finds that the same principles that operate at the level of meaning construction operate also at the level of scientific and artistic action.

The USA linguist George Lakoff has given mental spaces an internal structure with his theory of "idealized cognitive models" that are embodied, i.e. they are linked with bodily experience.

Mental Imagery

"Mental imagery" is seeing something in the absence of any sensory signal, such as visualizing an object that is not actually present. The mystery is what is seen if in the brain there is no such image. When I stare at an object, I "see" the image that the visual system creates in the brain (whatever projection of dots through the retina to this or that region of the brain). But am I "seeing" when I am simply imagining a Ferrari?

Scientists have found no pictures or images in the brain, no internal eye to view pictures stored in memory and no means to manipulate them. Nevertheless, there is an obvious correspondence between a mental image of an object and the object.

The USA psychologist Ronald Finke, for example, has identified five principles of equivalence between a mental image and the perceived object: the principle of implicit encoding (information about the properties of an object can be retrieved from its mental image), the principle of spatial equivalence (parts of a mental image are arranged in a way that corresponds to the way that the parts of the physical object are arranged), the principle of perceptual equivalence (similar processes are activated in the brain when the objects are imagined as when they are perceived), the principle of transformational equivalence (imagined transformations and physical transformations are governed by the same laws of motion), and the principle of structural equivalence (the mental image exhibits structural features corresponding to those of the perceived object, such that the relations between the object's parts can be both preserved and interpreted).

During the 1980s the debate became polarized around two main schools of thought: either (the USA psychologist Stephen Kosslyn) the brain maintains mental pictures that somehow represent the real-world images, or (the USA psychologist Zenon Pylyshyn) the brain represents images through a "non-imaginistic" system, namely language, i.e. all mental representations are descriptive and not pictorial (there are no picture-like representations in the brain).

Kosslyn put forth a representational theory of the mind of a "pictorial" type, as opposed to Jerry Fodor's propositional theory and related to Philip Johnson-Laird's mental models. Kosslyn thinks that the brain builds visual representations, which are coded in parts of the brain, and which reflect what they represent. Mental imagery involves scanning an internal picture-like entity. Mental images can be inspected and classified using pretty much the same processes used to inspect and classify visual perceptions. For example, they can be transformed (rotated, enlarged, reduced).

There exist two levels of visual representation: a "geometric" level, which allows one to mentally manipulate images, and an "algebraic" one, which allows one to "talk" about those images.

Kosslyn thinks that mental imagery achieves two goals: retrieve properties of objects, and predict what would happen if the body or the objects should move in a given way. Reasoning on shapes and dimensions is far faster when we employ mental images than concepts.

Kosslyn's is a theory of high-level vision in which perception and representation are inextricably linked. Visual perception (visual object identification) and visual mental imagery share common mechanisms.

Opposed to Kosslyn's "pictorialism" is Pylyshyn's "descriptionalism". Pylyshyn believes in a variant of Fodor's language of thought: to him images are simply the product of the manipulation of knowledge encoded in the form of propositions.

The "dual coding theory" of the Canadian psychologist Allan Paivio mediates these positions because it argues that the mind may use two different types of representation, a verbal one and a visual one, corresponding to the brain's two main perceptive systems. They both "encode" memories, but they do so in different ways (codes).

The neural processes that correspond to mental imagery in seeing people have also been detected in the brains of congenitally blind people. Thus mental imagery cannot possibly depend on forming a mental reconstruction of former visual sense experiences.

Mindsight

British philosopher Colin McGinn believes that percepts (the actual seeing of an object) and images (visualizing the object in the mind) are basically different "substances". McGinn comes up with a list of (nine) properties that differentiate them: percepts are unwanted (I see a tree when I see a tree, not when I want to see a tree), percepts contain information (images come from our minds and therefore we already have whatever information we put into visualizing them), percepts are located somewhere relative to our body (whereas images are located in an abstract space of the mind), percepts don't come alone but with a background each point of which is in turn a percept, percepts exist whether we focus on them or not (whereas images exist only when we consciously visualize them), percepts are prone to error (I may recognize somebody who in fact was somebody else, whereas images are error-free because they are images of what I want to visualize), percepts do not block thinking about them (images do block thinking about them), etc.

He neglects one that most people would consider the most obvious one: I can perceive things that I have never imagined, but I cannot imagine things that I have never perceived. (I can construct images of objects that I have never seen, but those constructions are made out of objects that I have seen).

Furthermore he is not fair to images. I do visualize things and people without wanting to. If I stumble upon the name of a friend while I am reading a novel, I can’t help visualizing my friend. It is not my "will" that visualizes my friend, but some kind of conditioned reflex. Thus a percept may "force" an image, and the relationship between percept and image is not so obvious (in this case it is the relationship between a five-letter word and a human face).

McGinn thinks that images constitute the core of our cognitive life. Dreams, for example, are complex systems of mental images. He has to posit a split in the self, a dream producer versus a dream consumer, because he has posited that images are active (dreams do not look active, they look more like percepts: the way we react to dreams is exactly the way we react to percepts). His idea is that dreams are passive for the self qua consumer.

McGinn also believes that the development of logic and language was triggered by the emergence of imagination: in order to understand a sentence one has to be capable of "imagining" its content. Imagination came first, meaning came later. This sounds more convincing, although he spends precious little time on such a monumental topic.

Imagination thus becomes a fundamental cognitive faculty, the one on which the most sophisticated cognitive faculties depend. The ability to create and manipulate mental images is, in a sense, what gives us an inner life.

One big problem with mental images is that we call them "images". In reality, when I recall a friend, I also recall his voice, and, when I recall an ice cream, I also recall its taste. I even recall my dislike for spiders whenever I recall a spider. Thus "mental images" are actually not images at all. They are indeed something very different from visual percepts because... they are not "visual".

Semantics and Pragmatics of Vision

A new paradigm for the study of human vision was introduced by the Canadian physiologist Melvyn Goodale and the British psychologist David Milner. They analyzed the visual pathways in the cerebral cortex (the "visual cortex") and realized that vision is actually a binary operation made of dual processes: on one hand is the conscious visual experience of the world, on the other hand is the visual control of unconscious (instinctive) action. They both require the eye as the organ, but they are functionally and structurally different processes.

Sensory information received from the eye diverges into two streams (two anatomically different pathways) when it leaves the visual cortex: a "ventral" stream flows from the primary visual cortex towards the inferior temporal lobe, while the "dorsal" stream flows from the primary visual cortex towards the posterior parietal lobe. This means that the ventral stream analyzes what object the eye is seeing, whereas the dorsal stream analyzes the spatial location of the object. In other words, the ventral stream is about recognition (e.g. of faces or objects) and conscious perception, whereas the dorsal stream is about automatic, unconscious action in space directed towards the object (typically action by the hand).

In a sense, there exist two kinds of vision: conscious perception and unconscious action. They are physically handled by two separate systems in the brain. There isn't a visual system: there are two visual systems that work in parallel.

Studying the difference between these two functional roles of the visual system, the French neurologist Marc Jeannerod had advanced the theory that there are two information processing systems for visual input: the "semantic processing system" (that yields a perceptual representation of the object) and the "pragmatic processing system" (that yields a motor representation of the same object).

The "dorsal" visual system that is common to most animals helps the animal carry out the orienting function: to detect movement (typically, either danger or food) and to guide action. The ventral visual system of the human brain is relatively distinct from the dorsal system and is related to the "semantic processing system" that analyzes and "recognizes" the object. This allows the human brain to carry out more sophisticated actions in response to a visual act. Human vision therefore originated when the functions of orientation and identification got separated.

The Frame

In the 1920s the German psychologist Otto Selz had a fundamental intuition: solving a problem entails recognizing that the situation represented by the problem is described by a known "schema" and filling the gaps in the schema. A schema is a network of concepts that organizes past experience.

Given a problem, the cognitive system searches the (long-term) memory for a schema that can represent it. Given the right schema, the information that the schema holds in excess of the problem's description contains the solution.

Representation of past experience is a complete schema. Representation of present experience is a partially complete schema. By comparing the two representations (the complete schema that was created in the past with the partial schema that describes the current situation) one can infer (or, better, "anticipate") something relative to the present situation. For example, a schema tells us how to cross a street. Whenever we want to cross a street, we look for (i.e., we know that there must be) a traffic light. Thanks to the schema's anticipatory nature, solving a problem is equivalent to comprehending it, and comprehending ultimately means reducing the current situation to a past situation.
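
The comparison between a complete schema and a partial one can be sketched as a simple matching of slots. This is only an illustration under assumed conventions: schemas are represented as Python dictionaries, the function names are hypothetical, and the "wait for" slot is invented for the example.

```python
def best_schema(partial, schemas):
    """Pick the stored schema that agrees with the most slots of the partial one."""
    def overlap(schema):
        return sum(1 for key, value in partial.items() if schema.get(key) == value)
    return max(schemas, key=overlap)


def anticipate(partial, schemas):
    """Return the slots that the matching schema supplies beyond the partial one."""
    schema = best_schema(partial, schemas)
    return {key: value for key, value in schema.items() if key not in partial}


# Complete schemas created by past experience (hypothetical content).
SCHEMAS = [
    {"situation": "crossing a street", "look for": "traffic light", "wait for": "green signal"},
    {"situation": "entering a restaurant", "look for": "host"},
]

# The present situation is a partially complete schema.
present = {"situation": "crossing a street"}
print(anticipate(present, SCHEMAS))
```

Whatever the matching schema contains beyond the description of the present situation is what gets "anticipated": here, that there must be a traffic light.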

The USA psychologist Edward Tolman also thought that learning involves acquisition of knowledge about the world, or the creation of a "cognitive map". Such a cognitive map helps us navigate the world, or, better, respond to the world, because it encodes our "expectations" about the world.

In the 1970s Marvin Minsky rediscovered Selz’s ideas: his "frame" is but a variation on Selz’s schema.

A "frame" is a packet of information that helps recognize and understand a scene. It represents stereotypical situations and finds shortcuts to ordinary problems. A frame is the description of a category by means of a prototypical member plus a list of actions that can be performed on any member of the category. A prototype is described simply by a set of default properties. Default values, in practice, express a lack of information, which can be remedied by new information. Any other member of the category can be described by a similar frame that customizes some properties of the prototype.

Technically, a frame can provide multiple representations of an object: taxonomic (a conjunction of classification rules), descriptive (a conjunction of propositions of the default values) and functional (a proposition on the admissible predicates).
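
The idea of a prototype with default properties that other members customize can be sketched as follows. The class Frame and the bird example are assumptions made for illustration; the sketch covers only default values, not the list of actions or the multiple representations mentioned above.

```python
class Frame:
    """A category described by default properties, optionally customizing a prototype."""

    def __init__(self, name, prototype=None, **properties):
        self.name = name
        self.prototype = prototype
        self.properties = properties

    def get(self, slot):
        # Specific information overrides the default; otherwise
        # the value is inherited from the prototype.
        if slot in self.properties:
            return self.properties[slot]
        if self.prototype is not None:
            return self.prototype.get(slot)
        raise KeyError(slot)


bird = Frame("bird", can_fly=True, legs=2)
penguin = Frame("penguin", prototype=bird, can_fly=False)
print(penguin.get("can_fly"), penguin.get("legs"))
```

The penguin frame states only what differs from the prototype: in the absence of information about its legs, the default value of the prototype is assumed.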

Memory is a network of frames, one for each known concept. Each perception selects a frame (i.e., classifies the current situation in a schema) which must then be adapted to that perception; and this is equivalent to interpreting the situation and deciding which action must be performed. Reasoning is adapting a frame to a situation. Knowledge imposes coherence on experience.

Because it does not separate cognitive phenomena such as perception, recognition, reasoning, understanding and memory which seem to occur always at the same time, the frame is more biologically plausible than other forms of knowledge representation that treat them as independent and sequential processes. Moreover, it offers computational advantages, because it focuses reasoning on the information that is relevant to the situation at hand.

Minsky later generalized the idea of the frame in a more ambitious model of how memory works. When a perception, or a problem-solving task, takes place, a data structure called "K-Line" (Knowledge Line) records the current activity (all the "agents" active at that time in memory). The recall of that event or problem is a process of rebuilding what was active (the agents that were active) in memory at that time. Agents are not all attached the same way to K-lines. Strong connections are made at a certain level of detail, the "level-band", whereas weaker connections are made at higher and lower levels. Weakly activated features correspond to assumptions by default, which stay active only as long as there are no conflicts. K-lines connect to K-lines and eventually form societies of their own.

The Script

In the 1970s, Roger Schank employed similar ideas in his model of "conceptual dependency" and in his theory of "case-based reasoning".

Case-based reasoning is a form of analogical reasoning in which the elementary unit is the "case", or situation. A type of memory called "episodic" archives generalizations of all known cases. Whenever a new case occurs, similar cases are retrieved from episodic memory. Then two things happen. First, the new case is interpreted based on any similar cases that were found in the episodic memory. Second, the new case is used, in turn, to further refine the generalizations, which are then stored again in episodic memory.

The crucial features of this model are similar to the ones that characterize frames. Interpretation of the new case is expectation-driven, based on what happened in previous cases. Episodic memory contains examples of solutions, rather than solutions or rules to find solutions.
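
The retrieve-interpret-store cycle can be sketched as a nearest-case lookup. This is a simplification made for illustration, not Schank's model: the memory below stores raw cases rather than generalizations, and the similarity measure, the class name and the weather cases are all assumptions.

```python
def similarity(a, b):
    """Share of feature-value pairs that two situations have in common."""
    a, b = set(a.items()), set(b.items())
    return len(a & b) / len(a | b) if a | b else 1.0


class EpisodicMemory:
    def __init__(self):
        self.cases = []  # list of (situation, outcome) pairs

    def interpret(self, situation):
        """Expectation for a new case: the outcome of the most similar stored case."""
        if not self.cases:
            return None
        _, outcome = max(self.cases, key=lambda case: similarity(case[0], situation))
        return outcome

    def store(self, situation, outcome):
        """Archive the new case so that it refines later interpretations."""
        self.cases.append((situation, outcome))


memory = EpisodicMemory()
memory.store({"sky": "cloudy", "wind": "strong", "season": "winter"}, "rain")
memory.store({"sky": "clear", "wind": "calm", "season": "summer"}, "dry")

today = {"sky": "cloudy", "wind": "calm", "season": "winter"}
expected = memory.interpret(today)
print(expected)
```

The memory holds examples of solutions, not rules: the expectation for today is whatever happened in the most similar previous case.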

Because the episodic memory is continuously refined, Schank refers to it more generally as "dynamic" memory: it can grow on its own, based on experience. The script is an extension of the idea of the case.

A scene is a general description of a setting and of a goal in that setting. A script is a particular instantiation of a scene (many scripts can be attached to one scene).

A script is a social variant of Minsky's frame. A script represents stereotypical knowledge of situations as a sequence of actions and a set of roles. Once the situation is recognized, the script prescribes the actions that are sensible and the roles that are likely to be played. The script helps understand the situation and predicts what will happen. A script performs "anticipatory" reasoning.

For example, a script describes the scene of a restaurant: the host seats the customers as they walk in and hands them a menu, the waiter takes their order and delivers it to the kitchen, the waiter brings the food, the waiter brings the bill, the waiter brings the change. When we enter a restaurant, we know what to expect. The moment we recognize a building as a restaurant, we know that there are waiters inside. If we walk in, we "expect" to be handed a menu, and we expect a waiter to take our order, and at the end we expect the bill. This is all in the script for restaurants. Our daily life is controlled by a multitude of scripts that direct our actions in all stereotypical situations (the vast majority of the situations that we encounter in our life).
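The restaurant script can be written down as a simple data structure: a set of roles and an ordered sequence of actions, from which the next expected action can be read off. The Python below is a minimal sketch; the `Script` class, its `expect_next` method and the wording of the actions are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Script:
    """Hypothetical encoding of a script: roles plus an ordered action list."""
    name: str
    roles: tuple
    actions: tuple  # (role, action) pairs in their stereotypical order

    def expect_next(self, observed):
        """Given the actions seen so far, predict the next one (or None)."""
        position = len(observed)
        if list(self.actions[:position]) != list(observed):
            return None  # the situation has departed from the script
        return self.actions[position] if position < len(self.actions) else None


RESTAURANT = Script(
    name="restaurant",
    roles=("host", "waiter", "customer"),
    actions=(
        ("host", "seats the customers"),
        ("host", "hands out the menu"),
        ("waiter", "takes the order"),
        ("waiter", "brings the food"),
        ("waiter", "brings the bill"),
        ("waiter", "brings the change"),
    ),
)

seen = [("host", "seats the customers"), ("host", "hands out the menu")]
prediction = RESTAURANT.expect_next(seen)
```

Once the first two actions have been recognized, the sketch "anticipates" that the waiter will take the order.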

A script is a generalization of a class of situations. If a situation falls into the context of a script, then an expectation is created by the script, based on what happened in all previous situations. If the expectation fails to materialize, then a new memory must be created. Such new memory is structured according to an "explanation" of the failure. Generalizations are created from two identical expectation failures. Memories are driven by expectation failures, by the attempt to explain each failure and learning from that experience. New experiences are stored only if they fail to conform to the expectations.
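The failure-driven character of this memory can also be sketched in code. In the toy Python model below, an episode is stored only when it violates the expectation, and a generalization is formed when the same failure occurs twice. The class `ExpectationMemory`, the keying of failures by step and action, and the example strings are assumptions of the sketch.

```python
from collections import defaultdict


class ExpectationMemory:
    """Stores an episode only when it fails to conform to the expectation."""

    def __init__(self, expected):
        self.expected = expected            # step -> action the script predicts
        self.failures = defaultdict(list)   # (step, action) -> explanations
        self.generalizations = set()

    def observe(self, step, action, explanation=None):
        """Return True if the episode was stored, False if it was expected."""
        if self.expected.get(step) == action:
            return False                    # conforms: nothing new is stored
        key = (step, action)
        self.failures[key].append(explanation)
        if len(self.failures[key]) == 2:    # two identical expectation failures
            self.generalizations.add(key)
        return True


memory = ExpectationMemory({"pay": "waiter brings the bill"})
stored = [
    memory.observe("pay", "waiter brings the bill"),
    memory.observe("pay", "customer pays at the counter", "self-service place"),
    memory.observe("pay", "customer pays at the counter", "self-service place"),
]
```

The first episode matches the script and leaves no trace; the second is stored together with its explanation; the third repeats the same failure and yields a generalization.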

Here, again, remembering is closely related to understanding and learning. Memory has the passive function of remembering and the active function of predicting. The comprehension of the world and its categorization proceed together.

More and more complex structures were added by Schank and his associates to the basic model of scripts. A "memory organization packet" (MOP) is a structure that keeps information about how memories are linked in frequently occurring combinations. A MOP is both a storing structure and a processing structure. A MOP is basically an ordered set of scenes directed towards a goal. A MOP is more general than a script in that it can contain information about many settings (including many scripts). A "thematic organization packet" is an even higher-level structure that stores information independent of any setting.

Ultimately, knowledge (and intelligence itself) is stories. Cognitive skills emerge from discourse-related functions: conversation is "reminding" and storytelling is "understanding" (and in particular "generalizing"). The stories that are told differ from the stories that are in memory: in the process of being told, a story undergoes changes to reflect the intentions of the speaker. The mechanism underlying stories is similar to script-driven reasoning: understanding a story entails finding a story in memory that matches the new story and enhancing the old story with details from the new one. Underlying the mechanism is a process of "indexing" based on identifying five factors: theme, goal, plan, result and lesson. Memory actually contains only "gists" of stories, that can be turned into stories by a number of operations (distillation, combination, elaboration, creation, captioning, adaptation). Knowledge is embodied in stories and cognition is carried out in terms of stories that are already known.
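The indexing step can be illustrated with a small Python sketch in which each gist is indexed by the five factors, and a new story "reminds" us of the stored gist that shares the most factors with it. The matching rule (counting identical fields) and the sample stories are assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class StoryIndex:
    theme: str
    goal: str
    plan: str
    result: str
    lesson: str


def overlap(a, b):
    """Count how many of the five index factors two stories share."""
    return sum(getattr(a, f.name) == getattr(b, f.name) for f in fields(StoryIndex))


def remind(new_index, memory):
    """Return the title of the stored gist whose index matches best."""
    return max(memory, key=lambda title: overlap(new_index, memory[title]))


memory = {
    "tortoise and hare": StoryIndex(
        "competition", "win the race", "steady effort", "success", "persistence pays"
    ),
    "icarus": StoryIndex(
        "ambition", "fly high", "ignore advice", "failure", "heed warnings"
    ),
}
new_story = StoryIndex(
    "competition", "win the race", "sprint early", "failure", "pace yourself"
)
reminded_of = remind(new_story, memory)
```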

This may explain both the passion for sports races (whether car racing or cycling) and for narrative art (whether films or novels). Both categories of human activity basically construct very complicated stories that challenge our minds. As we follow a race, we construct a story based on the stereotypical actions that can happen in a race. And we root for our idol based on what stereotypical actions would propel her/him to the head of the race. As we watch a film, we construct a story based on the stereotypical actions that can happen in a film. And we identify with the protagonist based on what stereotypical actions would make her/him succeed.

The Self-organizing Schema

Schemas resurface also in the work of the Australian mathematician Michael Arbib. Just like with Minsky’s frames and Schank’s scripts, Arbib argues that the mind constructs reality through a network of countless schemas. And, again, a schema is both a mental representation of the world and a process that determines action in the world.

Arbib's theory of schemas is based on two notions, one developed by a USA mathematician of the 19th century, Charles Peirce, and one due to the Swiss psychologist Jean Piaget in the 1930s. The first one is the notion of a "habit", a set of operational rules that, by exhibiting both stability and adaptability, lends itself to an evolutionary process. The second one is the notion of a "schema", the generalizable characteristics of an action that allow the application of the same action to a different context (yet another variation on Selz). Both assume that schemas are compounded as they are built to yield successive levels of a cognitive hierarchy.

Arbib argues that categories are not innate: they are constructed through the individual's experience. What is innate is the process that underlies the construction of categories. Therefore, Arbib’s view of the rules of categories is similar to Noam Chomsky's view of the rules of language.

What sets Arbib’s theory apart from Minsky’s and Schank’s is that Arbib’s theory is more closely modeled after a view of the brain as an evolving, self-organizing system of interconnected units, such as a neural network.

Conceptual Graphs

Both frames and scripts are ultimately ways of representing concepts. A broader abstraction with a similar purpose has been proposed by the USA mathematician John Sowa in his theory of "conceptual graphs", which is based both on Selz’s schemas and on Peirce's "existential graphs" (his graph notation for logic).

A "conceptual graph" represents a memory structure generated by the process of perception. In practice, a conceptual graph describes the way percepts are assembled together. Conceptual relations describe the role that each percept plays.

Technically speaking, conceptual graphs are finite, connected, bipartite graphs (bipartite because they contain both concepts and conceptual relations, represented respectively by boxes and circles). Some concepts (concrete concepts) are associated with percepts for experiencing the world and with motor mechanisms for acting upon it. Some concepts are associated with the items of language. A concept has both a type and a referent. A hierarchy of concept-types defines the relationships between concepts at different levels of generality.
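The structure just described can be made concrete with a short Python sketch: concept nodes carry a type and a referent, relation nodes connect them, every arc joins a concept to a relation (which is what makes the graph bipartite), and a small type hierarchy orders concept-types by generality. The class names, the example sentence and the hierarchy are assumptions made for illustration, not Sowa's own definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    type: str            # e.g. "Cat"
    referent: str = "*"  # "*" stands for an unspecified individual


@dataclass(frozen=True)
class Relation:
    name: str            # e.g. "Agent"


class ConceptualGraph:
    def __init__(self):
        self.arcs = []   # (Concept, Relation) or (Relation, Concept) pairs

    def link(self, source, relation, target):
        """Connect two concepts through a relation node."""
        self.arcs.append((source, relation))
        self.arcs.append((relation, target))

    def is_bipartite(self):
        """Every arc must join a concept to a relation, never like to like."""
        return all(
            isinstance(a, Concept) != isinstance(b, Concept) for a, b in self.arcs
        )


# Hypothetical hierarchy of concept-types: child type -> parent type.
SUPERTYPE = {"Cat": "Animal", "Animal": "Entity", "Mat": "Entity"}


def is_subtype(child, parent):
    """Walk up the hierarchy to test whether `child` specializes `parent`."""
    while child is not None:
        if child == parent:
            return True
        child = SUPERTYPE.get(child)
    return False


# "The cat Felix sits on a mat":
# [Cat: Felix] <- (Agent) <- [Sit] -> (Location) -> [Mat]
graph = ConceptualGraph()
sit = Concept("Sit")
graph.link(sit, Relation("Agent"), Concept("Cat", "Felix"))
graph.link(sit, Relation("Location"), Concept("Mat"))
```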

Formation rules ("copy", "restrict", "join" and "simplify") constitute a generative grammar for conceptual structures just like production rules constitute a generative grammar for syntactic structures. All deductions on conceptual graphs involve a combination of them.

Habits

Charles Peirce, the founder of pragmatism, once proposed a unifying view of matter and mind, although it was disguised as a theory of "habits". Peirce believed that there was no absolute definition of things, including truth itself. An object is defined by the effects of its use: a definition that works well is a good definition. An object "is" its behavior. The meaning of a concept consists in its practical effects on our daily lives: if two ideas have the same practical effects on us, they have the same meaning. The meaning of a concept is a function of the relations among many concepts: a concept refers to an object only through the mediation of other concepts.

Truth is usefulness and validity: something is true if it can be used and validated. In practice, truth is defined by consensus of the society. Truth is not agreement with reality, it is agreement among humans (reached after a process of scientific investigation). Truth is "true enough", not necessarily an absolute, unchanging truth. Truth is a process, a process of self-verification.

What is relevant is not the concept of "true", but the "belief". We use beliefs in our daily lives, not theorems that prove what is true and what is false. Beliefs become fixed over a lifetime through experience and verification. I believe something if that belief has proven useful over the course of my life. Beliefs lead to the formation of habits that, in turn, get reinforced through experience. The more useful that belief turns out to be, the stronger it becomes in my mind. Ditto for habits: the better they work for me, the more "habitual" they become.

Peirce noted that the process of habit creation is pervasive in nature. All matter acquires habits. Matter is mind whose "beliefs" have been fixed to the extent that they can’t be changed anymore. Habit is what makes objects what they are. An object is defined by the set of all its possible behaviors, i.e. by its "habits". I am my habits. It makes no sense to talk of something or someone who does not have habits: randomness is absence of an identity.

The laws of Physics describe the habits of matter, because what physicists observe is the habits of nature. For example, heavenly bodies have the habit of attracting each other, thus the laws of gravitation.

Systems evolve because of chance, which is inherent to the universe ("tychism"). Habits progressively remove chance from the universe. The universe is evolving from absolute chaos (chance and no habits) towards absolute order (all habits are fixed). One can see Darwinian evolution at work on systems towards stronger and stronger habits. Human beliefs are a particular case of habits, that also get fixed through experience.

Enter the Body

Largely inspired by French philosopher Maurice Merleau-Ponty, a number of USA philosophers, such as Mark Johnson and Maxine Sheets-Johnstone, have "rediscovered" the body.

Cognitive scientists tend to focus on the input (perception) and neglect the output (action). Most models of the mind have been built by concentrating on perception: how the world is perceived by the mind. Very little is usually said on how the mind acts on the world. But action (movement) is no less important a part of our experience.

Sheets-Johnstone retraces Merleau-Ponty's philosophy in claiming that thinking is modeled on the body and grounded in animate form. The "tactile-kinesthetic body" is the source of corporeal concepts. All our cognitive life is grounded in movement. Consciousness does not arise from matter, but from self-movement. Even the simplest forms of life enjoy a "meta-corporeal consciousness of the chemical constitution of the environment".

The Process Behind the Structure

Robert Cairns once used an analogy: evolution is to biology what development is to psychology, i.e. the process behind the structure.

The USA psychologists Esther Thelen and Linda Smith advanced a theory of development that was as opportunistic as evolution. Their emphasis is on processes of change, on the ever-active self-organizing processes of living systems that are analogous to the selection algorithms of evolution.

For them development is the outcome of the interplay between action and perception within a system that, by its thermodynamic nature, seeks stability. Performance and cognition emerge from this process of interaction between a system and its environment. Cognition, in particular, is an emergent structure, situated and embodied, just like any other skill of the organism.

The development of the brain seems to be orderly, incremental, and directional (towards nutritional independence and reproductive maturity), but this is an illusion. In reality development is not driven by a grand design: it is driven by opportunistic, syncretic and exploratory processes. At a closer look, in fact, development is modular and heterochronic (i.e., different organs develop at different rates and different times), although the organism progresses as a whole. Global regularities (and simplicity) somehow arise from local variabilities (and complexities).

Knowledge for thought and action (i.e., categories) emerges from the dynamics of pattern formation in the context of neural group selection. Perception, action and cognition are rooted in the same pattern formation processes. Categories arise (self-organize) spontaneously and reflect the experiences of acting and perceiving, i.e. of interacting with the world. More precisely, categories are created through the cross-relation of multimodal (hearing, seeing, feeling, etc.) experiences. The unity of perception and action is reflected in the way categories are formed. Development can then be viewed as the dynamic selection of categories. Categories are the foundation of cognitive development.

At the same time, categories are but a specific case of pattern formation. Therefore, cognitive development is a direct consequence of the properties of nonlinear dynamic systems, i.e. of self-organizing complex systems.

Being in the world "selects" categories. Therefore meaning itself is emergent.

These features are shared by all organisms: every living system is a cognitive system.

The Unity of Cognition

Perception, memory, learning, reasoning, understanding and action are simply different aspects of the same process. This is the opinion implicitly stated by all schema-based models of cognition. All mental faculties are simply different descriptions of the same process, different ways of talking about the same thing: one, whole process of cognition. There is never perception without memory, never memory without learning, never learning without reasoning, never reasoning without understanding, and so forth. One happens because all happen at the same time.

The mind contains this powerful algorithm that operates on cognitive structures. That algorithm has been refined by natural selection to be capable of responding in optimal time. This might be the case partly because that algorithm operates on structures that already reflect the nature of our experience. Our experience occurs in situations, each situation being a complex aggregate of factors. The actions that we perform in a given situation are rather stereotyped. The main processing of the algorithm goes into recognizing the situation. Once the situation is recognized, somehow it is reduced to past experience and that helps figure out quite rapidly the appropriate action.

Needless to say, various levels of cognition can be identified in other animals, and even in plants. Even in crystals and rocks. Everything in nature can be said to remember and to learn, everything can be said to be about something else.

Cognition is not "all" there is in the mind: this is the utilitarian, pragmatic, mechanical part of the mind. The mind also has awareness. But consciousness does not seem to contribute much to the algorithm, does not seem to significantly affect the structure of past experience, does not seem to have much to do with our ability to deal with situations. A being with no consciousness, but with the same cognitive algorithm and the same cognitive structures (i.e., with the same cognitive architecture), would probably behave pretty much like us in pretty much all of our daily actions, without the emotions.

Cognition does not seem to require consciousness. Ultimately, it is simply a material process of self-organization. It seems possible to simulate this process by an algorithm, which means that cognition is not exclusive to conscious beings. It may well be possible to build machines that are cognitive systems. Cognition may actually turn out to be a general property of matter, of all matter, living and nonliving.

Further Reading

Anderson John-Robert: THE ARCHITECTURE OF COGNITION (Harvard Univ Press, 1983)

Anderson John-Robert: THE ADAPTIVE CHARACTER OF THOUGHT (Lawrence Erlbaum, 1990)

Anderson John-Robert: RULES OF THE MIND (Lawrence Erlbaum, 1993)

Arbib Michael: CONSTRUCTION OF REALITY (Cambridge University Press, 1986)

Ballard Dana: COMPUTER VISION (Prentice Hall, 1982)

Block Ned: IMAGERY (MIT Press, 1981)

Craik Kenneth: THE NATURE OF EXPLANATION (Cambridge Univ Press, 1943)

Fauconnier Gilles: MENTAL SPACES (MIT Press, 1994)

Finke Ronald: PRINCIPLES OF MENTAL IMAGERY (MIT Press, 1989)

Finke Ronald: CREATIVE COGNITION (MIT Press, 1992)

Franklin Stan: ARTIFICIAL MINDS (MIT Press, 1995)

Green David: COGNITIVE SCIENCE (Blackwell, 1996)

Hampson Peter & Morris Peter: UNDERSTANDING COGNITION (Blackwell, 1995)

Johnson-Laird Philip: MENTAL MODELS (Harvard Univ Press, 1983)

Johnson-Laird Philip: THE COMPUTER AND THE MIND (Harvard Univ Press, 1988)

Johnson-Laird Philip & Byrne Ruth: DEDUCTION (Lawrence Erlbaum, 1991)

Johnson Mark: THE BODY IN THE MIND (University of Chicago Press, 1987)

Kosslyn Stephen: IMAGE AND MIND (Harvard University Press, 1980)

Kosslyn Stephen: GHOSTS IN THE MIND'S MACHINE (W. W. Norton, 1983)

Kosslyn Stephen & Koenig Olivier: WET MIND (Free Press, 1992)

Kosslyn Stephen: IMAGE AND BRAIN (MIT Press, 1994)

Laird John, Rosenbloom Paul & Newell Allen: UNIVERSAL SUBGOALING AND CHUNKING (Kluwer Academic, 1986)

Leyton Michael: SYMMETRY, CAUSALITY, MIND (MIT Press, 1992)

Luger George: COGNITIVE SCIENCE (Academic Press, 1993)

Marr David: VISION (MIT Press, 1982)

McGinn Colin: MINDSIGHT (2004)

Goodale Melvyn & Milner David: THE VISUAL BRAIN IN ACTION (Oxford University Press, 1995)

Minsky Marvin: SEMANTIC INFORMATION PROCESSING (MIT Press, 1968)

Minsky Marvin: THE SOCIETY OF MIND (Simon & Schuster, 1985)

Newell Allen: UNIFIED THEORIES OF COGNITION (Harvard Univ Press, 1990)

Paivio Allan: IMAGERY AND VERBAL PROCESSES (Holt, Rinehart and Winston, 1971)

Posner Michael: FOUNDATIONS OF COGNITIVE SCIENCE (MIT Press, 1989)

Pylyshyn Zenon: COMPUTATION AND COGNITION (MIT Press, 1984)

Pylyshyn Zenon: SEEING AND VISUALIZING (MIT Press, 2003)

Schank Roger: SCRIPTS, PLANS, GOALS, AND UNDERSTANDING (Lawrence Erlbaum, 1977)

Schank Roger: DYNAMIC MEMORY (Cambridge Univ Press, 1982)

Sheets-Johnstone Maxine: THE PRIMACY OF MOVEMENT (John Benjamins, 1999)

Sowa John: CONCEPTUAL STRUCTURES (Addison-Wesley, 1984)

Stillings Neil: COGNITIVE SCIENCE (MIT Press, 1995)

Thelen Esther & Smith Linda: A DYNAMIC SYSTEMS APPROACH TO THE DEVELOPMENT OF COGNITION AND ACTION (MIT Press, 1994)

Ullman Shimon: THE INTERPRETATION OF VISUAL MOTION (MIT Press, 1979)