One of the central goals of neuroscience research is to understand the nature of the computations carried out in the neocortex. The neocortex is a thin sheet of roughly 10^10 neurons and fibers that forms the external surface of much of the brain. The neocortex arose recently in evolution, around the time that mammals branched off from reptiles. Yet over the past 60 million years, the neocortex has shown the most dramatic expansion and elaboration of any brain system. In fact, differences in the neocortex best distinguish the human brain from those of other mammals. The neocortex is involved in higher perceptual and cognitive processing and contains areas specialized for decision making, executive function, sensory perception (vision, hearing, touch, and smell), generating action, language, mathematics, and complex memory tasks.
The functional properties of neocortex depend upon its structure. There are four major principles of neocortical organization: First, the cortex is organized vertically into six layers, each layer receiving from and sending connections to various brain sites. Second, there is a horizontal organization of the cortex into columns, such that neurons in a vertical column (of approximately 0.5 millimeter diameter) exhibit similar physiological properties. Third, most cortical areas contain one or more topographic maps. (In a topographic map, adjacent areas of the periphery are represented in adjacent areas of cortex. For example, in visual cortex, area V1 contains a map of the visual field; in somatosensory cortex, area SI contains four separate maps of the body surface.) Finally, cortical areas are functionally segregated: cells in each area are specialized for particular functional tasks. (For example, cells in visual area MT are specialized for motion detection, among other things, whereas those in visual area V4 are specialized for color and pattern analysis, among other things.) The maps in different cortical areas are interconnected in a vast and intricate scheme.
In the somatosensory cortex (which mediates touch, temperature, and pain), maps are dynamically organized such that they change continuously in response to skin stimulation (Buonomano and Merzenich, 1998; Kaas, 2000). However, the visual neocortex has been the most intensively studied region and provides the clearest example of the kinds of computations the cortex must carry out.
Seeing is a complex act that requires a parallel analysis of shape, motion, depth, color, texture, and other attributes, all leading to the discrimination and recognition of an object. For example, a major cue to depth comes from stereopsis, a process of pattern matching in which the slightly disparate views of the world seen by the two eyes are compared; the distance of various objects is computed from the shift in view. Perception of depth thus results from a cortical computation.
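The relation between the shift in view and distance can be illustrated with the standard triangulation formula for two pinhole cameras with parallel optical axes, Z = fB/d. The sketch below is a geometric idealization only, not a model of how cortex performs the matching; the function name and the numerical values are hypothetical.

```python
def depth_from_disparity(focal_length_m, baseline_m, disparity_m):
    """Distance to a point from the shift (disparity) between two views.

    Assumes two pinhole "eyes" with parallel optical axes, which is an
    idealization chosen for illustration.
    """
    if disparity_m <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return focal_length_m * baseline_m / disparity_m


# Illustrative values only: 17 mm focal length, 65 mm between the two eyes.
near = depth_from_disparity(0.017, 0.065, 0.002)   # larger shift
far = depth_from_disparity(0.017, 0.065, 0.0005)   # smaller shift
print(near, far)
```

A larger shift between the two views corresponds to a nearer object, which is why disparity can serve as a depth cue.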
Computations in another set of cortical areas result in perception of color. Contrary to the implications of simple optics, our perceptions of color are not solely determined by the wavelengths of light reflected from an object. As Edwin Land demonstrated, color perception is determined by the relative amount of light of different wavelengths reflected by an object as opposed to that reflected by its surrounding environment. Neurons in cortical area V4 have been found to respond to the color of an object as we perceive it; those in the lower-level area V1 respond to the wavelength of the light, not the color (Zeki, 1980); thus, the color computation probably depends on computations carried out in area V4.
The analysis of motion is similarly carried out in several stages and provides the best-investigated example of a cortical computation. In some species, such as the rabbit and the housefly, neurons in the retina are capable of detecting motion. In most mammals, however, sophisticated motion detection first arises in the cortex. Each cortical neuron receives visual inputs from only a small region of space—that is, it "sees" only a limited visual field. As Marr (1982) first pointed out, this leads to the so-called aperture problem (see Figure 1). If you look at a moving line through a small aperture, you can determine only the component of motion perpendicular to the line because it is impossible to tell whether the line is moving along its own axis. Each neuron faces essentially the same problem. Neurons in V1 are orientationally and directionally selective, which means that they respond only to lines and edges whose orientation falls within a certain narrow range (e.g., within ten degrees of vertical) and that are moving within a narrow range of directions (usually perpendicular to the preferred orientation). The only way around the aperture problem is to combine information from cells with different directional selectivities. This "neural synthesis" probably occurs in area MT (a small region of higher-level visual neocortex that receives input from area V1), as was shown in an ingenious experiment on monkey cortex by Movshon and colleagues (1985).
The key to Movshon's experiment is the visual stimulus presented to the monkey. The stimulus, as shown in Figure 2, consisted of two gratings (arrays of parallel, evenly spaced lines), oriented at different angles, each moving in a direction perpendicular to the orientation of its lines. If one draws two such gratings on transparent plastic sheets, superposes them, and moves them with respect to each other, then, instead of two sets of moving lines, one sees a checkerboard pattern moving in a third direction. This third direction is not the vector sum of the two component directions; rather, it is a more complicated function that reflects the intersection of the lines of uncertainty as specified by the constraints of the aperture problem (Movshon, Adelson, Gizzi, and Newsome, 1985).
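The intersection of the lines of uncertainty can be written as a small linear system. Each grating fixes only the component of the pattern velocity v along its own direction of motion n, so that v · n equals the grating's speed; two non-parallel gratings give two such constraints, which determine v. The following is a minimal sketch under that assumption. The function names and the example angles and speeds are hypothetical and are not taken from the experiment.

```python
import math


def unit_normal(direction_deg):
    """Unit vector along a grating's direction of motion (normal to its lines)."""
    a = math.radians(direction_deg)
    return (math.cos(a), math.sin(a))


def pattern_velocity(dir1_deg, speed1, dir2_deg, speed2):
    """Solve v . n1 = speed1 and v . n2 = speed2 for the pattern velocity v."""
    n1 = unit_normal(dir1_deg)
    n2 = unit_normal(dir2_deg)
    det = n1[0] * n2[1] - n1[1] * n2[0]
    if abs(det) < 1e-12:
        raise ValueError("gratings are parallel; the pattern velocity is ambiguous")
    # Cramer's rule for the 2x2 system.
    vx = (speed1 * n2[1] - speed2 * n1[1]) / det
    vy = (n1[0] * speed2 - n2[0] * speed1) / det
    return (vx, vy)


# Hypothetical stimulus: one grating drifting at 0 degrees, one at 60 degrees.
vx, vy = pattern_velocity(0.0, 1.0, 60.0, 1.5)
pattern_dir = math.degrees(math.atan2(vy, vx))

# Direction of the plain vector sum of the two component velocities.
n1, n2 = unit_normal(0.0), unit_normal(60.0)
sum_x = 1.0 * n1[0] + 1.5 * n2[0]
sum_y = 1.0 * n1[1] + 1.5 * n2[1]
vector_sum_dir = math.degrees(math.atan2(sum_y, sum_x))
print(pattern_dir, vector_sum_dir)
```

For this choice of gratings the two printed directions differ, which illustrates why the perceived pattern direction is not simply the vector sum of the component motions.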
Researchers presented this stimulus to alert monkeys and observed the responses of neurons in areas V1 and MT. In area MT, they found neurons that responded to the direction of motion of the checkerboard. In V1, they found no such neurons; rather, neurons responded to the direction of motion of either one grating or the other. Thus, to speak somewhat broadly, V1 "sees" only the component gratings, whereas MT "sees" the checkerboard. This is a situation analogous to color vision: A higher-level cortical area computes an object-related attribute based on information from lower cortical areas.
Taking this one step further, Bill Newsome and colleagues, in a remarkable series of experiments, demonstrated that the direction of motion of a stimulus, as viewed and reported by an awake, trained monkey, correlates precisely with the activation of individual neurons in area MT. By determining the selectivities of MT cells to various directions of motion, the experimenters could read off the neural response to any test stimulus. When the monkey occasionally made an error in judging the direction or velocity of the stimulus, it was correlated with a failure of the appropriate cells to fire. The investigators could even change the motion direction that the monkey perceived by stimulating particular cells to fire via implanted microelectrodes (Britten et al., 1992). A number of computational models have proposed mechanisms by which area MT may carry out these computations (Schrater, Knill, and Simoncelli, 2000).
Each area of the cortex is dedicated to particular computations; there are, however, similarities to the neuronal and architectural structure across the cortex. This suggests that a basic set of canonical computations may be carried out by all cortical regions and that differences in cortical computations may be largely due to the type of inputs received. It is not clear what the canonical cortical computations are, but researchers have suggested several candidate mechanisms.
Neurons in many regions of the neocortex and in phylogenetically older cortical regions such as the hippocampus can produce and maintain specific patterns of firing in response to a stimulus. As Earl Miller and colleagues have shown, if a monkey is shown a picture such as a photo of another monkey, specific recognition cells in high-level visual cortical areas (such as the inferotemporal IT cortex) will fire and continue to fire even after removal of the stimulus.
Neurons in areas of frontal cortex have an even more remarkable property. They can maintain a pattern of firing (in which certain neurons in the region continually fire while others are silent) even in the presence of distracting stimuli. The ability of a network to maintain a fixed pattern of activity in the absence of direct stimulus input is attractor behavior. Attractor states are stable and self-perpetuating because the inputs each cell receives via connections from other active neurons in the network are sufficient to maintain its firing. Based on theoretical considerations, John Hopfield (1982) proposed that attractor states might correspond to stored memories. Each attractor state (pattern of activity) has a "basin of attraction," a space of related activity patterns that transform into the attractor state, much like particles gravitationally attracted into a black hole. This confers on the network the property of auto-association or error-correction; if a noisy, corrupted or partial memory is used as input, the complete stored pattern can be retrieved.
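Error correction by an attractor network can be illustrated with a very small network of binary (+1/-1) units and a Hebbian outer-product weight rule, in the spirit of Hopfield (1982). This is a toy sketch, not a model of frontal cortex; the network size, the stored patterns, and the function names are assumptions chosen for illustration.

```python
def train(patterns):
    """Hebbian outer-product weights with no self-connections."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j] / n
    return weights


def recall(weights, state, max_sweeps=10):
    """Update units one at a time until the pattern stops changing."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            # A unit keeps its value when its net input is exactly zero.
            new_value = state[i] if field == 0 else (1 if field > 0 else -1)
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
weights = train(stored)

# Corrupt the first memory by flipping its first unit, then let the network settle.
probe = [-1, 1, 1, 1, -1, -1, -1, -1]
print(recall(weights, probe))
```

The corrupted probe lies inside the basin of attraction of the first stored pattern, so the network settles back into that pattern; each stored pattern, presented intact, is left unchanged.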
The ability to stably maintain a piece of information in active or working memory is essential to any computational process. If we attempt to add two plus three, we have to store the first number while we fetch the second and perform the operation. Any type of matching process in recognition (e.g., "Wait a minute; I know that face!") requires holding a stimulus in working memory. Equally important is the ability to release the network from the memory or to use one memory as a prompt to the next associated memory in a sequence. In the frontal cortex, where neurons have the unique ability to hold on to the attractor state even in the face of distracting stimuli, the neuromodulator dopamine may control attractor stability (Durstewitz, Seamans, and Sejnowski, 2000). When dopamine levels are high, the attractors are robustly stable against distraction; when the dopamine level falls, distracting stimuli knock the system into a new pattern of activity corresponding to the novel stimulus. Some investigators have suggested that disorders of the dopamine system, such as those typical of schizophrenia, may underlie the symptoms of perseveration (inability to get a thought out of one's head) and delusions/hallucinations (possibly the inappropriate linking of distracting stimuli or memories).
A cortical circuit can carry out different computations at different times. The system of neuromodulators, including acetylcholine, serotonin, noradrenaline, dopamine, and other agents, is a control system regulating the kinds of computations and the flow of information between brain regions. For example, J. Lisman and N. A. Otmakhova (2001) have proposed that in the hippocampus bursts of dopamine may transiently switch a circuit into a learning mode (as opposed to an on-line information processing mode). Similarly, Michael Hasselmo (1999) has demonstrated that acetylcholine may switch between learning and recall, and may selectively suppress intrinsic as opposed to extrinsic inputs. The release of acetylcholine and noradrenaline correlates closely with visual attention, and both acetylcholine and dopamine are associated with inducing rapid changes in cortical connectivity. Disorders of the neuromodulatory system, coupled with structural damage to cells and synapses, are the hallmark of most of the neurodegenerative diseases with cognitive deficits (e.g., Alzheimer's disease).
Perhaps the greatest mystery in understanding cortical processing is the mechanism of integration. A cortical hypercolumn integrates bottom-up, top-down, and horizontal (contextual) inputs. Taking our earlier example of color vision, the perceived color depends upon the incident wavelengths within the receptive field (bottom-up), the contextual inputs from neighboring regions of the visual field (horizontal), and top-down knowledge. Integration and inference are the central principles of cortical function, and mechanisms for these processes have been envisioned in several recent theories. Rajesh Rao and Dana Ballard (1999) have proposed that feedback loops between higher and lower areas carry out a kind of predictive coding. Higher areas "predict" the activity in the lower areas and send this prediction back via descending connections, and feedforward connections to the higher areas convey the residual errors in the prediction—a kind of neural Kalman filter. Several information-theory-based learning algorithms take a similar approach, in which the key operation is maximal compression of the represented information with the minimal loss of information about the inputs. Shimon Ullman has shown that a cortically inspired network based on maximizing mutual information can achieve remarkable image-recognition performance, including the ability to categorize novel objects into previously learned categories.
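The predictive-coding loop can be sketched with a linear toy model: a higher area holds an estimate r of the hidden causes, sends down the prediction U r, and receives back only the residual error x - U r, which it uses to correct r. This is a deliberately simplified illustration and not the Rao and Ballard model itself; the matrix U, the input x, the learning rate, and the function names are all assumptions.

```python
def predict(U, r):
    """Top-down prediction of the lower-area activity from the estimate r."""
    return [sum(U[i][k] * r[k] for k in range(len(r))) for i in range(len(U))]


def predictive_coding(U, x, rate=0.2, steps=100):
    """Refine r using only the feedforward residual error x - U r."""
    r = [0.0] * len(U[0])
    error_norms = []
    for _ in range(steps):
        prediction = predict(U, r)
        error = [x[i] - prediction[i] for i in range(len(x))]
        error_norms.append(sum(e * e for e in error) ** 0.5)
        # Each cause is nudged in proportion to the error it could explain.
        for k in range(len(r)):
            r[k] += rate * sum(U[i][k] * error[i] for i in range(len(x)))
    return r, error_norms


# Hypothetical generative weights (3 input units, 2 causes) and an input
# generated by the causes (2, -1).
U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x = [2.0, -1.0, 1.0]
r, error_norms = predictive_coding(U, x)
print(r, error_norms[0], error_norms[-1])
```

As the estimate improves, the residual carried by the feedforward pathway shrinks toward zero, so only the unpredicted part of the input needs to be transmitted upward.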
Cortical processing is extremely rapid. M. Fabre-Thorpe and colleagues (2001) have shown that well-known images can be recognized when presented at interstimulus intervals of fifty milliseconds. Given typical cortical firing rates, this result suggests that recognition and other higher processes require only one or two spikes from each cell—there is no time for any iterations of an algorithm. This harks back to a rule suggested by David Marr that a good neural algorithm operates in the time required for information to reach the relevant circuit but does not require any additional processing time. These timing constraints call for representations with fast dynamics. One likely candidate is the space-rate code (Maass, 2001) in which the degree of activation is represented by the fraction of neurons in a population that fire in a short (e.g., 5 ms) time window. Since, in each subsequent time window, the fraction firing can substantially change, information can be rapidly communicated.
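A space-rate code of this kind can be read out by counting, in each short window, the fraction of neurons in the population that fired at least once. The sketch below assumes spike times are given in milliseconds as one list per neuron; the function name and the spike times are hypothetical.

```python
def space_rate_code(spike_times_ms, window_ms, duration_ms):
    """Fraction of neurons that fire at least once in each time window."""
    n_windows = int(duration_ms // window_ms)
    fractions = []
    for w in range(n_windows):
        start = w * window_ms
        end = start + window_ms
        active = sum(
            1 for train in spike_times_ms
            if any(start <= t < end for t in train)
        )
        fractions.append(active / len(spike_times_ms))
    return fractions


# Four hypothetical neurons observed for 15 ms, read out in 5 ms windows.
population = [[1.0], [2.5, 11.0], [7.0], [12.0, 13.5]]
print(space_rate_code(population, 5.0, 15.0))
```

Because each window is evaluated independently, the represented value can change from one 5 ms window to the next without waiting for any single neuron to fire repeatedly.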
How might recognition make use of spike-timing information? John Hopfield has demonstrated one possible mechanism using the example of speech recognition. Hopfield's network (2001) makes use of an array of feature detectors, each tuned to an onset or offset of sounds in a particular frequency range. Any speech stimulus (e.g., a speaker saying the word one) produces a spatiotemporal pattern of activation of the feature detectors. In Hopfield's network, the feature detectors activate cells in a second network whose firing rate adapts; that is, their firing rate slows with time after the stimulus. The rate of adaptation can be varied and is learned: it is set so that at some time point, all cells responding to the stimulus will have adapted to fire at the same rate. This common rate is transient: cells will continue to adapt, and their firing rates will diverge, so that only during a brief time window do all cells responding to the stimulus share a similar firing rate. Hopfield proposes that this situation, where a number of cells in a network fire at approximately the same rate, is statistically significant. It could be one of the fundamental kinds of computations carried out by cortical networks.
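The convergence of adapting firing rates can be illustrated with a toy calculation. The sketch assumes, purely for simplicity, that each cell's rate falls linearly after its feature onset and that the "learned" quantity is the slope of that decline; the published model differs in detail, and every name and number below is hypothetical.

```python
def learned_slope(initial_rate, onset_ms, target_rate, target_time_ms):
    """Adaptation slope that brings a cell to target_rate at target_time_ms."""
    return (initial_rate - target_rate) / (target_time_ms - onset_ms)


def rate_at(t_ms, initial_rate, onset_ms, slope):
    """Firing rate of a linearly adapting cell (silent before its onset)."""
    if t_ms < onset_ms:
        return 0.0
    return max(0.0, initial_rate - slope * (t_ms - onset_ms))


def spread(t_ms, cells):
    """Difference between the fastest and slowest cell at time t_ms."""
    rates = [rate_at(t_ms, r0, onset, slope) for (r0, onset, slope) in cells]
    return max(rates) - min(rates)


# Three detectors triggered at different moments of a hypothetical word; each
# slope is set so that all cells pass through 40 Hz at 400 ms.
onsets = [0.0, 100.0, 250.0]
cells = [(100.0, onset, learned_slope(100.0, onset, 40.0, 400.0)) for onset in onsets]
for t in (300.0, 400.0, 500.0):
    print(t, spread(t, cells))
```

The spread of rates collapses at the target time and reopens afterward, so a downstream detector sensitive to cells firing at a common rate would respond only when the stimulus presents the learned sequence of features with the learned timing.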
Understanding cortical computation remains a challenge. But advances in neuroscience, particularly the emergence of optical imaging techniques, coupled with the development of sophisticated information-theory models, offer the promise of new insights. Such advances promise a new era in artificial intelligence and the creation of information technologies powered by biology-based algorithms.
Britten, K. H., Shadlen, M. N., Newsome, W. T., and Movshon, J. A. (1992). The analysis of visual motion: A comparison of neuronal and psychophysical performance. Journal of Neuroscience 12, 4,745-4,765.
Buonomano, D. V., and Merzenich, M. M. (1998). Cortical plasticity: From synapses to maps. Annual Review of Neuroscience 21, 149-186.
Durstewitz, D., Seamans, J. K., and Sejnowski, T. J. (2000). Neurocomputational models of working memory. Nature Neuroscience 3, 1,184-1,191.
Fabre-Thorpe, M., Delorme, A., Marlot, C., and Thorpe, S. (2001). A limit to the speed of processing in ultra-rapid visual categorization of novel natural scenes. Journal of Cognitive Neuroscience 13, 171-180.
Hasselmo, M. E. (1999). Neuromodulation: Acetylcholine and memory consolidation. Trends in Cognitive Sciences 3, 351-359.
Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences of the United States of America 79, 2,554-2,558.
Hopfield, J. J., and Brody, C. D. (2001). What is a moment? Transient synchrony as a collective mechanism for spatiotemporal integration. Proceedings of the National Academy of Sciences of the United States of America 98, 1,282-1,287.
Kaas, J. H. (2000). The reorganization of somatosensory and motor cortex after peripheral nerve or spinal cord injury in primates. Progress in Brain Research 128, 173-179.
Land, E. H. (1983). Recent advances in retinex theory and some implications for cortical computations. Proceedings of the National Academy of Sciences of the United States of America 80, 5,163-5,169.
Lisman, J. E., and Otmakhova, N. A. (2001). Storage, recall, and novelty detection of sequences by the hippocampus: Elaborating on the Socratic model to account for normal and aberrant effects of dopamine. Hippocampus 11, 551-568.
Maass, W. (2001). Computation with spiking neurons. In M. A. Arbib, ed., The handbook of brain theory and neural networks. Cambridge, MA: MIT Press.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco: W. H. Freeman.
Miller, E. K. (2000). The prefrontal cortex and cognitive control. Nature Reviews Neuroscience 1, 59-65.
Movshon, J. A., Adelson, E. H., Gizzi, M. S., and Newsome, W. T. (1985). The analysis of moving visual patterns. In C. Chagas, R. Gattass, and C. Gross, eds., Pattern recognition mechanisms. Experimental Brain Research Supp. 11, 117-151.
Rao, R. P., and Ballard, D. H. (1999). Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience 2, 79-87.
Schrater, P. R., Knill, D. C., and Simoncelli, E. P. (2000). Mechanisms of visual motion detection. Nature Neuroscience 3, 64-68.
Thorpe, S., Fize, D., and Marlot, C. (1996). Speed of processing in the human visual system. Nature 381, 520-522.
VanRullen, R., and Thorpe, S. J. (2001). The time course of visual processing: From early perception to decision-making. Journal of Cognitive Neuroscience 13, 454-461.
Zeki, S. M. (1980). The representation of colours in the cerebral cortex. Nature 284, 412-418.