Marr, David Courtnay


MARR, DAVID COURTNAY

(b. 19 January 1945, Essex, England, d. 17 November 1980, Cambridge, Massachusetts),

cognitive neuroscience, computational neuroscience, information processing, representation, theoretical neuroscience, vision.

During his tragically brief career, Marr founded the discipline that has come to be known as computational neuroscience. He developed the codon theory of information processing in the brain, which claims that the brain’s central function is statistical pattern recognition. He also recognized the fundamental distinction between the computational, algorithmic, and implementational levels of theory in describing brain function. Above all, he was interested in answering the question: What is it that the brain does?

Early Years . David Courtnay Marr was born in Essex, England. He attended Rugby, a British public school (similar to a private high school in the United States), on scholarship. In 1963, he went to Trinity College of Cambridge University, where he obtained BS and MS degrees in mathematics in 1966. Then, instead of pursuing mathematical study, he switched his focus to neuroscience and studied neuroanatomy, neurophysiology, and molecular biology, earning a doctorate in theoretical neuroscience under the supervision of Giles Brindley in 1969. After obtaining his PhD, Marr accepted an appointment to the scientific staff of the Medical Research Council Laboratory of Molecular Biology in Cambridge, in the Division of Cell Biology, working under Sydney Brenner and Francis Crick.

The results of his dissertation, essentially a theory of cerebellar function, inspired his three articles published between 1969 and 1971. Marr’s published theory was mathematical, but it also reflected the known anatomical and physiological data of the time. Despite the immense advances in neuroscience afterward, much of Marr’s theoretical work from his dissertation remained viable and relevant in the early twenty-first century. In the three articles, Marr described a possible function for each of three major brain structures: the cerebellum, the archicortex (the older part of the cerebral cortex), and the neocortex. Each of the answers dovetails with the others around the basic idea that the brain is a sophisticated pattern recognition device, working in a very high-dimensional feature space. The basic feature in all of these functions is a “codon,” a subset of properties to which individual cells may respond by firing.

The cerebellum, Marr surmised, learns the motor skills associated with actions, movements, and posture. The cerebellar Purkinje cells learn to associate particular actions with the contexts in which they are performed. Marr suggested that they do this by modifying their synapses using some sort of codon representation. Later, after learning, the Purkinje cells initiate the appropriate actions when exposed to the correct context alone. Many researchers still agree with the broad outlines of Marr’s theory regarding how the cerebellum works.

In his second article, Marr then extended this idea to more general types of statistical pattern recognition and conceptual learning, arguing that this principle is probably behind many aspects of basic brain function. Marr called this idea his Fundamental Hypothesis: “where instances of a particular collection of intrinsic properties (i.e. properties already diagnosed from sensory information) tend to be grouped such that if some are present, most are, then other useful properties are likely to exist which generalize over such instances. Furthermore, properties often are grouped in this way” (1970, pp. 150–151). These ideas foretold much of contemporary neural network modeling, which assumes that learning consists in optimizing underlying probabilistic representational space.

Take the problem of facial recognition as an example. Humans can recognize one another’s faces, even though they are all very similar to one another and one never sees them in exactly the same position or with the same expression as one did formerly. One can recognize the frontal view of a smiling face as the same face seen frowning in profile last week. How does one do this? How does one keep track of all the properties that make up a face, and how does one know which properties are important to keep track of?

Consider a multi-dimensional graph in which each dimension represents one property of some complex object, like a face. Each point in the multi-dimensional space of the graph would then represent one possible face in one possible configuration. Points that lie close to one another in the graph can be grouped together into “sub-volumes.” Even though these grouped points are close to one another, which properties have which value will not be the same across points. Indeed, looking at the values of individual properties one at a time may not reveal whether their respective multi-dimensional points would be close together in space.

It turns out that the points associated with one particular face, seen from a variety of angles and with a variety of expressions, are usually concentrated in a single and obvious sub-volume. To learn to recognize faces most efficiently, instead of learning all the possible values for each individual property, it is best to learn the distribution of sub-volumes in the multi-dimensional space. And that, thought David Marr, is exactly how the brain does it. In the early twenty-first century, neural network modelers tried to figure out what these graphs look like and exactly how one should be sensitive to the sub-volumes in the space.
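The idea of learning sub-volumes rather than individual property values can be made concrete with a small sketch. The code below is not Marr's codon model; it swaps in a plain nearest-centroid classifier as a stand-in, and the names (`VIEWS`, `learn_subvolumes`, `recognize`) and the three-property toy data are assumptions invented for illustration.

```python
import math

# Hypothetical toy data: each view of a face is a point in a 3-property space.
# The property values are invented purely for illustration.
VIEWS = {
    "alice": [(0.9, 0.1, 0.2), (0.8, 0.2, 0.1), (1.0, 0.0, 0.3)],
    "bob": [(0.1, 0.9, 0.8), (0.2, 0.8, 0.9), (0.0, 1.0, 0.7)],
}


def centroid(points):
    """Mean of a list of equal-length tuples: the centre of one sub-volume."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def learn_subvolumes(views):
    """Summarise each person's views by one centroid instead of storing every view."""
    return {name: centroid(pts) for name, pts in views.items()}


def recognize(point, centroids):
    """Assign a new view to the nearest learned sub-volume (Euclidean distance)."""
    return min(centroids, key=lambda name: math.dist(point, centroids[name]))


CENTROIDS = learn_subvolumes(VIEWS)
```

A view never seen before, such as `(0.85, 0.15, 0.25)`, is assigned to whichever centroid it falls nearest, which is the sense in which the learner is sensitive to the sub-volume as a whole rather than to any single property.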

These sorts of models require extensive memory so that the organ or machine can track and tally the probabilities of various events. Moreover, this memory has to react to the shape of the sub-volumes instead of something lower-level and more detailed, such as location along the axis of the graph. In this sense, it can be said that the memory retrieves its information based on content, not on physical attributes. Marr’s third essay argues that the hippocampus fulfills this function. To make this argument, Marr combined the abstract mathematical constraints on the combinatorial possibilities of his theoretical codons with the latest data on hippocampal anatomy and physiology. He borrowed Donald Hebb’s idea that the brain learns by modifying its synaptic connections through experience and gave a rigorous mathematical proof that his model can recall things very efficiently, even when only given partial content. Finally, Marr outlined several specific and testable predictions of hippocampal structure and function—such as synapses in the hippocampus that can be modified by experience—many of which were later corroborated by others.
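Recall from partial content under a Hebbian learning rule can be illustrated with a short sketch. This is a Hopfield-style autoassociative network, a later and much simpler construction than Marr's archicortex model, used here only as a stand-in; the function names and the two eight-unit patterns are assumptions made for the example.

```python
def train(patterns):
    """Hebbian outer-product rule: strengthen links between co-active units."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, cue, steps=5):
    """Complete a partial cue (0 marks an unknown unit) by synchronous updates."""
    state = list(cue)
    for _ in range(steps):
        new_state = []
        for i, row in enumerate(w):
            total = sum(wij * s for wij, s in zip(row, state))
            # On a tie, keep the current value so known units are not erased.
            new_state.append(1 if total > 0 else -1 if total < 0 else state[i])
        if new_state == state:
            break
        state = new_state
    return state


# Two hypothetical stored memories, coded as +1/-1 activity patterns.
PATTERNS = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
WEIGHTS = train(PATTERNS)
```

Calling `recall(WEIGHTS, [1, 1, 1, 1, 0, 0, 0, 0])` supplies only the first half of the first pattern and lets the learned weights fill in the rest. The memory is addressed by what it contains, not by where it is stored.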

Around 1971, Marr’s intellectual interests shifted from general theories of brain function to the study of more specific brain processes. He effectively abandoned the idea of creating abstract theories of brain function, which he came to believe were too vague to explain how anything actually works. By 1972, he had concluded that it would be necessary to understand the details of the specific brain mechanisms and the cognitive tasks they support in order to fully explain the brain. From this nascent idea, he went on to invent the field of computational neuroscience, the legacy for which he is best known. Marr first expressed these ideas publicly in 1972 at a workshop on brain theory organized by Benjamin Kaminer at Boston University. At this event, Marr opined that there was an “inverse square law” for theoretical research: the value of research varies inversely with the square of its generality. Eventually this idea was published in a two-page book review in Science (Marr, 1975, p. 875).

Massachusetts Institute of Technology . In 1973, Marr joined the Artificial Intelligence (AI) Laboratory at the Massachusetts Institute of Technology (MIT) as a visiting scientist. At first, he had planned to stay only a few months, but finding the intellectual climate quite stimulating, he ended up staying for the rest of his career. Cambridge, Massachusetts, was already the intellectual center for advances in both computational modeling and neuroscience. Things happened quickly there, and Marr rapidly became one of the central figures in MIT’s dominance in AI. He saw early on that a marriage between the disciplines would be enormously fruitful. He wrote to his mentor and friend, Sydney Brenner, in September of 1973: “I have been thinking about the future. Presumably, as a result of the Lighthill report, AI must change its name. I suggest BI (biological intelligence). I am more and more impressed by the need for a functional approach to the CNS [central nervous system] and view it as significant that the critical steps for the retina were taken by Land, the only scientist in the field actually concerned with handling real pictures.… I see a bright future for vision in the next few years, and am anxious to stay in the subject, doing my own AI research as well as acting as interface with neurophysiology.”

At the AI laboratory at MIT, Marr embarked on a research program that sought to outline the computations involved in visual processing. Testing his ideas on computer models, Marr embodied a new approach in his work, combining insights from the new field of AI and those from neuroscience. Influenced by Horn’s algorithm for computing lightness and by Land’s retinex theory, Marr focused first on the functions of the retina, with great excitement and promise of groundbreaking results. As he writes to Brenner in July 1973,

Nick Horn, co-director of the vision mini-robot project, came up with a beautiful algorithm for computing Land’s retinex function. It is not quite the one actually used, but was near enough to enable one to take the last steps. I am busy tying up all the detailed anatomy and physiology now, and am very hopeful that the whole thing will turn out to be very pretty. But the retinex is the real secret. … One of our wholly new findings is that the so called center-surround organization of the retinal ganglion cells is all a hoax! It is nothing but a by-product of showing silly little spot stimuli to a clever piece of machinery designed for looking at complete views. That will put the cat among the pigeons in a very satisfying manner!

During this time, Marr’s thought was undergoing a transition from believing that computational descriptions are on a par with neurophysiological data to holding that the computational framework was the foundation supporting neurophysiology. Early during his stay at MIT, Marr became convinced that there simply was not enough empirical data to support any general theories of brain processing. Indeed, he suspected that it might turn out that the brain would not support any useful general theory of processing. As a result, he began to criticize the most popular theories of modeling brain function at the time. In a review of the conference proceedings for a summer school institute on the physics and mathematics of the nervous system (held in Trieste, Italy, in August of 1973), Marr takes broad swipes at catastrophe theory, automata theory, learning automata theory, and neural net theory. His basic complaint is that theoreticians do not connect their theories to actual data. It might very well turn out that there is no general theory of brain function; instead, there will only be local understandings of particular brain processes.

In this review, he complains that even neural net theory, the most “biological” of modeling approaches, falls short of its goals:

[T]here are two problems. First, the brain is large but it is certainly not wired up randomly. The more we learn about it, the more specific the details of its construction appear to be. Hoping that random neural net studies will elucidate the operation of the brain is therefore like waiting for the monkey to type Hamlet. Second, given a specific function of inevitable importance like a hash-coded associated memory, it is not too difficult to design a neural network that implements it with tolerable efficiency. Again, the primary unresolved issue is what functions you want to implement and why. In the absence of this knowledge, neural net theory, unless it is closely tied to the known anatomy and physiology of some part of the brain and makes some unexpected predictions, is of no value. (1975, p. 876)

Modelers’ reactions to his harsh criticisms were decidedly mixed.

At the same time, Marr espoused a positive program. His central positive idea was that in order to understand any system, one must first understand both the problems it faces and the form that possible solutions could take. One must focus there, instead of on the structural details of the mechanisms themselves. Marr called these two types of understanding “computational” and “algorithmic,” respectively. He placed these two desiderata above the third sort of understanding, the “implementational” level. In the brain, the implementational level would refer to the anatomy and physiology of the relevant neural areas involved in perception, action, and cognition. Marr also switched from working on retinal functioning to working with Tomaso Poggio, then housed at the Max Planck Institute in Tübingen, to develop theories of binocular stereopsis.

In 1977, Marr accepted a faculty appointment in the Department of Psychology at MIT, where he was given a continuing appointment and promoted to tenured full professor in 1980. In addition to his collaboration with Poggio, Marr also worked with Shimon Ullman, Eric Grimson, Ellen Hildreth, H. Keith Nishihara, Whitman A. Richards, and Charles F. Stevens, among others. During this period, Marr also developed theories on low-level image representation and on shape and action characterization. Marr’s book, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information, written during the last months of his life, summarized the results of these collaborations.

In his book, Marr described vision as proceeding from a two-dimensional visual array (on the retina) to a three-dimensional description of the world as output. His stages of vision include a “primal sketch” of the scene, based on feature extraction of fundamental components of the scene, including things such as edges, color, movement, and so forth. The primal sketch is translated into a “2.5 D” sketch of the scene, which is a subject-centered perception of visual information. Finally, this model is converted into a “3 D” model, in which the scene is visualized on a continuous, three-dimensional object-centered coordinate map. In virtue of how it integrated neural data with principles from computing and other related fields, Marr’s book redefined the study of human and machine vision and continues to influence artificial intelligence research and research in computational neuroscience. Sadly, Marr was diagnosed with acute leukemia during the winter of 1978 and died from complications of the disease on 17 November 1980, in Cambridge, Massachusetts. His book was published posthumously in 1982.

During his time at MIT, people were in awe of Marr, and some of his proposals were taken simply as dogma. Consequently, his influence upon the research trajectories of many was profound. For an example, consider Marr’s influence on his colleague Steven Pinker’s work in shape recognition. The simplest way to recognize a shape would be to use some sort of template, a stored representation that imitates the shape of the object. However, if the object is shifted, rotated, or distorted, any simple template-matching mechanism will make errors. At the same time, midsized objects are things that humans all can recognize easily. To solve this problem, Marr suggested that human brains define shapes on a coordinate system centered on the object itself. As a shape moves, the coordinate system moves with it, and so one’s “template” for the shape remains unchanged.
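The contrast between a viewer-centered template and an object-centered description can be sketched in a few lines. The example assumes the object's pose (its shift and rotation) is already known, which in practice is the hard part and is not addressed here; the shape `TEMPLATE` and the helper names are hypothetical.

```python
import math

# Hypothetical three-point shape, described in its own (object-centered) frame.
TEMPLATE = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]


def place(shape, tx, ty, theta):
    """Viewer-frame coordinates of the shape after rotating by theta and shifting."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in shape]


def to_object_frame(points, tx, ty, theta):
    """Undo the pose: re-express viewer-frame points in the object's own frame."""
    c, s = math.cos(theta), math.sin(theta)
    return [
        (c * (x - tx) + s * (y - ty), -s * (x - tx) + c * (y - ty))
        for x, y in points
    ]


def matches(a, b, tol=1e-9):
    """Point-by-point template comparison within a small tolerance."""
    return len(a) == len(b) and all(
        math.isclose(p, q, abs_tol=tol)
        for pa, pb in zip(a, b)
        for p, q in zip(pa, pb)
    )
```

After `seen = place(TEMPLATE, 3.0, -1.0, math.pi / 3)`, a direct comparison `matches(seen, TEMPLATE)` fails because the viewer-frame coordinates have changed, whereas `matches(to_object_frame(seen, 3.0, -1.0, math.pi / 3), TEMPLATE)` succeeds: in the object's own frame the description is unchanged by the move.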

Pinker tested Marr’s theory in numerous different ways. In one experiment, people memorized complex shapes presented in a certain orientation. The shapes were then presented in another orientation as a test to see how long it took people to name them. If Marr were right that humans represented shape invariantly across all orientations, people should be equally fast at recognizing objects in any orientation. But if humans were good at shape recognition because they had memorized shapes in the various orientations they had seen before, then they should identify objects at orientations they had seen before faster than objects seen in entirely new orientations.

Pinker found that Marr’s hypothesis that shape learning is independent of orientation was only partly right. In general, subjects took a longer time to recognize objects in orientations they had not seen before. However, when the object’s shape was symmetrical, they recognized it equally quickly, regardless of orientation. Perhaps humans have an object-centered coordinate system that is in fact mapped directly onto an object, regardless of orientation, but only for one dimension at a time.

Many of the details of Marr’s specific theory of computational vision were refuted by neurophysiological data and insights not available in the 1970s. Nonetheless, his basic insight that good theories in the brain and behavioral sciences can combine mathematical rigor with neurally-based data remains as a fundamental component of computational neuroscience. Moreover, his emphasis on asking why a brain process is occurring, instead of merely looking for a differential equation that describes it, was adopted by most neurophysiologists in the early 2000s.

BIBLIOGRAPHY

WORKS BY MARR

“A Theory of Cerebellar Cortex.” Journal of Physiology (London) 202 (1969): 437–470.

“A Theory for Cerebral Neocortex.” Proceedings of the Royal Society of London (Series B) 176 (1970): 161–234.

“Simple Memory: A Theory for Archicortex.” Philosophical Transactions of the Royal Society of London (Series B) 262 (1971): 23–81.

“The Computation of Lightness by the Primate Retina.” Vision Research 14 (1974): 1377–1388.

“Approaches to Biological Information Processing.” Science 190 (1975): 875–876.

“Early Processing of Visual Information.” Philosophical Transactions of the Royal Society of London Series B, 275 (1976): 483–524.

With Tomaso Poggio. “Cooperative Computation of Stereo Disparity.” Science 194 (1976): 283–287.

With Tomaso Poggio. “From Understanding Computation to Understanding Neural Circuitry.” Neurosciences Research Program Bulletin 15 (1977): 470–491.

With H. Keith Nishihara. “Representation and Recognition of the Spatial Organization of Three-dimensional Structure.” Proceedings of the Royal Society of London Series B, 200 (1978): 269–294.

With Tomaso Poggio. “A Computational Theory of Human Stereo Vision.” Proceedings of the Royal Society of London Series B, 204 (1979): 301–328.

With Ellen Hildreth. “Theory of Edge Detection.” Proceedings of the Royal Society of London Series B, 207 (1980): 187–217.

“Artificial Intelligence: A Personal View.” In Mind Design, edited by John Haugeland, 129–142. Cambridge, MA: MIT Press, 1981.

Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. San Francisco: W. H. Freeman, 1982.

With L. M. Vaina. “Representation and Recognition of the Movements of Shapes.” Proceedings of the Royal Society of London Series B, 214 (1982): 501–524.

OTHER SOURCES

Edelman, Shimon, and Lucia M. Vaina. “Marr, David (1945–1980).” In International Encyclopedia of the Social and Behavioral Sciences, edited by Neil J. Smelser and Paul B. Baltes, 9256–9258. Oxford: Elsevier Science, 2001.

Vaina, Lucia M., ed. From the Retina to the Neocortex: Selected Papers of David Marr, vol. 1. Cambridge, MA: Birkhäuser Boston, 1991.

Valerie Gray Hardcastle

Complete Dictionary of Scientific Biography