Concepts and Categories, Learning of
Concepts are a fundamental aspect of intelligent behavior. Traditionally, a concept has been viewed as a mental representation that picks out a group of equivalent items, or a category. For example, every person has a concept of dog and can use that concept to pick out a category of things that one would call dogs. Some of the most fundamental questions about the mind include the following: What do human concepts consist of (i.e., what is their structure)? How are they acquired? Why do humans have concepts (i.e., what functions do they have)?
What Do Human Concepts Consist Of?
An early, popular view of concepts was the classical view. For a variety of reasons this approach was unsatisfactory and gave way to the probabilistic view. These views are described and compared below.
Classical versus Probabilistic View
The classical view argues that concepts are structured around defining features (Bruner, Goodnow, and Austin, 1956). Defining features are features that are singly necessary and jointly sufficient to define the concept. For example, the concept bachelor has the defining features human, unmarried, and male.
However, the classical view appears to have a number of serious problems. First, if concepts have defining features, then one ought to be able to specify what they are. But many common concepts, such as game or chair, seem to have no defining features. Instead, instances of these concepts have characteristic features that are neither necessary nor sufficient for category membership (e.g., has four legs and has a back for chairs). Second, not all instances of a concept are equally good examples of that concept. For example, people judge robins to be better examples of the concept bird than ostriches. If both robins and ostriches have the defining features of birds, then why should robins be considered better examples of birds than ostriches? These and other problems (Smith and Medin, 1981) have led to a shift in attention from the classical view to the probabilistic view.
The probabilistic view argues that most concepts are organized around properties that are only characteristic or typical of a category (rather than defining). One specific account of the probabilistic view assumes that a concept is a summary representation or prototype that indicates what is, on the average, true of a category (e.g., the bird prototype would include features such as flies and sings because the properties are true of most though not all birds). The prototype view readily handles goodness-of-example effects; the more similar an item is to the prototype, the more typical it will be of the category. For example, robins would be more similar to the bird prototype than are ostriches because they have more features that are generally true of birds. Thus, they would be better examples of birds.
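The goodness-of-example account above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a prototype and its instances are sets of features and that typicality is the share of prototype features an item has; the feature lists and the function name typicality are hypothetical and are not drawn from any particular published model.

```python
# Illustrative feature sets only; the specific features are assumptions.
BIRD_PROTOTYPE = {"flies", "sings", "has feathers", "lays eggs", "small"}

def typicality(item_features, prototype):
    """Return the fraction of prototype features that the item shares."""
    return len(item_features & prototype) / len(prototype)

robin = {"flies", "sings", "has feathers", "lays eggs", "small"}
ostrich = {"has feathers", "lays eggs", "runs fast", "large"}

robin_score = typicality(robin, BIRD_PROTOTYPE)
ostrich_score = typicality(ostrich, BIRD_PROTOTYPE)

# The robin shares more prototype features, so it counts as the better example.
print(robin_score > ostrich_score)
```

Under these assumptions the robin matches every prototype feature while the ostrich matches only some, which mirrors the judgment that robins are better examples of birds.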
However, the prototype view also has problems. Prototype representations alone are not rich enough to capture people's knowledge about a category. For example, people are sensitive to the number of instances in a category (e.g., there are many more houses than igloos), the variability of features (e.g., the size of quarters varies less than the size of pizzas), and the correlations between features (e.g., wooden spoons tend to be big whereas metal spoons tend to be small [Medin, 1989]). To overcome such difficulties, researchers have suggested that instead of (or in addition to) a prototype, people may simply store remembered instances or examples of the category and reason with them (Brooks, 1978). Thus, according to the exemplar view, the concept of chair, for instance, would include memory traces of particular chairs and their associated features.
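A rough sketch of exemplar-based classification follows. It assumes that each stored instance is a tuple of numeric feature values and that similarity falls off exponentially with distance; that similarity rule is an assumption borrowed from common exemplar models, not something the text specifies. The diameters, the sensitivity parameter, and the function names are made up for illustration.

```python
import math

def similarity(a, b, sensitivity=1.0):
    """Similarity decays exponentially with city-block distance (an assumption)."""
    distance = sum(abs(x - y) for x, y in zip(a, b))
    return math.exp(-sensitivity * distance)

def classify(item, memory):
    """Pick the category whose stored instances are, in total, most similar."""
    evidence = {}
    for label, instance in memory:
        evidence[label] = evidence.get(label, 0.0) + similarity(item, instance)
    return max(evidence, key=evidence.get)

# Stored instances: (diameter in cm,). Values are illustrative, not measured.
memory = [
    ("quarter", (2.4,)), ("quarter", (2.4,)), ("quarter", (2.5,)),
    ("pizza", (20.0,)), ("pizza", (30.0,)), ("pizza", (40.0,)),
]

print(classify((2.5,), memory))   # a coin-sized object
print(classify((33.0,), memory))  # a pizza-sized object
```

Because every instance is kept, the number of stored examples and the spread of their feature values both affect the summed evidence, which is the kind of information a single prototype discards.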
The Theory View
Another approach to concepts is the theory view. As described below, this approach developed as a reaction to certain limitations of the classical and probabilistic approaches. In particular, the classical and probabilistic views failed to take into account people's background knowledge or theories of the world.
The shift from the classical view to the probabilistic view was motivated by a detailed analysis of natural object categories. Associated with this analysis is the view that concepts or mental representations of categories closely mirror the structure afforded by properties of category members. It seems almost a tautology that if the structure of examples does not have defining features then the corresponding mental representations cannot conform to the classical view. Similarly, a probabilistic category structure suggests a probabilistic concept representation. In brief, researchers assume that mental representations are determined by the structure of examples in the world.
However, the classical and probabilistic views tend to ignore the role of the learner in their accounts of concepts. According to the theory view, learners also impose structure on their concepts. That is, concepts are based on a learner's general knowledge and theories of the world together with information provided by the environment (Carey, 1985; Murphy and Medin, 1985; Rips, 1989). For example, Susan Carey showed that children's biological theories influence their patterns of induction at a very early age. To illustrate, a mechanical monkey is rated by both children and adults to be more similar to a human being than is a worm, yet even young children infer that worms rather than toy monkeys have a spleen after being told that people have a spleen, "a round and green [thing] … in the person's body." In this example, the feature has a spleen is more consistent with a child's background knowledge or "theory" about animate things than about inanimate things.
Theories themselves may be anchored by how well their predictions receive support from the world. Gregory Murphy and Douglas Medin (1985) suggest that the relation between concept and examples is like that between theory and data. Thus, concepts would not necessarily consist of features that are also in examples; rather, the constituents of examples would only need to support more abstract constituents of concepts (Wisniewski and Medin, 1994). For example, one may infer that a man is drunk because one sees him jump into a pool fully clothed. If one does so, it is probably not because the feature jumps into pools, clothed is listed with the concept drunk. Rather, it is because part of one's concept of drunk involves a theory of impaired judgment that serves to explain the man's behavior.
How Are Concepts Acquired?
Young children probably enter the world with few preexisting concepts. Instead, they must acquire or form concepts from experiences (e.g., form a concept of dog from existing experiences with dogs). As described below, the classical, probabilistic, and theory views of concepts propose different ways in which concepts are acquired.
According to the classical view, the process of concept formation is one of discovering necessary and sufficient attributes by observing which attributes occur in all members and only in members of the category. Research associated with the classical view has been directed at investigating hypothesis testing strategies, with each hypothesis being a guess as to which features are part of the definition (Levine, 1971).
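The discovery process described above can be sketched as follows, assuming examples are represented as sets of features. The sketch keeps the features found in every observed member and then checks that no observed non-member has all of them. The example people and the function names are hypothetical, and the hypothesis-testing strategies studied in this literature are more elaborate than this.

```python
def candidate_definition(members):
    """Features present in every observed member (candidate necessary features)."""
    return set.intersection(*members)

def is_sufficient(definition, non_members):
    """True if no observed non-member has all of the candidate features."""
    return not any(definition <= item for item in non_members)

# Illustrative observations; the extra features (tall, short) are assumptions.
bachelors = [
    {"human", "unmarried", "male", "tall"},
    {"human", "unmarried", "male", "short"},
]
non_bachelors = [
    {"human", "married", "male", "tall"},
    {"human", "unmarried", "female", "short"},
]

definition = candidate_definition(bachelors)
print(sorted(definition))
print(is_sufficient(definition, non_bachelors))
```

Features that vary across members, such as height here, drop out of the candidate definition, while the surviving features jointly exclude every observed non-member.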
In contrast, according to the probabilistic view, concept learning occurs by averaging values of members (Posner and Keele, 1968), by attending to features commonly shared by members and discarding features varying among members (Elio and Anderson, 1981), or by noting the most common value on each dimension. The basic idea behind these models can be traced to Galton's "composite photograph" theory (Galton, 1879). Galton superimposed several faces to make a composite photograph in which common properties were accentuated and variant properties were attenuated. Such a process is assumed in prototype theories. On the other hand, exemplar models assume category instances are stored but generally have not specified detailed learning mechanisms (see Kruschke, 1992, for an exception). These views assume that the learner begins with features of the entities and then learns which features are important for the concept. However, research conducted in the 1990s suggests that an important part of concept learning is learning to identify the features themselves (Schyns, Goldstone, and Thibaut, 1998).
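Two of the prototype-formation schemes mentioned above, averaging members' values and taking the most common value on each dimension, can be sketched in a few lines. The sketch assumes each category member is a tuple with one value per dimension; the data and the function names are illustrative assumptions rather than any published model's procedure.

```python
from collections import Counter
from statistics import mean

def average_prototype(members):
    """Average the members' values on each numeric dimension."""
    return [mean(values) for values in zip(*members)]

def modal_prototype(members):
    """Take the most common value on each dimension."""
    return [Counter(values).most_common(1)[0][0] for values in zip(*members)]

# Illustrative members: (height, width) for the numeric case,
# (color, shape) for the categorical case.
numeric_members = [(10.0, 4.0), (12.0, 6.0), (14.0, 5.0)]
categorical_members = [("red", "round"), ("red", "oval"), ("green", "round")]

print(average_prototype(numeric_members))
print(modal_prototype(categorical_members))
```

As with Galton's composite photographs, values shared by most members dominate the result and idiosyncratic values wash out.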
The theory-based view of concepts takes a different perspective on concept formation. Several researchers have proposed that humans may be born with a naive physics and a naive biology or psychology (Carey, 1985; Keil, 1989; Spelke, 1990) that act as initial theories to organize conceptual knowledge. A major implication of the theory-based view is that concept learning involves integrating new examples with prior knowledge. In particular, prior knowledge may influence the identification of features and, in turn, information about examples may modify a person's prior knowledge (Wisniewski and Medin, 1994).
Taking the theory-based view, a group of researchers in artificial intelligence (an area of computer science that aims to build computers that behave intelligently) have developed models of concept formation called explanation-based learning (Mitchell, Keller, and Kedar-Cabelli, 1986; DeJong and Mooney, 1986). These models suggest that the most important aspect of concept learning is to explain why a given example is an instance of the concept. Construction of the explanation is carried out by causally connecting known concepts. For example, suppose a computer is to learn a concept cup and it already knows such concepts as liftable, handle, liquid container, and stable. Seeing an object that can be lifted, has a handle, contains liquid, and is stable, the system uses its background knowledge to construct an explanation about why one can drink from this object. Then it generalizes this explanation to develop its concept of cup.
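The cup example can be sketched as a toy program. The sketch assumes that background knowledge is a small table of rules and that generalizing means keeping only the observed features the explanation actually uses. The rule table, the feature names, and the function explain are hypothetical simplifications; actual explanation-based learning systems use richer logical representations.

```python
# Hypothetical background knowledge: each concept holds if all of its
# conditions hold.
RULES = {
    "drinkable_from": ["liftable", "stable", "liquid_container"],
    "liftable": ["light", "has_handle"],
    "stable": ["flat_bottom"],
    "liquid_container": ["upward_concavity"],
}

def explain(goal, observed):
    """Return the observed features that support the goal, or None if unprovable."""
    if goal in observed:
        return {goal}
    if goal not in RULES:
        return None
    used = set()
    for condition in RULES[goal]:
        support = explain(condition, observed)
        if support is None:
            return None
        used |= support
    return used

# One training example, including features irrelevant to drinking.
example = {"light", "has_handle", "flat_bottom", "upward_concavity",
           "red", "ceramic"}

# The generalized concept keeps only the features the explanation relied on.
cup_concept = explain("drinkable_from", example)
print(sorted(cup_concept))
```

Features such as color and material never enter the explanation, so they are left out of the learned concept even though the training object had them.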
Learning by analogy is another form of theory-driven or knowledge-driven learning in which a known similar concept is modified. For example, one can learn about the internal structure of atoms by applying one's knowledge of the solar system (e.g., electrons revolve around the nucleus as planets revolve around the sun [Gentner, 1989]). One can also discover new features through analogy or metaphor (e.g., given that a smile is like a magnet, one can learn that a smile attracts).
The Why of Concept Learning
There are many reasons why humans have concepts. They allow people to classify things (e.g., recognize that something is a snake) and to make important predictions or inferences (e.g., that the snake may be poisonous). John Anderson (1990) has developed a theory of concepts that emphasizes this prediction function. Other functions include explanation (e.g., the concept introvert might help to explain why some person did not attend a party), reasoning (deriving knowledge from the stored information), communication, and conceptual combination (e.g., from the concepts glass and elephant one might construct the combined concept glass elephant). Many approaches to concept learning have focused only on the classification function. However, the functions of concepts interact, so it is important to study multiple functions together. For example, Brian Ross (1997) found that the diagnosis (classification) of a disease was strongly influenced by features that were relevant to its treatment (see Malt et al., 1999, and Markman and Makin, 1998, for other examples of interactions). Researchers have begun to appreciate and investigate the variety of functions that concepts have.
Anderson, J. R. (1990). The adaptive character of thought. Hillsdale, NJ: Erlbaum.
Brooks, L. R. (1978). Non-analytic concept formation and memory for instances. In E. Rosch and B. Lloyd, eds., Cognition and categorization. Hillsdale, NJ: Erlbaum.
Bruner, J. S., Goodnow, J. J., and Austin, G. A. (1956). A study of thinking. New York: Wiley.
Carey, S. (1985). Conceptual change in childhood. Cambridge, MA: MIT Press.
DeJong, G. F., and Mooney, R. J. (1986). Explanation-based learning: An alternative view. Machine Learning 1, 145-176.
Elio, R., and Anderson, J. R. (1981). The effects of category generalizations and instance similarity on schema abstraction. Journal of Experimental Psychology: Human Learning and Memory 7, 397-417.
Galton, F. (1879). Composite portraits, made by combining those of many different persons into a single, resultant figure. Journal of the Anthropological Institute 8, 132-144.
Gentner, D. (1989). The mechanisms of analogical learning. In S. Vosniadou and A. Ortony, eds., Similarity and analogical reasoning. Cambridge, UK: Cambridge University Press.
Keil, F. C. (1989). Concepts, kinds, and cognitive development. Cambridge, MA: MIT Press.
Kruschke, J. K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review 99, 22-44.
Levine, M. (1971). Hypothesis theory and non-learning despite ideal S-R reinforcement contingencies. Psychological Review 45, 626-632.
Malt, B. C., Sloman, S. A., Gennari, S., Shi, M., and Wang, Y. (1999). Knowing versus naming: Similarity and the linguistic categorization of artifacts. Journal of Memory and Language 40, 230-262.
Markman, A. B., and Makin, V. S. (1998). Referential communication and category acquisition. Journal of Experimental Psychology: General 127, 331-354.
Medin, D. L. (1989). Concepts and conceptual structure. American Psychologist 44, 1469-1481.
Mitchell, T. M., Keller, R. M., and Kedar-Cabelli, S. T. (1986). Explanation-based generalization: A unifying view. Machine Learning 1, 47-80.
Murphy, G. L., and Medin, D. L. (1985). The role of theories in conceptual coherence. Psychological Review 92, 289-316.
Posner, M. I., and Keele, S. W. (1968). On the genesis of abstract ideas. Journal of Experimental Psychology 77, 353-363.
Rips, L. J. (1989). Similarity, typicality, and categorization. In S. Vosniadou and A. Ortony, eds., Similarity and analogical reasoning. Cambridge, UK: Cambridge University Press.
Ross, B. H. (1997). The use of categories affects classification. Journal of Memory and Language 37, 240-267.
Schyns, P. G., Goldstone, R. L., and Thibaut, J. P. (1998). The development of features in object concepts. Behavioral and Brain Sciences 21, 1-54.
Smith, E. E., and Medin, D. L. (1981). Categories and concepts. Cambridge, MA: Harvard University Press.
Spelke, E. S. (1990). Principles of object perception. Cognitive Science 14, 29-56.
Wisniewski, E. J., and Medin, D. L. (1994). On the interaction of theory and data in concept learning. Cognitive Science 18, 221-281.