VOCABULARY [From Latin vocabularium, a list of vocabula (words). The medieval vocabularium was a list of Latin words to be learnt by clerical students. It was usually arranged thematically, with translation equivalents in a vernacular language]. A traditional term with a range of linked senses: (1) The WORDS of a language: the vocabulary of Old English. The general vocabulary of a language is sometimes called its wordstock and is generally referred to by linguists as its LEXICON or LEXIS. (2) The words available to or used by an individual: a limited French vocabulary. (3) The words appropriate to a subject or occupation: the vocabulary of commerce. (4) A word list developed for a particular purpose: Use the vocabulary at the back of the book; a dictionary with a restricted defining vocabulary.

Specialized vocabularies in a language

The vocabulary or lexicon of a language is a system rather than a list. Its elements interrelate and change subtly or massively from generation to generation. It increases through borrowing from other languages and through word-formation based on its own or borrowed patterns. It may decrease or increase in certain areas as interests change. Whole sets of items may vanish from general use and awareness, unless special activities serve to keep them alive. For example: (1) A vocabulary of carving. There was in 16c England a set of verbs for carving kinds of game, fish, and poultry, which included allaying a pheasant, barbing a lobster, chining a salmon, fracting a chicken, sculling a tench, and unbracing a mallard. (2) A vocabulary of coaches. In the 19c, there were many terms for horse-drawn vehicles, including brougham, buckboard, buggy, cabriolet, carriage, chaise, coach, coupé, droshky, gig, hackney (carriage), hansom (cab), jaunting/jaunty car, landau, stagecoach, tonga. The use of horse-drawn vehicles has steeply declined, but many such words are kept alive in historical novels and films, some refer to vehicles still in use in certain places, and some have moved into the vocabulary of rail travel and the automobile.

Distinctive vocabularies across languages

Just as the vocabulary of a language changes from age to age, so the vocabularies of different languages are distinct in their systems, uses, and references. There may be some close translation equivalents among several languages, but items and arrangements of items in one language may have no precise parallel elsewhere, because the culture in which the vocabulary has evolved rests on unique needs, interests, and experiences. As the American anthropologist Stephen Tyler has put it:
The people of different cultures may not recognize the same kinds of material phenomena as relevant, even though from an outsider's point of view the same material phenomena may be present in every case. For example, we distinguish (in English) between dew, fog, ice, and snow, but the Koyas of South India do not. They call all of these mancu. Even though they can perceive the differences among them if asked to do so, these differences are not significant to them. On the other hand, they recognize and name at least seven different kinds of bamboo, six more than I am accustomed to distinguish (Cognitive Anthropology, 1969).

The vocabulary of English

Historically, the word-store of English is a composite, drawn in the main from the Indo-European language family. There is a base of Germanic forms (mainly Old English and Old Norse) with a super-structure of Romance, mainly from French and Latin, with a technical stratum contributed by Greek (mainly through Latin and French): see BISOCIATION. In addition, there are many acquisitions from languages throughout the world. Because of such a complex background, and because dictionaries and other resources state that they list thousands of headwords and other items, the question often arises: How many words are there in the English language?

No easy answer is possible. In order to reach a credible total, there must be agreement about what to count as an item of vocabulary and also something physical to count or to serve as the basis for an estimate. Counting words (however defined) is wearisome, complex, and difficult, and experience suggests that no matter how well organized the count there can never be enough data to ensure completeness. There are at least five reasons for this: (1) There is no corpus available in a countable form which represents the whole language. (2) Even if there were, it would only indicate what was available at the time the count started. It would therefore be a static assessment of a dynamic process. (3) The result of the counting would consequently be out of date before the counting was completed. (4) Even with careful safeguards, the total reached would be different for each counter. In practice, counters tend to interpret instances differently and so count items in different ways. (5) The administrative work needed to homogenize the efforts of the counters would be formidable and time-consuming, making the survey even more out of date by the time it appeared.

Points 4 and 5 can be demonstrated by means of one example: the -ing form in running and walking. There are three ways to handle this suffix in a word count: (1) Count every item containing -ing as a distinct word. (2) Omit every item, treating -ing as an inflected form of the verb, like runs and walks, and therefore a matter of grammar, not lexis. (3) Count only some instances, like clearing and drawing, because these are used as distinct nouns with the plurals clearings and drawings, and ignore the rest. Whatever decision is taken significantly affects the outcome, because there are as many -ing forms as there are such verbs as run and walk. If solution 3 is chosen, it poses further problems, because in the corpus to be counted will appear citations like rustlings and twitterings among the trees. If rustling and twittering are taken as distinct words on this occasion, how will the counters handle the fact that many -ing forms can be so used, even if they are not recorded in the corpus?
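The effect of the three policies on a total can be illustrated with a minimal Python sketch. The token list, the set of -ing forms treated as established nouns, and the function name count_words are all hypothetical, chosen only for illustration. The test for an -ing form is deliberately crude (it would also catch words such as thing), which itself shows how much rests on the counter's decisions.

```python
# Hypothetical toy data: four verbs and their -ing forms.
TOKENS = ["run", "walk", "clear", "draw",
          "running", "walking", "clearing", "drawing"]

# Assumed for illustration: -ing forms used as distinct nouns with plurals.
LEXICALIZED_ING = {"clearing", "drawing"}


def count_words(tokens, policy, lexicalized=LEXICALIZED_ING):
    """Count distinct items under one of the three -ing policies.

    policy 1: count every -ing item as a distinct word
    policy 2: omit every -ing item (treat it as inflection)
    policy 3: count only -ing items listed in `lexicalized`
    """
    if policy not in (1, 2, 3):
        raise ValueError("policy must be 1, 2, or 3")
    counted = set()
    for token in tokens:
        is_ing = token.endswith("ing")  # crude test, see lead-in
        if not is_ing:
            counted.add(token)
        elif policy == 1:
            counted.add(token)
        elif policy == 3 and token in lexicalized:
            counted.add(token)
        # under policy 2 (and unlisted forms under policy 3) nothing is added
    return len(counted)


for policy in (1, 2, 3):
    print(policy, count_words(TOKENS, policy))
```

On this toy list the three policies give three different totals, and adding rustling or twittering to the assumed set of nouns would change the third total again.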

In effect, the overall vocabulary of English is beyond strict statistical assessment. Nonetheless, limited counts take place and serve useful ends, and some rough indications can be given about the overall vocabulary. The Oxford English Dictionary (1989) defines over 500,000 items described as ‘words’ in a promotional press release. The average college, desk, or family dictionary defines over 100,000 such items. Specialist dictionaries contain vast lists of words and word-like items, such as the Acronyms, Initialisms & Abbreviations Dictionary (Gale, 1989), which contains over 450,000 accredited abbreviations. When printed material of this kind is taken into account, along with lists of geographical, zoological, botanical, and other usages, the crude but credible total for words and word-like forms in present-day English is somewhere over a million items.

Individual vocabularies

No one person can know, use, or imagine the entire available lexical resources of English. Many people are, however, curious about either how many words they or someone else knows, or what the ‘average educated person’ might be supposed to know and use. When such questions arise, they raise issues comparable to those of the vocabulary at large. Even in the case of writers whose texts are available for analysis, different totals emerge: the vocabulary of Shakespeare's works is variously listed as c.25,000 (Simeon Potter, Our Language, 1966), c.30,000 (Robert McCrum et al., The Story of English, book, 1986), and c.34,000 (J. Barton, in McCrum et al., The Story of English, TV series, episode 3, 1986). Such totals apparently depend on what has been counted, and the information passed on is a very rough approximation.

It is not unusual, however, to find assertions that the average person makes use of, for most purposes, fewer words than Shakespeare: perhaps 15,000 items. If, however, the count starts with the c.3,000 words in lists used for the early stages in the learning of English as a foreign language, such a total is soon exceeded simply by adding compounds, derivatives, phrasal verbs, abbreviations, and fixed phrases commonly associated with those c.3,000 words. For example, words formed on run alone include runner, running, run in, run out, run on, run off, run up, run down, runway, gun-runner, all with meanings and uses that qualify them as distinct words. All or most of such items are well within the range of most users of English educated to around 16–18 years of age. A crude extrapolation of 10 × 3,000 suggests that such people are familiar with some 30,000 such items, or twice the above estimate. Bringing in many everyday words not in the basic 3,000, and applying the same multiplier, soon takes the average person to double or treble this number without discomfort, every personal ‘list’ of words and wordlike items differing from every other.
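The arithmetic of this extrapolation can be set out in a short Python sketch. The variable and function names are hypothetical. The list of forms built on run is copied from the paragraph above, and its length happens to match the multiplier of 10, although the text does not say that the multiplier was derived from it.

```python
# Figures taken from the paragraph above.
BASE_LIST_SIZE = 3_000      # words in early-stage EFL lists
MULTIPLIER = 10             # the crude multiplier used in the text
POPULAR_ESTIMATE = 15_000   # the figure often asserted for the average person

# Items formed on "run", as listed in the text.
RUN_FAMILY = [
    "runner", "running", "run in", "run out", "run on",
    "run off", "run up", "run down", "runway", "gun-runner",
]


def extrapolate(base_size, multiplier):
    """Return the crude estimate: base list size times multiplier."""
    return base_size * multiplier


familiar_items = extrapolate(BASE_LIST_SIZE, MULTIPLIER)
print(len(RUN_FAMILY))                     # items listed for "run"
print(familiar_items)                      # crude total of familiar items
print(familiar_items / POPULAR_ESTIMATE)   # ratio to the popular estimate
```

The result is only as good as the multiplier: a different view of which compounds, derivatives, and phrasal verbs qualify as distinct words would give a different total.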

Active and passive vocabulary

When teachers and linguists discuss the words people know, a distinction is commonly made between an active or productive vocabulary (what one can use) and a passive or receptive vocabulary (what one can recognize). The passive vocabulary is larger than the active, and the dividing line between the two is impossible to establish. All such terms and statements founder on the rock of what is meant by ‘word’ and ‘vocabulary’. Lexical skills go well beyond the simplicities of printed words with white space on either side. These skills include knowledge of the senses of words. To take this knowledge into account would multiply what an individual knows many times over, because common words like head and foot have over a dozen important senses each, and many nuances. A person who ‘knows’ 50,000 ‘words’, each with an average of five clear-cut senses, is actively or passively acquainted with 250,000 nuggets of lexical information. Such an estimate, however crude it may be, is decidedly impressive.

See: (1) FREQUENCY COUNT, VOCABULARY CONTROL. (2) The vocabulary sections of entries for major varieties of English, such as CANADIAN ENGLISH.


VOCABULARY CONTROL. A term in applied LINGUISTICS for the organization of WORDS into groups and levels, especially as the outcome of FREQUENCY COUNTS and in the form of word lists intended to help in writing, reading, and learning languages. Counting words and creating word lists is a complex task, and for useful results requires an initial conception of ‘word’ for the purpose in hand. Problems regarding what to count as a word include: (1) The orthographic problem of spelling variants such as colour and color. (2) The homonymic problem of identical forms such as bear (animal) and bear (carry). (3) The homographic problem of identical forms such as wind (air on the move) and wind (to turn, twist). (4) The phonological problem of statistics relating to spoken (and informally written) language (do items like 'll and n't count as will and not, are they special events to be counted on their own, or are they parts of units like I'll and didn't, to be counted separately?). (5) The morphological problem of the forms of be (are be, am, art, is, are, was, were different words or realizations of the same word?). (6) The lexical problems of counting COMPOUNDS and distinguishing them from ATTRIBUTIVE forms (is a key decision one word or two, in the same way that a keyhole or key-hole or key hole is one word?). (7) The statistical problem that, even granted that there is a utilitarian solution to the preceding problem, should the count include only agreed compounds or both compounds and their elements (so that for keyhole one counts keyhole, key, hole)? (8) The grammatical problem of particles and prefixes (does one count the up in put up with alongside the up in up the hill, and with the up in uproot?). (9) The onomastic problem of personal and place-names (are Manchester, Manila, and Manitoba, etc., to be counted as words simply because they appear in texts?). (10) The polysemic problem of deciding whether to count fire in a grate and fire at an artillery range as the same or a different word. (11) The lexicographical problem of how to describe and list the findings of any such survey, so that people of different experience can see and appreciate what has been counted. After all of these, there are two further problems: how to train the personnel and program the computer so that such highly sophisticated work can be brought to a successful conclusion. See HOMONYM, POLYSEMY, VARIANT.
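How three of these decisions (problems 1, 4, and 5) alter a count can be shown in a minimal Python sketch. It is an illustration under stated assumptions, not a method taken from any actual frequency count: the sample sentence, the lookup tables, and the names tokenize, expand, and count_types are all hypothetical, and the handling of contractions is crude (it would mishandle can't and won't).

```python
import re

# Hypothetical lookup tables, for illustration only.
SPELLING_VARIANTS = {"color": "colour"}            # problem (1)
BE_FORMS = {"be", "am", "art", "is", "are", "was", "were"}  # problem (5)

SAMPLE = "I'll paint it. The colour is red; the color was red. I didn't paint it."


def tokenize(text):
    """Lower-case the text and keep forms like i'll and didn't as single tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def expand(token):
    """Split 'll and n't off a token (problem 4); a crude rule, see lead-in."""
    if token.endswith("n't"):
        return [token[:-3], "not"]
    if token.endswith("'ll"):
        return [token[:-3], "will"]
    return [token]


def count_types(text, merge_spelling=False, merge_be=False,
                expand_contractions=False):
    """Count distinct word forms under the chosen counting decisions."""
    types = set()
    for token in tokenize(text):
        parts = expand(token) if expand_contractions else [token]
        for part in parts:
            if merge_spelling:
                part = SPELLING_VARIANTS.get(part, part)
            if merge_be and part in BE_FORMS:
                part = "be"
            types.add(part)
    return len(types)


print(count_types(SAMPLE))
print(count_types(SAMPLE, merge_spelling=True))
print(count_types(SAMPLE, merge_be=True))
print(count_types(SAMPLE, expand_contractions=True))
print(count_types(SAMPLE, merge_spelling=True, merge_be=True,
                  expand_contractions=True))
```

Even on a three-sentence sample the total moves with each decision, which is why the initial conception of ‘word’ has to be fixed before counting begins.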


vo·cab·u·lar·y / vōˈkabyəˌlerē; vi-/ • n. (pl. -lar·ies) the body of words used in a particular language. ∎  a part of such a body of words used on a particular occasion or in a particular sphere: the vocabulary of law | the term became part of business vocabulary. ∎  the body of words known to an individual person: he had a wide vocabulary. ∎  a list of difficult or unfamiliar words with an explanation of their meanings, accompanying a piece of specialist or foreign-language text. ∎  a range of artistic or stylistic forms, techniques, or movements: dance companies have their own vocabularies of movement.


a collection or list of words, 1532.

Examples: vocabulary of arms, 1862; of new denominations, 1821; of dishes, 1825; a vocabulary to the understanding, 1662.