Optical Character Recognition

Optical character recognition (OCR) uses a scanning device to read printed, typed, or handwritten marks on a source document and convert them into a computer-usable form. The technology recognizes characters using the optical properties of the equipment and media. OCR improves the accuracy of data collection and reduces the time human workers spend entering data.

Although OCR is used for high-speed data entry, it did not begin with the computer industry. The beginnings of OCR can be traced back to 1809 when the first patents for devices to aid the blind were awarded. In 1912 Emmanuel Goldberg patented a machine that read characters, converted them into standard telegraph code, and then transmitted telegraphic messages over wires without human intervention. In 1914 Fournier D'Albe invented an OCR device called the optophone that produced sounds. Each sound corresponded to a specific letter or character. After learning the character equivalent for various sounds, visually impaired people were able to "read" the printed material. Developments in OCR continued throughout the 1930s, becoming more important with the beginnings of the computer industry in the 1940s. OCR development in the 1950s attempted to address the needs of the business world.

Methods for Recording Data

OCR requires hardware, in the form of a scanning device, and software to convert the images and character data from the source document into a digital form. Three primary methods are used to record data on a source document to be read by an OCR device. These include optically readable marks, bar codes, and optically readable characters, including handwritten characters.

Optical mark recognition (OMR) uses OMR paper, sometimes called a "mark sense form." This paper has a series of rectangular shapes that are filled in using a pencil. The completed form is then fed through a scanning device that reads the filled-in rectangles. The software of the OMR scanning device can perform an elementary statistical analysis of the data. OMR technology is commonly used to score standardized tests, such as the Scholastic Aptitude Test (SAT) and the Graduate Management Admission Test (GMAT), quickly and accurately.
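The reading and scoring step can be illustrated with a minimal sketch. It assumes the scanner has already measured how dark each rectangle is, on a scale from 0.0 (blank) to 1.0 (fully filled). The 0.5 threshold, the five-choice layout, and the function names are illustrative assumptions, not details taken from any particular OMR product.

```python
# Hypothetical OMR scoring sketch; threshold and data layout are assumptions.
FILL_THRESHOLD = 0.5  # darkness at or above which a rectangle counts as filled


def read_marks(darkness_rows, choices="ABCDE"):
    """Return the marked choice for each question, or None if blank or ambiguous."""
    answers = []
    for row in darkness_rows:
        marked = [choices[i] for i, d in enumerate(row) if d >= FILL_THRESHOLD]
        # Exactly one filled rectangle is a valid answer; anything else is rejected.
        answers.append(marked[0] if len(marked) == 1 else None)
    return answers


def score(answers, key):
    """Count the answers that agree with the answer key."""
    return sum(1 for a, k in zip(answers, key) if a == k)


sheet = [
    [0.05, 0.91, 0.04, 0.03, 0.02],  # B is filled in
    [0.88, 0.06, 0.02, 0.05, 0.01],  # A is filled in
    [0.04, 0.07, 0.03, 0.02, 0.06],  # left blank
    [0.02, 0.85, 0.90, 0.03, 0.04],  # two marks, so ambiguous
]
answers = read_marks(sheet)
print(answers)                 # ['B', 'A', None, None]
print(score(answers, "BACD"))  # 2
```

Rejecting rows with zero or several marks, rather than guessing, is one simple way to keep stray pencil marks from being scored as answers.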

Bar codes are zebra-striped marks of various widths that appear on, or are attached to, most manufactured retail products. The most common bar code is the 12-digit Universal Product Code (UPC). Other kinds of bar code systems are used in a variety of places, from overnight mail packages to airplane luggage tags. The width and combination of the stripes on the bar code represent data. A bar code reader consists of a scanner and a decoder. The scanner emits a beam of light that is swept past the bar code and senses the reflected light to distinguish between the bars and spaces. A photo detector converts the spaces into an electrical signal and the bars into the absence of an electrical signal. The decoder analyzes the signal patterns to validate and interpret the corresponding data.
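Two of the steps just described can be sketched in a few lines: turning the photo detector's signal into bar and space widths, and validating a decoded number. The sketch assumes reflectance samples between 0.0 and 1.0 and the 12-digit UPC-A layout, whose last digit is a check digit. The function names and the 0.5 threshold are illustrative assumptions, and the step that maps widths to digits is omitted.

```python
def run_lengths(reflectance, threshold=0.5):
    """Group samples into (kind, width) runs.

    High reflectance means a space (signal present); low means a bar (no signal).
    """
    runs = []
    for sample in reflectance:
        kind = "space" if sample >= threshold else "bar"
        if runs and runs[-1][0] == kind:
            runs[-1] = (kind, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((kind, 1))              # start a new run
    return runs


def upc_a_is_valid(code):
    """Check the final digit of a 12-digit UPC-A number against the other eleven."""
    if len(code) != 12 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(c) for c in code]
    # Digits in odd positions (1st, 3rd, ..., 11th) are weighted by 3.
    total = 3 * sum(digits[0:11:2]) + sum(digits[1:11:2])
    return (10 - total % 10) % 10 == digits[11]


scan = [0.9, 0.9, 0.1, 0.1, 0.1, 0.8, 0.2, 0.9, 0.9]
print(run_lengths(scan))
# [('space', 2), ('bar', 3), ('space', 1), ('bar', 1), ('space', 2)]
print(upc_a_is_valid("036000291452"))  # True
print(upc_a_is_valid("036000291453"))  # False
```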

Some OCR readers can convert typed and handwritten documents into digital data. These readers scan the shape of a character on a document, compare the scanned character with a pre-defined shape, and convert the character into its corresponding bit pattern for storage in main computer memory. This technology is still in development; handwritten documents do not scan with 100 percent accuracy.
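The compare-and-convert step can be sketched as simple template matching. The tiny 3-by-5 bitmaps, the three-letter template set, and the 0.8 acceptance score below are assumptions chosen for illustration. Practical readers use far larger templates or other recognition methods.

```python
# Hypothetical pre-defined shapes: "1" is a dark pixel, "0" is a light pixel.
TEMPLATES = {
    "I": ["111", "010", "010", "010", "111"],
    "L": ["100", "100", "100", "100", "111"],
    "T": ["111", "010", "010", "010", "010"],
}


def similarity(a, b):
    """Fraction of pixels that agree between two bitmaps of the same size."""
    cells = [pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(cells) / len(cells)


def recognize(bitmap, min_score=0.8):
    """Return (character, 8-bit pattern) for the best match, or None if too poor."""
    best_char, best_score = max(
        ((char, similarity(bitmap, shape)) for char, shape in TEMPLATES.items()),
        key=lambda pair: pair[1],
    )
    if best_score < min_score:
        return None
    return best_char, format(ord(best_char), "08b")


noisy_l = ["100", "100", "110", "100", "111"]  # an "L" with one stray dark pixel
print(recognize(noisy_l))        # ('L', '01001100')
print(recognize(["000"] * 5))    # None
```

Returning None when no template scores well reflects the point above that recognition is not perfectly accurate: a reader can flag a doubtful character for human review instead of storing a wrong one.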

A special type of OCR, magnetic ink character recognition (MICR), is used by several industries, including banks. The enormous amount of paper in the form of checks, loans, and bank statements, combined with the need for accurate and quick processing, prompted the banking industry to seek new ways to manage the flow of paper. In 1956 the American Bankers Association recommended adopting magnetic ink for high-speed automatic character recognition, resulting in MICR. With MICR, data are recorded using a magnetic ink that is readable by either a scanning device or a person. On bank checks, which represent the most common use of MICR, characters in magnetic ink detail the bank's identification number, the individual's account number, and the check number. Checks can be scanned and the data are quickly and accurately read into a computer for further processing.
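The article does not say how a reader confirms that it has read the magnetic-ink digits correctly. One common safeguard, shown here as an illustrative assumption rather than as part of the MICR process described above, is the checksum built into nine-digit U.S. bank routing numbers: the digits are weighted 3, 7, 1 in rotation, and the weighted sum must be a multiple of 10. The function name is hypothetical.

```python
def routing_number_is_valid(routing):
    """Apply the 3-7-1 weighted checksum used by nine-digit U.S. routing numbers."""
    if len(routing) != 9 or not (routing.isascii() and routing.isdigit()):
        return False
    weights = (3, 7, 1) * 3
    total = sum(w * int(d) for w, d in zip(weights, routing))
    return total % 10 == 0


print(routing_number_is_valid("021000021"))  # True: weighted sum is 30
print(routing_number_is_valid("021000022"))  # False: weighted sum is 31
```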

Another use of OCR allows printed documents, such as text, images, or photographs, to be stored in a computer. Either hand-held scanners or page scanners are used to convert physical documents into computer-readable forms. Page scanners are stationary. The page is typically placed face down on the glass plate of the scanner and then scanned. Hand-held scanners are manually moved over the document. Both types of scanners can convert monochrome or color pictures, forms, text, and other images into machine-readable digital data. The data can then be modified, saved, and distributed over computer networks.

See also Artificial Intelligence; Input Devices; Pattern Recognition; Virtual Reality; Virtual Reality in Education.

Terri L. Lenox and Charles R. Woratschek

"Optical Character Recognition." Computer Sciences. Encyclopedia.com. (August 21, 2017). http://www.encyclopedia.com/computing/news-wires-white-papers-and-books/optical-character-recognition

optical character recognition

optical character recognition (OCR), method for the machine-reading of typeset, typed, and, in some cases, hand-printed letters, numbers, and symbols using optical sensing and a computer. The light reflected by a printed text, for example, is recorded as patterns of light and dark areas by an array of photoelectric cells in an optical scanner. A computer program analyzes the patterns and identifies the characters they represent, with some tolerance for less than perfect and uniform text. OCR is also used to produce text files from computer files that contain images of alphanumeric characters, such as those produced by fax transmissions. See also computer graphics; pen-based computer; personal digital assistant.
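The first stage described here, recording reflected light as a pattern of light and dark areas, can be sketched as follows. The readings are assumed to be brightness values from a small grid of photoelectric cells. The midpoint threshold and the function name are illustrative assumptions, not a description of any specific scanner.

```python
def binarize(readings):
    """Mark each cell dark ('#') or light ('.') using a midpoint threshold."""
    darkest = min(min(row) for row in readings)
    brightest = max(max(row) for row in readings)
    threshold = (darkest + brightest) / 2
    return ["".join("#" if value < threshold else "." for value in row)
            for row in readings]


# Brightness readings for a hand-printed "L": ink reflects little light.
readings = [
    [20, 200, 210],
    [30, 190, 205],
    [25, 215, 220],
    [35, 40, 30],
]
for line in binarize(readings):
    print(line)
# #..
# #..
# #..
# ###
```

Setting the threshold from the readings themselves, rather than using a fixed value, is one simple way to tolerate pages that are lighter or darker overall.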

"optical character recognition." The Columbia Encyclopedia, 6th ed. Encyclopedia.com. (August 21, 2017). http://www.encyclopedia.com/reference/encyclopedias-almanacs-transcripts-and-maps/optical-character-recognition

optical character recognition

op·ti·cal char·ac·ter rec·og·ni·tion (abbr.: OCR) • n. the identification of printed characters using photoelectric devices and computer software.

"optical character recognition." The Oxford Pocket Dictionary of Current English. Encyclopedia.com. (August 21, 2017). http://www.encyclopedia.com/humanities/dictionaries-thesauruses-pictures-and-press-releases/optical-character-recognition
