Machine vision, also referred to as computer or robot vision, is a term that describes the many techniques by which machines visually sense the physical world. These techniques, used primarily for monitoring industrial manufacturing, are becoming increasingly popular as today’s manufacturing environments become more automated and quality control standards increase. Whether the task is to sort and assemble a group of machined parts, to determine if a label has been placed properly on a soda bottle, or to check for microscopic defects in an automotive door panel, machine vision plays an essential role.
Machine vision systems tend to mimic the human vision system. An optical sensor and electronic main processor typically act as the eyes and brain and, as in humans, they work together to interpret visual information. Also like their human counterparts, the sensor and processor are each somewhat responsible for filtering out the useless information within the scene before it is analyzed. This reduces the overall processing requirements and allows humans and well-designed machine vision systems to make decisions based on visual information very quickly.
Filtering the information within a scene begins with matching the vision system to its industrial requirements. Just as humans can adjust to a variety of situations by dilating their pupils or by tuning themselves to look for a particular shape or color, machine vision systems must also be somewhat flexible. Typically, however, the most efficient system is one that is designed with only limited applications in mind. For this reason, machine vision designers have developed a variety of application-specific techniques and systems to meet the speed and accuracy standards that modern industry demands.
The simplest type of vision system is one that senses only along a line. These one-dimensional sensors function best when used to simply detect the presence or absence of an object, and generally make no further attempt at interpretation. Typically, these are used in applications such as automated assembly line counters, where perhaps the number of bottles passing by a particular point needs to be monitored. The light passing from one side of a conveyor belt to a detector on the other side is occluded when a bottle passes by. This break in the light signal is then recorded electronically and another unit is added to the total count.
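The counting logic of such a beam-break sensor can be sketched in a few lines. The example below is only an illustration: it assumes the detector output has already been sampled into a sequence of true/false readings (a hypothetical format), and it counts each transition from lit to blocked as one object.

```python
def count_objects(beam_samples):
    """Count objects from a sequence of beam readings.

    beam_samples: iterable of booleans, True while the detector sees the
    light beam and False while something blocks it (hypothetical format).
    Each transition from lit to blocked is counted as one object.
    """
    count = 0
    previously_lit = True  # assumption: the beam starts unobstructed
    for lit in beam_samples:
        if previously_lit and not lit:
            count += 1  # the beam has just been interrupted
        previously_lit = lit
    return count


# The beam is interrupted three separate times in this sample sequence.
samples = [True, False, False, True, True, False,
           True, False, False, False, True]
bottle_count = count_objects(samples)
```

Note that the function has no way of knowing what interrupted the beam, which is exactly the limitation of one-dimensional sensing.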
Along with the simplicity of this system, unfortunately, comes its limited applicability. This system (like most other one-dimensional scanners) is not very good at distinguishing between different objects. Two different-shaped bottles, for example, cannot be distinguished from one another. A pickle jar, a hand, or a large flying insect may trigger this system to record the break in light as another bottle. Although they tend to be inexpensive, the limited abilities of one-dimensional vision systems make them popular choices for only very specific, well-controlled applications. For the more sophisticated sensing requirements of most vision applications, two-dimensional techniques need to be employed.
The most common type of machine vision system is one that is responsible for examining situations two-dimensionally. These two-dimensional systems view a scene in much the same way that a person views a photograph. Cues such as shapes, shadows, textures, glares, and colors within the scene allow this type of vision system to be very good at making decisions based on what essentially amounts to a flat picture.
Like humans, most machine vision systems are designed to use shape as the defining characteristic for an object. For these systems then, it is important to make an object’s shape as easy to isolate as possible. Both proper illumination of the object and efficient computer processing of the image of that object are necessary.
Illuminating from behind is the most straightforward optical way to make an object’s shape stand out. The resulting silhouette effect is the same as that which occurs when a moth is seen flying in front of a bright light. To an observer a few feet away, the moth’s colors pale, and the contrast between the moth and the background is enhanced so its shape and size become the only discernible characteristics. For a machine vision system, an image of this silhouette is much easier to process than a conventional image.
Oftentimes, unfortunately, optical techniques alone do not make an object’s shape stand out clearly enough. For these situations, computer software-based techniques are generally employed. These routines perform mathematical operations on the electronic image of the scene to convert it into an image that is easier to interpret. Commonly used software routines can enhance the contrast of an image, trace out the edges of objects within an image, and group objects within an image.
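Two of these routines, contrast enhancement and edge tracing, can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows of integer gray levels from 0 to 255; the function names and the simple neighbor-difference edge test are illustrative choices, not the method of any particular vision package.

```python
def stretch_contrast(image):
    """Linearly rescale gray levels so they span the full 0-255 range."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # A perfectly flat image has no contrast to stretch.
        return [[0 for _ in row] for row in image]
    scale = 255.0 / (hi - lo)
    return [[round((p - lo) * scale) for p in row] for row in image]


def find_edges(image, threshold):
    """Mark pixels whose right or lower neighbor differs by more than threshold."""
    rows, cols = len(image), len(image[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dx = abs(image[r][c + 1] - image[r][c]) if c + 1 < cols else 0
            dy = abs(image[r + 1][c] - image[r][c]) if r + 1 < rows else 0
            if max(dx, dy) > threshold:
                edges[r][c] = 1
    return edges


# A faint square (gray level 120) on a slightly darker background (100).
faint = [
    [100, 100, 100, 100],
    [100, 120, 120, 100],
    [100, 120, 120, 100],
    [100, 100, 100, 100],
]
stretched = stretch_contrast(faint)   # background becomes 0, square becomes 255
outline = find_edges(stretched, 128)  # 1 where the gray level changes sharply
```

In the raw image the square differs from its background by only 20 gray levels, so the same edge threshold would find nothing; after stretching, the boundary is unmistakable.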
Another defining characteristic for an object is its surface reflectivity. This cue is most often used for distinguishing between objects made from different materials and for distinguishing between objects of the same material but with different surface finishes (such as painted or unpainted objects). At the extremes, an object is considered either a specular reflector or a diffuse reflector. If it is specular, it tends to act like a mirror, with most of the light bouncing off at the same angle at which it struck the surface (with respect to a surface normal). This is the case for a finely polished piece of metal, a smooth pool of water, or even oily skin to some extent. If, on the other hand, the surface is diffuse, light is reflected more or less evenly in all directions. This effect is caused by roughness or very slight surface irregularities, and is the reason objects made from materials like wood or cloth generally appear softer in tone and can be distinguished from those made from metal.
Often an object’s color or color pattern can serve as its identifying feature. Every object has a color signature that is determined by its material and its surface coating. Spectroscopic (color sensing) machine vision systems are cued to make decisions based on this feature and typically operate in one of two ways. Both techniques illuminate an object with white light, but one looks at the light reflected by the object while the other looks at the light transmitted through the object for identification.
The simplest color sensing systems are responsible for monitoring only one color across a scene. These are typically used in quality control applications such as monitoring of paints, to ensure consistency between batches made at different times. More sophisticated color sensors look at the color distribution across a two-dimensional image. These systems are capable of complex analysis and can be used for checking multi-colored labels or for identifying multi-colored objects by their color patterns.
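A single-color batch check of the kind described above might be sketched as follows. The example assumes each measurement is a red, green, blue triple and uses straight-line distance between triples as the measure of difference; this is a deliberately crude stand-in, and the tolerance value is hypothetical.

```python
import math


def color_distance(rgb_a, rgb_b):
    """Euclidean distance between two (red, green, blue) triples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(rgb_a, rgb_b)))


def batch_matches(reference_rgb, sample_rgb, tolerance=10.0):
    """Return True if the sample color lies within tolerance of the reference."""
    return color_distance(reference_rgb, sample_rgb) <= tolerance


reference = (200, 30, 30)                              # approved batch of red paint
close_batch = batch_matches(reference, (203, 34, 30))  # small drift
off_batch = batch_matches(reference, (230, 30, 30))    # noticeably different
```

A system that checks multi-colored labels would apply a comparison of this kind region by region across the two-dimensional image rather than to a single reading.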
The most advanced machine vision systems typically involve acquisition and interpretation of three-dimensional information. These systems often require more sophisticated illumination and processing techniques than one- and two-dimensional systems, but their results can be riveting. These scanners can characterize an object’s shape three-dimensionally to tolerances of far less than a millimeter. This allows them to do things such as identify three-dimensional object orientation (important for assembly applications), check for subtle surface deformations in high precision machined parts, and generate detailed surface maps used by computer-controlled machining systems to create clones of the scanned object.
The simplest way to extract three-dimensional information from a scene is to do it one point at a time, using a method known as point triangulation. The working principle behind this method is based on simple trigonometry. A right triangle is formed between a laser, a video camera, and the laser’s spot on the object. Measuring the camera-to-laser distance (the baseline) and the angle between that baseline and the camera’s line of sight to the laser spot allows for easy determination of the camera-to-object distance (for a particular object point). This range gives the third dimension, and can be determined for every object point by scanning the laser beam across the surface.
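The trigonometry can be made concrete with a short sketch. It assumes the right angle of the triangle sits at the laser, so that the beam leaves perpendicular to the camera-to-laser baseline; real scanners may use other geometries, and the function and parameter names here are illustrative.

```python
import math


def triangulate(baseline, camera_angle_deg):
    """Range to a laser spot, assuming a right angle at the laser.

    baseline: camera-to-laser distance.
    camera_angle_deg: angle at the camera between the baseline and the
        line of sight to the laser spot, in degrees.
    Returns (laser_to_spot, camera_to_spot) in the units of baseline.
    """
    if not 0.0 < camera_angle_deg < 90.0:
        raise ValueError("camera angle must be between 0 and 90 degrees")
    theta = math.radians(camera_angle_deg)
    laser_to_spot = baseline * math.tan(theta)   # side opposite the camera angle
    camera_to_spot = baseline / math.cos(theta)  # hypotenuse of the triangle
    return laser_to_spot, camera_to_spot


# A 0.1 m baseline with the spot seen 45 degrees off the baseline.
laser_range, camera_range = triangulate(0.1, 45.0)
```

With a 45 degree viewing angle the laser-to-spot range equals the baseline itself, and the camera-to-spot distance is longer by a factor of the square root of two.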
This is a very powerful technique and it is used quite commonly for three-dimensional scanning because of its straightforwardness. The problem for this type of system, though, is that it has a relatively slow scan speed. A typical image may contain over a quarter of a million points. Recording even a fraction of these points, one at a time, tends to be quite time-consuming. As with a long-exposure photograph of a moving car, the resulting three-dimensional image can be blurred unless the object remains nearly stationary during the scan. To help overcome this problem, a technique known as line scanning is often used. Line scanning, or line triangulation, is a simple extension of point triangulation. In this case, however, the projected light is a line and an entire strip of the surface is scanned at a time. Although the computational algorithms are somewhat more complex for this method, the time required to scan an object is substantially less.
A further increase in image-capture speed can be achieved through the use of more sophisticated illumination, known as structured illumination. This illumination can take on many forms, but is typically an array of dots or a set of projected lines. An image of the structured illumination projected onto an object can then be processed in much the same way as triangulation data, but a full frame at a time. It generally only takes a handful of these full-frame images to describe a surface three-dimensionally, making structured illumination techniques extremely fast.
One particularly interesting type of three-dimensional scanner that uses structured illumination is based on a phenomenon known as the moiré effect. The moiré effect is a fascinating visual display that often occurs when two periodic patterns are overlaid. It can easily be seen in everyday experiences such as overlapping window curtains or on television when a character wears a shirt with stripes that have nearly the same spacing as the television lines. Moiré scanners typically operate by projecting a set of lines onto an object and then viewing that object through a transparency containing another set of lines. The resulting moiré pattern is an array of curves that trace out paths of equal object height, much like elevation lines on a topographical map. This image can then be used directly to check for surface features or combined with a few others and processed to give a true three-dimensional plot of the object.
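The reason nearly matched spacings produce such a visible pattern can be shown with the simplest case: two sets of parallel, equally spaced lines laid over one another. The sketch below computes the spacing of the resulting coarse fringes. It covers only this flat, parallel-line case and does not attempt the height reconstruction performed by an actual scanner; the function name is illustrative.

```python
def moire_fringe_spacing(pitch_a, pitch_b):
    """Spacing of the fringes seen when two parallel line patterns overlap.

    pitch_a, pitch_b: line-to-line spacing of each pattern, in the same units.
    The two patterns drift in and out of step once every
    pitch_a * pitch_b / |pitch_a - pitch_b| units.
    """
    if pitch_a == pitch_b:
        return float("inf")  # identical patterns never drift out of step
    return pitch_a * pitch_b / abs(pitch_a - pitch_b)


# Shirt stripes 1.1 mm apart viewed against display lines 1.0 mm apart.
fringe_mm = moire_fringe_spacing(1.0, 1.1)
```

The closer the two spacings, the wider the fringes become, which is why stripes that nearly match the television lines produce the most conspicuous effect.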
The semiconductor industry has become the largest user of automated vision systems. A silicon wafer that will become hundreds of microchips starts as a finely machined disc about 7.9 in (200 mm) or smaller in diameter. Before the disc is split into individual chips, the wafer undergoes dozens of processing steps, some of which produce changes that are indiscernible to the human eye. To ensure that each wafer passes through these steps in the correct sequence, sorting systems using optical character recognition (OCR) identify each wafer, sort it in a clean room environment, and report the results to a central network.
For manufacturing, one can classify machine vision applications into four categories: gauging, inspection, alignment, and identification. Gauging refers to measuring critical distances on manufactured parts. Vision software can quickly and consistently measure certain features on a component and determine whether the part meets tolerance specifications. Inspection means looking to see if a mechanical part has been assembled properly, for example, by checking the pins in an electronic connector for missing or bent pins. Alignment often involves using pattern-matching software to locate a reference object and then physically moving the object to within some tolerance of a desired position. Identification refers to classifying manufactured items. In an automotive assembly plant, parts often need to be identified or sorted using vision, such as tires by their tread pattern or by their inner and outer diameters.
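The decision step in gauging reduces to comparing each measured feature against its nominal value and allowed deviation. The following sketch assumes the vision software has already produced the measurements; the feature names and the numeric values in the specification are hypothetical.

```python
def within_tolerance(measured, nominal, tolerance):
    """True if the measurement deviates from nominal by no more than tolerance."""
    return abs(measured - nominal) <= tolerance


def gauge_part(measurements, spec):
    """Return the names of features that are missing or out of tolerance.

    measurements: dict mapping feature name to measured value.
    spec: dict mapping feature name to a (nominal, tolerance) pair.
    """
    failures = []
    for name, (nominal, tolerance) in spec.items():
        if name not in measurements:
            failures.append(name)  # a feature that was not measured fails
        elif not within_tolerance(measurements[name], nominal, tolerance):
            failures.append(name)
    return failures


# Hypothetical tire specification, in millimeters.
tire_spec = {
    "inner_diameter_mm": (406.4, 0.5),
    "outer_diameter_mm": (632.0, 1.0),
}
measured_tire = {"inner_diameter_mm": 406.6, "outer_diameter_mm": 634.0}
rejected_features = gauge_part(measured_tire, tire_spec)
```

An empty result means the part meets its tolerance specifications; here the outer diameter is 2 mm over nominal and is the only feature reported.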
Another application for machine vision that is becoming more popular as the technology improves is biometrics, or the identification of a person through readily accessible and reliably unique physical characteristics. These features are measured by sensors and compared against a computer system’s stored values for the same characteristics. Some commonly used identifiers include hand proportions, facial image, retinal image, fingerprints, and voice print. The advantages of biometric identifiers are that they cannot be lent like a physical key or forgotten like a password. A leading concern in the development of such applications, however, is how to avoid rejecting valid users or approving impostors. Such devices would be applicable for security systems at banks, offices, and Internet network applications.
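The tension between rejecting valid users and approving impostors can be illustrated with a small sketch. It assumes the system reduces each comparison to a single similarity score, accepted when it reaches a threshold; the scores below are invented for illustration and are not measured data.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_reject_rate, false_accept_rate) at a given threshold.

    genuine_scores: similarity scores from valid users (higher is more similar).
    impostor_scores: similarity scores from impostors.
    A comparison is accepted when its score is at least the threshold.
    """
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    return (false_rejects / len(genuine_scores),
            false_accepts / len(impostor_scores))


# Invented scores, for illustration only.
genuine = [0.9, 0.8, 0.75, 0.6]
impostor = [0.1, 0.3, 0.65, 0.2]

strict = error_rates(genuine, impostor, 0.7)   # turns away one valid user
lenient = error_rates(genuine, impostor, 0.5)  # admits one impostor
```

Raising the threshold shuts out the impostor at the cost of a valid user, and lowering it does the reverse, so a designer must choose the balance that suits the application.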
As an example of a biometric device, finger scanners use a scanning device (called a reader) to scan a finger. Computer software then converts the scanned information into digital form and identifies specific points of data as match points. The match points are processed and converted into a numeric format, which a database stores for later comparisons. When someone later places a finger on the scanner, the new scan is compared with the stored database value. The computer then indicates either that the points match (a genuine user) or that they do not match (an impostor).
Deoxyribonucleic acid (DNA) is also a way to identify individuals using biometrics. A sample of blood, hair, semen, skin, or other body material is taken from a person and analyzed in a laboratory. It is generally only used in criminal investigations.
Though biometrics is still in its early developmental stage, many people feel that it will turn into a very important technology in such fields as law enforcement, security, electronic commerce, and Internet communications. Along with national security applications, everyday life is expected to include biometric devices. Medicare patients may eventually scan a finger to confirm their medical information. Livestock may be identified with retinal scanning tied to a global positioning system, so that individual cows, pigs, and other animals with dangerous diseases can be tracked as they move through the United States. Grocery stores and other retailers may use biometric machines to prevent check and credit card fraud.
Diffuse reflector: A rough surface that reflects the light hitting it in all directions. Such a surface can be thought of as consisting of many small-scale flat facets.
Optical character recognition: The process of applying pattern-matching methods to character shapes that have been read into a computer to determine the character that the shapes represent.
Optical sensor: A sensor that measures light characteristics. These sensors either measure intensity change in one or more light beams or look at phase changes in the light beams by causing them to interact or interfere with one another.
Processor: The computer brain, or the main component that makes a computer work. It is typically a microprocessor on a single silicon chip.
Specular reflector: A smooth surface that reflects light so that the angle of reflection equals the angle of incidence.
Surface reflectivity: The degree to which light or other visible radiation is reflected or scattered by the outer surface of an object.
Scott Christian Cahall