Genetic polymorphisms are different forms of a DNA sequence. "Poly" means many, and "morph" means form. Polymorphisms are a type of genetic diversity within a population's gene pool. They can be used to map (locate) genes such as those causing a disease, and they can help match two samples of DNA to determine if they come from the same source. Depending on its exact nature, a polymorphism may or may not affect biological function.
Coding and Noncoding Sequences
The amino acid sequence of proteins is directed by the information found in genes, which in turn are made up of DNA. Genes that have different DNA sequences are said to be polymorphic. These different gene forms are called alleles, exemplified by the alleles that control eye color. When alleles result in differences in the amino acid sequence of a protein, the proteins encoded by alleles are called isoforms. The position of the gene on a chromosome is its locus (plural, loci). More generally, a locus refers to any position on a chromosome, whether or not a gene is located there.
Polymorphisms arise through mutation. The mutation may be due to a change from one type of nucleotide to another, an insertion or deletion (collectively known as indels), or a rearrangement of nucleotides. Once formed, a polymorphism can be inherited like any other DNA sequence, allowing its inheritance to be tracked from parent to child.
Polymorphisms are also found outside of genes, in the vast quantity of DNA that does not code for protein. Indeed, regions of DNA that do not code for proteins tend to have more polymorphisms. This is because changes in DNA sequences that encode proteins may harm the individual that carries them, so such changes tend to be removed from the population by natural selection, whereas changes in noncoding DNA are more often tolerated. Polymorphisms that do not have any effect on the organism are said to be selectively neutral, since they do not affect its ability to survive and reproduce.
The process of determining an individual's genetic polymorphisms is known as genotyping. One of the earliest methods used in genotyping looked not at genes but at polymorphic proteins known as isoenzymes, or isozymes. Isoenzymes are different forms of a protein, with slightly different amino acid compositions. Since a protein's amino acid composition is genetically programmed by the DNA sequence that encodes it, analysis of isoenzymes surveys genetic polymorphism. Because these differences in amino acid composition can cause proteins to have different electrical charges, isoenzyme polymorphisms are assessed by extracting an organism's proteins and separating them using gel electrophoresis—a technique also used to study DNA polymorphisms.
In gel electrophoresis, an electric field is applied across a gel matrix, and molecules move through the matrix in response to the electric field. The gel matrix is a porous material, similar to Jell-O, that acts as a sieve and slows down large molecules more than small molecules. Isoenzymes move through the gel matrix according to their electrical charge and size, and are separated from each other on this basis. In this way, different isoenzymes can be identified.
Many tools for assessing DNA polymorphisms are now available. Some of these methods assess length polymorphism (indels), sequence polymorphism (base changes or rearrangements), or combinations of the two. DNA sequencing reveals all types of polymorphisms but is costly and labor-intensive. Gel electrophoresis is used in DNA analysis to both separate and sequence DNA. PCR (polymerase chain reaction) is frequently employed beforehand to produce large quantities of the DNA to be analyzed.
An early method of detecting DNA polymorphisms still in use employs restriction endonucleases. These bacterial enzymes cut DNA at specific recognition sequences. Restriction enzymes cleave DNA into a characteristic set of fragments that can be separated by gel electrophoresis. Some polymorphisms alter recognition sequences, so that the enzyme no longer recognizes a site or recognizes a new site. This results in a new set of DNA fragments that can be compared to others to detect the differences. These differences are called restriction fragment length polymorphisms (RFLPs).
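The logic of an RFLP comparison can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the DNA is treated as a single linear strand, the two allele sequences are invented for the example, and the function name digest is hypothetical. The enzyme modeled is EcoRI, which recognizes GAATTC and cuts after the first G.

```python
def digest(seq, site, cut_offset):
    """Cut seq at every occurrence of site and return the fragment
    lengths, largest first (the order they would appear on a gel,
    from the slowest-moving band to the fastest)."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


# Invented example alleles: allele_b differs from allele_a by a single
# base change (A to G) that destroys the second recognition site.
allele_a = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TTTT"
allele_b = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TTTT"

fragments_a = digest(allele_a, "GAATTC", 1)
fragments_b = digest(allele_b, "GAATTC", 1)
print(fragments_a, fragments_b)
```

Allele A yields three fragments and allele B only two, because the base change removes one cutting site. That difference in fragment pattern is what an RFLP analysis detects.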
STRs, VNTRs, and SNPs
Repetitive genetic elements are an important class of polymorphic DNA. These sequences consist of several repeats of a simple DNA sequence pattern; they typically do not encode a protein and are not under strict constraints of size or sequence. For example, the two bases cytosine (C) and adenine (A) may be found together multiple times, resulting in a "CACACACA" sequence. If another copy of this sequence were found as "CACA" (two CA repeats shorter), then this sequence would be polymorphic. Repetitive genetic elements include microsatellites, or STRs (short tandem repeats), and minisatellites, or VNTRs (variable number of tandem repeats), which are distinguished primarily on the basis of size and repeat pattern: the repeated sequence in a microsatellite ranges from two to six bases, while in a VNTR it ranges from eleven to sixty base pairs.
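The CA repeat example can be made concrete with a short Python sketch that counts how many times a repeat unit occurs back to back. The flanking sequences and the function name longest_repeat_run are invented for illustration; real STR genotyping measures the length of PCR products rather than reading repeats from a string.

```python
import re


def longest_repeat_run(seq, unit):
    """Return the largest number of consecutive copies of unit in seq."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


# Invented alleles: identical flanking DNA around a CA repeat of
# different length.
allele_1 = "GGT" + "CA" * 4 + "TTG"
allele_2 = "GGT" + "CA" * 2 + "TTG"

count_1 = longest_repeat_run(allele_1, "CA")
count_2 = longest_repeat_run(allele_2, "CA")
print(count_1, count_2, len(allele_1) - len(allele_2))
```

The two alleles carry four and two CA repeats, so they differ in length by four bases. A length difference of this kind is what lets the alleles be separated by gel electrophoresis.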
Differences in single base pairs, known as single nucleotide polymorphisms (SNPs), are a valuable class of polymorphism that can be detected by DNA sequencing, RFLP analysis, and other methods such as allele-specific PCR and allele-specific DNA hybridization. Many RFLPs are due to single nucleotide polymorphisms. There are hundreds of thousands of SNP loci throughout the human genome, making them especially valuable for mapping human disease genes.
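When two sequences of the same region are already aligned and of equal length, finding single-base differences amounts to a position-by-position comparison. The sketch below assumes exactly that (no insertions or deletions); the sequences and the function name find_snps are invented for illustration.

```python
def find_snps(reference, sample):
    """Return (position, reference_base, sample_base) for every
    position where the two aligned sequences differ. Positions are
    counted from 1."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Invented sequences that differ at one position.
reference = "ATGGCCTTAGAC"
sample = "ATGGCTTTAGAC"

snps = find_snps(reference, sample)
print(snps)
```

Here the only difference is a C in the reference and a T in the sample at position 6.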
Uses of Polymorphisms
The study of polymorphism has many uses in medicine, biological research, and law enforcement. Genetic diseases may be caused by a specific polymorphism. Scientists can look for these polymorphisms to determine whether a person is likely to develop the disease or is at risk of passing it on to his or her children. Besides being useful in identifying people at risk for a genetically based disease, knowledge of polymorphisms that cause disease can provide valuable insight into how the disease develops. Polymorphisms located near a disease gene can be used to find the gene itself, through mapping. In this process, researchers look for polymorphisms that are co-inherited with the disease. By finding linked polymorphisms on smaller and smaller regions of the chromosome, the chromosome region implicated in the disease can be progressively narrowed, and the responsible gene ultimately can be located.
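Co-inheritance can be quantified as a recombination fraction: the proportion of offspring in which the marker and the disease were not inherited together. The sketch below is a simplified illustration with invented data. It assumes a fully penetrant dominant disease and that the affected parent is known to carry marker allele "M" on the same chromosome as the disease allele and allele "m" on the other chromosome. The function name recombination_fraction is hypothetical.

```python
def recombination_fraction(offspring):
    """offspring is a list of (marker_allele, affected) pairs, one per
    child, where marker_allele is the allele inherited from the
    affected parent. A child is a recombinant if it received "M"
    without the disease or the disease without "M"."""
    if not offspring:
        raise ValueError("need at least one offspring")
    recombinants = sum(
        1
        for marker_allele, affected in offspring
        if (marker_allele == "M") != affected
    )
    return recombinants / len(offspring)


# Invented data for two markers scored in the same ten children.
near_marker = [("M", True)] * 5 + [("m", False)] * 4 + [("m", True)] * 1
far_marker = (
    [("M", True)] * 3 + [("m", False)] * 3
    + [("M", False)] * 2 + [("m", True)] * 2
)

near_fraction = recombination_fraction(near_marker)
far_fraction = recombination_fraction(far_marker)
print(near_fraction, far_fraction)
```

The marker with the lower recombination fraction is inherited with the disease more often and so is likely to lie closer to the disease gene. Real linkage studies use many more offspring and statistical tests before drawing that conclusion.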
A related use of polymorphism is widely employed in agriculture. If a polymorphism can be identified that is associated with a desirable characteristic in an agriculturally important plant or animal, then this polymorphism can be used as a genetic flag to identify individuals that have the desirable characteristic. Using this technique, known as marker-assisted selection, breeding programs aimed at improving agriculturally important plants and animals can be made more efficient, since individuals that have the desired trait can be identified before the trait becomes apparent.
Polymorphisms can be used to illuminate fundamental biological patterns and processes. By studying polymorphisms in a group of wild animals, the familial relationships (brother, sister, mother, father, etc.) between them can be determined. Also, the amount of interbreeding between different groups of the same species (gene flow) can be estimated by studying the polymorphisms they contain. This information can be used to identify unique populations that may be important for survival of the species. Sometimes it is not immediately obvious if two different groups of organisms should be classified as different species. Comparing the genetic polymorphisms in the two groups aids in making a judgment as to whether they warrant classification as different species.
If enough polymorphisms are analyzed, it is possible to distinguish between individual humans with a high degree of confidence. This method is known as DNA profiling (or DNA fingerprinting) and provides an important tool in law enforcement. A person's genotype, or DNA profile, can be determined from very small samples, such as those that may be left at a crime scene (hair, blood, skin cells, etc.). The genotype of samples found at the crime scene can then be compared to a suspect's genotype. If they match, it is very likely that the crime scene sample came from the suspect. Currently, the FBI uses thirteen different polymorphic loci for DNA fingerprinting. In a similar manner, analysis of polymorphisms can help prove or disprove fatherhood (paternity) in cases where responsibility for a child is disputed.
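Why more loci give more confidence can be shown with a small calculation. The sketch below compares two profiles and then estimates how often a random person would share that profile. It assumes that alleles combine at random within a locus (Hardy-Weinberg proportions) and that loci are inherited independently. The locus names, allele repeat counts, and allele frequencies are all invented for illustration, as are the function names.

```python
from math import prod


def profiles_match(profile_1, profile_2):
    """Two profiles match if they cover the same loci and carry the
    same pair of alleles at each one (order within a pair is ignored)."""
    return profile_1.keys() == profile_2.keys() and all(
        sorted(profile_1[locus]) == sorted(profile_2[locus])
        for locus in profile_1
    )


def random_match_probability(profile, allele_freqs):
    """Multiply the expected genotype frequencies across loci:
    p*p for a homozygote, 2*p*q for a heterozygote."""
    def genotype_frequency(locus, alleles):
        a, b = alleles
        p, q = allele_freqs[locus][a], allele_freqs[locus][b]
        return p * p if a == b else 2 * p * q

    return prod(genotype_frequency(locus, al) for locus, al in profile.items())


# Invented profiles: each locus maps to a pair of repeat counts.
evidence = {"locus_A": (12, 14), "locus_B": (9, 9), "locus_C": (17, 21)}
suspect = {"locus_A": (14, 12), "locus_B": (9, 9), "locus_C": (21, 17)}

# Invented population allele frequencies (not real data).
allele_freqs = {
    "locus_A": {12: 0.10, 14: 0.20},
    "locus_B": {9: 0.30},
    "locus_C": {17: 0.25, 21: 0.05},
}

is_match = profiles_match(evidence, suspect)
rmp = random_match_probability(evidence, allele_freqs)
print(is_match, rmp)
```

With these invented numbers, three loci already bring the chance of a coincidental match down to about 9 in 100,000. Each additional independent locus multiplies the probability by another small factor, which is why a profile built from many loci can distinguish individuals so reliably.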
see also Gel Electrophoresis; Linkage and Recombination; Mapping; Mutation; Repetitive DNA Elements.
R. John Nelson