Genetic mapping is the process of measuring the distance between two or more loci on a chromosome. In order to determine this distance, a number of things must be done. First, the loci (pronounced "low-sigh") have to be known, and alleles have to exist at each locus so that they can be observed. The specific pair of alleles that is present is usually referred to as a genotype. Second, there has to be a way to measure the distance between the loci.
In genetic mapping, this distance is measured by the amount of meiotic recombination that occurs between the two loci. Meiotic recombination is the process in which the two chromosomes that are paired during meiosis each break apart and then reattach to each other, rather than back to themselves. These recombined chromosomes will end up in either eggs (for women) or sperm (for men).
Typically for any chromosome pair there will be only one or two such breaks per chromosome arm. The closer together two loci are, the less likely it is that such a break will occur. Thus, counting the number of breaks between two loci provides a good estimate of how far apart two loci are.
Genetic maps provide the order and distance between many markers all along the chromosome. In genetic maps, the loci that are used are called marker loci. Marker loci are almost always not in genes and serve only as signposts along the chromosome, "marking" a specific location. Thus genetic maps act much like road maps, and markers act much like mile markers or exit signs.
Why Create and Use Maps?
Genetic maps contain very important information and are used to help find the genes that can cause, or change the risk of developing, genetic diseases. For most diseases, the gene is not yet known and could be any one of the 30,000 to 70,000 genes that exist in the human genome. Since the disease gene is not known, its location is also not known. However, if the general location could be determined, then it would be much easier to figure out which of the genes near that location are the actual disease genes.
Genetic maps are very important for "disease-gene discovery," as they provide the reference locations for locating the disease gene. Finding the disease genes without a genetic map would be like trying to find a town by driving down a road without any mile markers or exit signs. There would be no clues as to where you are. The maps make it much easier to "navigate" the chromosomes.
Using Recombination and Map Functions
Genetic maps are created by measuring the amount of recombination that occurs between two or more loci. The easiest way to do this is to use families with a large number of children, since this provides a large number of recombination events to look at. Scientists have collected a panel of forty such families, called the CEPH families (pronounced "sef," from the French Centre d'Étude du Polymorphisme Humain—the Center for the Study of Human Polymorphisms). These families are measured (genotyped) for the variations at each locus, and the inheritance of each allele at each locus is compared.
An example of a CEPH family is shown in Figure 1. Using the father as an example (although in other families this could easily occur in the mother), allele a at locus 1 and allele b at locus 2 are always inherited together. Similarly, allele A at locus 1 and allele B at locus 2 are inherited together. There has been no recombination between locus 1 and locus 2, and therefore these loci are likely to be close together. In contrast, allele a at locus 1 and allele c at locus 3 are only inherited together half the time. There have been several recombination events between them, and therefore these loci are likely to be far apart.
The actual distance between two loci is measured using the recombination fraction, which is just the number of recombination events divided by the total number of events that are looked at. In the family diagrammed in the figure, the recombination fraction between locus 1 and locus 2 is 0 recombination events divided by 8 total events, or 0 ÷ 8 = 0. The recombination fraction between locus 1 and locus 3 is 4 recombination events divided by 8 total events, or 4 ÷ 8 = 0.50. Recombination fractions can vary between 0.00 and 0.50. To generate a complete genetic map of a chromosome, a large number of markers (between 50 and 200, depending on the size of the chromosome) are genotyped in many families, and more complex statistical analyses are used to compare the inheritance across all markers.
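The arithmetic above can be sketched in a few lines of Python. The figure is not reproduced here, so the haplotype lists below are hypothetical stand-ins for the alleles the father passed to his eight children, and the function names are illustrative only.

```python
def count_recombinants(transmitted, parental):
    """Count transmitted haplotypes that match neither parental haplotype."""
    return sum(1 for hap in transmitted if hap not in parental)


def recombination_fraction(recombinants, total):
    """Recombination events divided by the total number of events examined."""
    if total <= 0:
        raise ValueError("total must be positive")
    return recombinants / total


# Hypothetical data: the father's two chromosomes carry a-b and A-B
# at loci 1 and 2, and a-c and A-C at loci 1 and 3.
loci_1_2 = [("a", "b"), ("A", "B")] * 4                           # no recombinants
loci_1_3 = [("a", "c"), ("A", "C"), ("a", "C"), ("A", "c")] * 2   # half recombinant

theta_12 = recombination_fraction(
    count_recombinants(loci_1_2, {("a", "b"), ("A", "B")}), len(loci_1_2))
theta_13 = recombination_fraction(
    count_recombinants(loci_1_3, {("a", "c"), ("A", "C")}), len(loci_1_3))
print(theta_12, theta_13)
```

With these assumed haplotypes the two fractions come out to 0 out of 8 and 4 out of 8, matching the values worked out in the text.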
There is an additional complication in the analysis of recombination events. The further apart two loci are, the more likely it is that two recombination events could occur between them. The first event will shuffle the alleles, but the second event will reshuffle the alleles back to the way they were. Thus it will look like there were no recombination events when in fact there were two.
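A minimal sketch of why double events go unnoticed: only an odd number of crossovers between two loci leaves their alleles visibly reshuffled. The positions and the function name below are hypothetical and use arbitrary units along the chromosome.

```python
def looks_recombinant(crossovers, locus_a, locus_b):
    """Return True if the alleles at the two loci appear reshuffled.

    crossovers: positions of crossover events along the chromosome.
    Each event between the loci flips the arrangement, so an even
    number of events restores the original (non-recombinant) pattern.
    """
    left, right = sorted((locus_a, locus_b))
    between = [x for x in crossovers if left < x < right]
    return len(between) % 2 == 1


print(looks_recombinant([30], 10, 90))      # one event between the loci
print(looks_recombinant([30, 70], 10, 90))  # two events cancel each other
```

The first call reports a recombinant, while the second reports none even though two events took place, which is exactly the undercounting described above.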
Another complication arises from the fact that the occurrence of one recombination event on a chromosome tends to inhibit the occurrence of a second recombination event, especially in regions close to the first one. This is called "interference" and will generally make the map smaller. To account for this, "map functions" have been created that are used to better estimate the true recombination distance between two markers.
Map functions are mathematical equations that are based on assumptions about how much recombination and how much interference exists on a chromosome. Map function distances are measured in units called centimorgans, named for Thomas Hunt Morgan, the first person to develop the techniques of genetic mapping. There are several map functions that have been proposed. Each is named for its originator. The most commonly used map function is the Haldane map function (named after John Burdon Sanderson Haldane), which assumes that there is no interference between loci. A second map function, the Kosambi map function (named after Damodar Kosambi), assumes a moderate level of interference and seems to more accurately reflect experimental data. Thus the map function converts the observed recombination fraction into a map distance. Generally the recombination fraction and the map distance are very similar for recombination fractions from 0.00 to 0.10, that is, for distances up to about 10 centimorgans.
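The two map functions named above can be written out as a short Python sketch. The formulas are the standard textbook forms of the Haldane and Kosambi functions, which the article itself does not spell out; the function names are illustrative, and r stands for the recombination fraction.

```python
import math


def _check(r):
    """Recombination fractions are only defined between 0.00 and 0.50."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5]")


def haldane_cm(r):
    """Haldane map distance in centimorgans (assumes no interference)."""
    _check(r)
    if r == 0.5:
        return math.inf  # unlinked loci: distance is unbounded
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance in centimorgans (assumes moderate interference)."""
    _check(r)
    if r == 0.5:
        return math.inf
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


for r in (0.01, 0.10, 0.30):
    print(f"r={r:.2f}  Haldane={haldane_cm(r):.2f} cM  Kosambi={kosambi_cm(r):.2f} cM")
```

For small recombination fractions both functions return a distance close to 100 times r, and they diverge as r grows: at r = 0.30 the Haldane distance is roughly 46 centimorgans and the Kosambi distance roughly 35.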
Types of Markers, and Their Advantages and Disadvantages
There are four major kinds of genetic markers that have been used for genetic mapping. The oldest of these is the restriction fragment length polymorphism (RFLP), which was first proposed for genetic mapping in 1980. RFLPs arise from changes in a single base pair that can be detected by restriction endonuclease enzymes. These enzymes can cut the DNA at that locus if the right base pair is present. Many maps were made with these markers, but they are expensive and time-consuming to genotype, and they generally have only two alleles. Having only two alleles means that in many cases it is impossible to tell the two chromosomes in any person apart for that marker, which makes that marker useless for genetic mapping in that family. In the figure, the mother of the eight children has the same alleles at locus 1, the same alleles at locus 2, and the same alleles at locus 3. Thus we cannot tell if there have been any recombination events coming from the mother. RFLPs were the first type of marker known to occur almost everywhere, across all the chromosomes.
Variable number of tandem repeat (VNTR) markers were the next markers to be described. These result from the duplication of DNA sequences consisting of 50 to 5,000 base pairs each. The differences between the two homologous chromosomes are in the number of repeats present (and thus the length of the locus). These markers are expensive and time consuming to genotype but have the advantage of having many alleles (often more than twenty). Thus almost everyone in the world has a different allele on each paired chromosome at a VNTR locus. This allows more families to give recombination information. Having so many alleles, however, can cause problems, because it can be hard to tell many of the alleles apart during genotyping. VNTRs also tend to occur most often at the ends of chromosomes, not in the middle. This is unlike RFLPs, which occur at all locations on a chromosome.
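One way to put a number on why markers with many alleles help more families contribute is the expected heterozygosity, the chance that a person carries two different alleles at a locus. This quantity is not used in the article; it is a standard population-genetics measure brought in here for illustration, and the sketch assumes random mating and hypothetical, equally frequent alleles.

```python
def expected_heterozygosity(allele_freqs):
    """Probability that two randomly drawn alleles differ: 1 - sum(p_i^2)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Hypothetical frequencies: a two-allele marker (RFLP-like) versus
# a twenty-allele marker (VNTR-like), all alleles equally common.
two_allele = expected_heterozygosity([0.5, 0.5])
twenty_allele = expected_heterozygosity([0.05] * 20)
print(f"{two_allele:.2f} {twenty_allele:.2f}")
```

Under these assumptions only about half of all parents would be informative for the two-allele marker, against about 95 percent for the twenty-allele marker.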
Microsatellite markers—also known as simple tandem repeat polymorphisms (STRPs), simple sequence repeats (SSRs), or simple sequence length polymorphisms (SSLPs)—have become the most common type of marker for genetic maps. These markers are made of repeats of two, three, or four base pairs, with the variation being the number of repeats. For example, the most commonly used two-base-pair repeat is CA, and the most commonly used four-base-pair repeat is GATA. Thus a microsatellite marker actually varies in length between the paired chromosomes. On one chromosome, there might be eight repeats (CACACACACACACACA), while on the other chromosome there might be ten (CACACACACACACACACACA). Microsatellite markers are easy to genotype and have multiple (three to ten) but usually not large numbers (more than ten) of alleles. They also occur almost everywhere across the chromosome. Most of the genetic maps in use today are made with microsatellite markers.
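The length difference between microsatellite alleles can be sketched as a repeat count. The flanking bases and the function name below are hypothetical; real genotyping measures fragment length in the laboratory rather than scanning sequence text.

```python
import re


def longest_repeat_count(sequence, unit):
    """Return the number of copies in the longest tandem run of `unit`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = (m.group(0) for m in pattern.finditer(sequence))
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical alleles: the same flanking bases around 8 or 10 CA repeats.
allele_1 = "GGT" + "CA" * 8 + "TTG"
allele_2 = "GGT" + "CA" * 10 + "TTG"
print(longest_repeat_count(allele_1, "CA"), longest_repeat_count(allele_2, "CA"))
```

A person carrying these two alleles would have a genotype of 8 and 10 repeats, so the two chromosomes can be told apart.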
The most recently described type of marker is the single nucleotide polymorphism (SNP, pronounced "snip"). As the name implies, these are variations at a single base on the chromosome. For example, on some chromosomes a locus might have a C, while on other chromosomes the same locus might have a T. These are the most common markers, with at least three million already described, and seem to occur across the entire genome. As with RFLPs, there are almost always only two alleles at a SNP locus. Individually they suffer the same problem as RFLPs of not being useful in many of the families. They are being used widely now because they are very easy to genotype, are very common (occurring at least ten times more frequently than the other types of markers) and thus can be used in combination with each other.
History of Genetic Mapping
The technique of genetic mapping was first described in 1911 by Thomas Hunt Morgan, who was studying the genetics of fruit flies. Morgan was able to study genetic mapping because he was able to actually see traits in the flies (like having white eyes instead of red) that were caused by mutations in single genes. He noticed that some traits violated Gregor Mendel's Law of Independent Assortment (which said that any two loci would segregate independently and thus have a recombination fraction of 0.50).
Genetic mapping did not start being applied to humans until the 1950s, because it was hard to know what traits were caused by genetic mutations. When RFLPs were first described in 1980, a large effort was undertaken to generate maps of all the chromosomes. The first such maps were made in the early 1980s but covered only parts of chromosomes and had only a few markers. Maps of whole chromosomes were made by the late 1980s. By the mid-1990s, as the abilities of the research teams improved, and as the statistical methods of analysis were refined, a number of whole-genome (i.e., covering all the chromosomes) genetic maps were generated. These maps were updated and improved, and they were made available on the Internet.
The Comparison of Genetic and Physical Distance
Genetic maps are a measure of distance based on recombination, which is a biological process. A different way of measuring the distance between two loci is to measure the actual number of base pairs between the loci. This is known as the physical distance, and, when many such distances are put together, it makes a physical map.
Genetic maps and physical maps are similar in that the loci will be in the same order. There is also a general correspondence of distance, in that bigger genetic distances usually correspond to bigger physical distances. The overall rule of thumb is that one centimorgan of genetic distance is about one million base pairs of physical distance. However, this comparison can vary dramatically across certain parts of chromosomes. In some areas, one centimorgan might be only 50,000 base pairs (e.g., at the ends of chromosomes, where recombination seems to be increased). In other chromosomal areas (e.g., near the centromere), one centimorgan might be five million base pairs.
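The rule of thumb and its regional exceptions amount to a simple multiplication, sketched below. The conversion rates are the figures quoted above; the function name and the 5-centimorgan interval are hypothetical, and the result is only a rough estimate.

```python
def approx_physical_bp(genetic_cm, bp_per_cm=1_000_000):
    """Rough physical size (base pairs) of an interval given in centimorgans."""
    if genetic_cm < 0 or bp_per_cm <= 0:
        raise ValueError("distance must be non-negative and rate positive")
    return genetic_cm * bp_per_cm


interval_cm = 5  # hypothetical interval
typical = approx_physical_bp(interval_cm)                               # genome-wide rule of thumb
near_end = approx_physical_bp(interval_cm, bp_per_cm=50_000)            # high-recombination region
near_centromere = approx_physical_bp(interval_cm, bp_per_cm=5_000_000)  # low-recombination region
print(typical, near_end, near_centromere)
```

The same 5 centimorgans corresponds to about 5 million base pairs on average, but to only 250,000 base pairs near a chromosome end and to 25 million near the centromere.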
see also Crossing Over; Gene Discovery; Linkage and Recombination; Meiosis; Morgan, Thomas Hunt; Polymorphisms; Repetitive DNA Elements.
Jonathan L. Haines