Linkage and Recombination

Linkage refers to the association and co-inheritance of two DNA segments because they reside close together on the same chromosome. Recombination is the process by which they become separated during crossing over, which occurs during meiosis. The existence of linkage and the frequency of recombination allow chromosomes to be mapped to determine the relative positions and distances of the genes and other DNA sequences on them. Linkage analysis is also a key tool for discovering the location and ultimate identity of genes for inherited diseases.

Basic Concepts

Each individual inherits a complete set of twenty-three chromosomes from each parent, and chromosomes are therefore present in homologous pairs. The members of a pair carry the same set of genes at the same positions, or loci. The two copies of a gene at a particular locus may be identical or slightly different. The different forms of a gene are called alleles.

Genes or loci can be linked either physically or genetically. Genes that are physically linked are on the same chromosome and are thus syntenic. Only syntenic genes can be genetically linked. Genes that are linked genetically are physically close enough to one another that they do not segregate independently during meiosis.

Understanding independent segregation is crucial to understanding linkage. Independent segregation was first discovered by Gregor Mendel, who found that, in pea plants, the different forms of two traits found in the parents, such as color and height, could occur in all possible combinations in the offspring. Thus, a tall parent with green pods crossed with a short parent with yellow pods could give rise to offspring that were tall with yellow pods or short with green pods, as well as some of each parental type. Mendel concluded that the factors controlling height segregated independently from the factors controlling pod color. Later work showed that this was because these genes occurred on separate (nonhomologous) chromosomes, which themselves segregate independently during meiosis.

How is it possible for physically linked genes to nonetheless segregate independently? The answer lies in the events of crossing over. During crossing over, homologous chromosomes exchange segments at several sites along their length, in a process called recombination. Thus, two loci at distant ends of the chromosome are almost certain to have at least one exchange point occur between them. If only one exchange occurs, two alleles that began on the same chromosome will end up on different chromosomes. If there are two exchange points between them, they will end up together; if three, they end up apart, and so on. Over long distances, the likelihood of two alleles remaining together is only 50 percent, no better than chance, and, therefore, loci that are far apart on a large chromosome are not genetically linked. Conversely, loci that are close together will not segregate independently, and are therefore genetically linked. It is these that are most useful for mapping and discovering disease genes.
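The odd-versus-even argument above can be sketched numerically. The snippet below is only an illustration under an assumption that the article does not state: that the number of exchanges between two loci follows a Poisson distribution with no interference between crossovers (the model behind Haldane's map function). The function name and the example distances are hypothetical.

```python
import math


def recombination_fraction(map_distance_morgans: float) -> float:
    """Probability of an odd number of exchanges between two loci.

    Assumes (not from the article) that exchanges are Poisson-distributed
    with mean equal to the map distance in Morgans. Two alleles end up
    apart only when the number of exchanges between them is odd.
    """
    if map_distance_morgans < 0:
        raise ValueError("map distance cannot be negative")
    return 0.5 * (1.0 - math.exp(-2.0 * map_distance_morgans))


# Illustrative distances only: close loci recombine rarely, while distant
# loci approach (but never exceed) 50 percent.
for distance in (0.01, 0.1, 0.5, 1.0, 3.0):
    fraction = recombination_fraction(distance)
    print(f"{distance:>4} Morgans: recombination fraction {fraction:.3f}")
```

Under this model the recombination fraction levels off at 50 percent, which matches the statement that loci far apart on a large chromosome behave as if they were unlinked.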

The loci examined in linkage analysis need not be genes of functional significance; indeed, anonymous segments of DNA (stretches of DNA with no known function) called genetic markers are often more useful in genetic linkage analysis. In order for a genetic marker to be of benefit in a linkage analysis, the chromosomal location of the marker must be known and, most importantly, there must be some variation in the sequence or length of these markers among individuals. Nongene markers used in linkage analysis are classified into four broad categories: restriction fragment length polymorphisms (RFLPs), variable number of tandem repeats (VNTRs), short tandem repeat polymorphisms (STRPs), and single nucleotide polymorphisms (SNPs).

Calculating Linkage and Map Distance

As noted above, when genes are not genetically linked, alleles at the loci segregate independently from one another. So, if locus 1 has alleles A and a, and if locus 2, not linked to locus 1, has alleles B and b, then four gametes can be formed (AB, Ab, aB, and ab). Each of these four will occur with equal frequency (a 1:1:1:1 ratio), and all possible offspring combinations are expected with equal frequency.
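As a small illustration of the unlinked case, the four gamete types and their expected frequencies can be enumerated directly. The function name is hypothetical, and the sketch assumes a parent heterozygous at both loci.

```python
from itertools import product


def gametes_unlinked(locus1=("A", "a"), locus2=("B", "b")):
    """Expected gamete frequencies for two unlinked loci.

    Every allele at locus 1 pairs with every allele at locus 2, and with
    independent segregation each combination is equally likely.
    """
    combos = ["".join(pair) for pair in product(locus1, locus2)]
    frequency = 1.0 / len(combos)
    return {gamete: frequency for gamete in combos}


print(gametes_unlinked())
# {'AB': 0.25, 'Ab': 0.25, 'aB': 0.25, 'ab': 0.25}
```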

If locus 1 and locus 2 are genetically linked to one another, however, deviations from this 1:1:1:1 ratio will be observed. If A and B begin on the same chromosome, then AB and ab will be more common than either aB or Ab. By counting the number of each type and determining the extent of this deviation, one can estimate the extent of recombination between the two loci: A large deviation means little recombination. The "recombination fraction," expressed as a percentage, is an indirect measure of the distance between the loci and is the basis for the development of genetic maps.
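The counting step described above can be sketched as follows. This is a minimal example that assumes the parental combinations (here AB and ab) are known in advance; the function name and the gamete counts are made up for illustration and are not data from any study.

```python
def estimate_recombination_fraction(counts, parental=("AB", "ab")):
    """Estimate the recombination fraction from gamete counts.

    counts   -- mapping of gamete type to number observed
    parental -- the two combinations that began on the same chromosome
                (assumed known here)
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no gametes were counted")
    recombinants = sum(n for gamete, n in counts.items() if gamete not in parental)
    return recombinants / total


# Hypothetical counts: the parental types AB and ab dominate.
observed = {"AB": 45, "ab": 43, "Ab": 7, "aB": 5}
theta = estimate_recombination_fraction(observed)
print(f"Estimated recombination fraction: {theta:.2%}")
# Estimated recombination fraction: 12.00%
```

With unlinked loci the same calculation would give a value near 50 percent, since the two recombinant classes would be as common as the two parental classes.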

Genetic maps order polymorphic markers by specifying the amount of recombination between markers, whereas physical maps quantify the distances among markers in terms of the number of base pairs of DNA. Although mapping in humans has a relatively recent history, the idea of a linear arrangement of genes on a chromosome was first proposed in 1911 by Thomas Hunt Morgan, who was studying the fruit fly, Drosophila melanogaster. The possibility of a genetic map was first formally investigated in 1913 by the American geneticist Alfred H. Sturtevant, who determined the order of five markers on the X chromosome in D. melanogaster and then estimated the relative spacing among them.

For small recombination fractions (usually less than 10 percent to 12 percent), the estimate of the recombination fraction provides a very rough estimate of the physical distance. One percent recombination is defined as one centimorgan and, in general, corresponds to about one million base pairs of DNA. Physical measurements of DNA are often described in kilobases (thousands of base pairs). Crossing over does not occur equally at all locations, so estimates of distance from physical and genetic maps of the identical region may vary dramatically throughout the genome.
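The rule of thumb above amounts to a simple conversion, sketched here. The constant reflects only the approximate genome-wide figure given in the text; the function name and the 12 percent cutoff used as a guard are illustrative choices, and the result should be read as an order-of-magnitude guide because recombination rates vary by region.

```python
# Rough average from the text: 1 percent recombination (1 centimorgan)
# corresponds to about one million base pairs.
BASE_PAIRS_PER_CENTIMORGAN = 1_000_000


def approx_physical_distance_bp(recombination_percent: float) -> float:
    """Very rough physical distance implied by a small recombination fraction."""
    if not 0 <= recombination_percent <= 12:
        raise ValueError("approximation is only meaningful for small fractions")
    return recombination_percent * BASE_PAIRS_PER_CENTIMORGAN


distance_bp = approx_physical_distance_bp(2.5)
print(f"about {distance_bp:,.0f} base pairs ({distance_bp / 1000:,.0f} kilobases)")
# about 2,500,000 base pairs (2,500 kilobases)
```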

Statistical Approaches

In experimental organisms, genetic mapping of loci involves counting the number of recombinant and nonrecombinant offspring of selected matings. Genetic mapping in humans is usually more complicated than in experimental organisms for many reasons, including researchers' inability to design specific matings of individuals, which limits the unequivocal assignment of recombinants and nonrecombinants. Therefore, maps of markers in humans are developed by means of one of several statistical algorithms used in computer programs.

Genetic maps can assume equal recombination between males and females, or they can allow for sex-specific differences in recombination, since it has been well established that there are substantial differences in recombination frequencies between men and women. Chromosomes recombine more often in females. On average, the female map is two times as long as the male map.

The complexity of the underlying statistical methods used to generate them renders genetic maps sensitive to marker genotyping errors, particularly in small intervals, and these maps are less useful in regions of less than about 2 centimorgans. While marker order is usually correct, genotyping errors can result in falsely inflated estimates of map distances.

Disease gene mapping is greatly facilitated by the availability of dense genetic maps. Linkage analysis for the mapping of disease genes boils down to the simple idea of counting recombinants and nonrecombinants, but in humans this process is complicated for a variety of reasons. The generation time is long in humans, so large, multigenerational pedigrees in which a disease or trait is segregating are rare. Scientists cannot dictate matings or exposures. They also cannot require that specific individuals participate in a study. Thus the process of linkage analysis in humans requires a statistical framework in which various hypotheses about the linkage of a trait locus and marker locus can be considered. How far apart are the disease and marker, and how certain is the conclusion of linkage?

When the inheritance pattern for a disease is clearly known (e.g., autosomal dominant, sex-linked, etc.), the genetic data can be treated with a statistical approach that determines the likelihood that the gene is linked to a particular marker, at a particular position on a specific chromosome. This approach is often termed the "lod score approach," where "lod" is short for logarithm of the odds.

Lod score linkage analysis is used most frequently to consider diseases that follow a Mendelian pattern of transmission within families. Positive lod scores, especially those greater than 3.0, suggest evidence for linkage between a disease gene and a marker locus. Negative lod scores suggest that the disease gene and marker locus are unlinked to one another.
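For the simplest situation, a lod score can be computed directly from counts of recombinants and nonrecombinants. The sketch below assumes fully informative meioses in which every offspring can be classified unambiguously, which is an idealization that real human pedigrees rarely meet; actual analyses rely on dedicated linkage software. The function name and the counts are hypothetical.

```python
import math


def lod_score(recombinants: int, nonrecombinants: int, theta: float) -> float:
    """Log10 odds of linkage at recombination fraction theta versus no linkage.

    Compares the likelihood of the observed counts at theta with their
    likelihood at theta = 0.5 (independent segregation).
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be greater than 0 and at most 0.5")
    total = recombinants + nonrecombinants
    log_likelihood_linked = (
        recombinants * math.log10(theta)
        + nonrecombinants * math.log10(1.0 - theta)
    )
    log_likelihood_unlinked = total * math.log10(0.5)
    return log_likelihood_linked - log_likelihood_unlinked


# Hypothetical family data: 1 recombinant among 10 informative meioses.
for theta in (0.05, 0.1, 0.2, 0.3, 0.5):
    print(f"theta = {theta:.2f}: lod = {lod_score(1, 9, theta):.2f}")
```

In this made-up example the score is highest near theta = 0.1, the observed proportion of recombinants, and falls to zero at theta = 0.5. Ten meioses alone do not reach the 3.0 threshold, which is one reason large or multiple families are needed.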

see also Crossing Over; Gene Discovery; Human Disease Genes, Identification of; Mapping; Meiosis; Morgan, Thomas Hunt; Polymorphisms.

Marcy C. Speer

