In the early 1860s, Gregor Mendel developed the concept of the gene to help explain results obtained while crossbreeding strains of garden peas. He identified physical characteristics (phenotypes), such as plant height and seed color, that could be passed on, unchanged, from one generation to the next. The hereditary factor that predicted the phenotype was termed a "gene." Mendel hypothesized that genes were inherited in pairs, one from the male and one from the female parent. Plants that bred true (homozygotes) had inherited identical genes from their parents, whereas plants that did not breed true (hybrids, or heterozygotes) inherited alternative copies of the genes (alleles) from one parent that were similar, but not identical, to those from the other parent.
Some of these alleles had a greater effect on the phenotypes of hybrids than others. For example, if a single copy of a given allele was sufficient to produce the same phenotype seen in homozygous organisms, that gene was termed a "dominant." Conversely, if the allele could only be detected in the minority of the offspring of hybrid parents that were homozygous for that "weaker" allele, the gene was termed a "recessive." Based on these observations, Mendel formulated a series of laws that are the basis of what we now term "Mendelian" inheritance patterns.
The "law of unit inheritance" holds that factors retain their identity from generation to generation and do not blend in the hybrid. The "law of segregation" states that two members (alleles) of a single pair of genes are never found in the same mature sperm or ovum (gamete) but always separate out (segregate). Finally, the "law of independent assortment" holds that members of different pairs of genes (nonalleles) are sorted out (assort) independently to different gametes.
Almost a century later, in 1953, Watson and Crick solved the structure of the DNA molecule and helped explain how this genetic information could be encoded in a polymer, deoxyribonucleic acid (DNA), which was found in the nucleus of the cell. They demonstrated that DNA is a double-stranded polymer consisting of two linear arrays of diverse purine (adenine [A] and guanine [G]) and pyrimidine (thymine [T] and cytosine [C]) bases. Each purine or pyrimidine on one strand pairs with a complementary base (A:T and G:C) on the other strand. Each strand is thus complementary to the other. The two antiparallel polynucleotide strands are gently twisted to form what is termed a "double helix."
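The pairing rule described above (A with T, G with C, on antiparallel strands) means that either strand fully determines the other. A minimal Python sketch of that idea follows; the function name `complementary_strand` and the sample sequence are illustrative assumptions, not taken from the source.

```python
# Watson-Crick pairing as stated in the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand: str) -> str:
    """Return the partner strand, written in its own 5'->3' direction.

    The two strands are antiparallel, so the partner is the base-by-base
    complement of the input read in the opposite direction.
    """
    return "".join(PAIRS[base] for base in reversed(strand.upper()))


if __name__ == "__main__":
    example = "ATGCCG"  # hypothetical sequence, for illustration only
    print(example, "pairs with", complementary_strand(example))
```

Applying the function twice returns the original sequence, which is one way of seeing that each strand carries the same information as its partner.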
In humans, the nucleus of each somatic cell contains twenty-three pairs of chromosomes, which are formed by tightly coiled DNA strands. Twenty-two of these pairs are found in the cells of both men and women. These chromosomes are termed "autosomes," and they are numbered by size from 1 (the largest) to 22 (the smallest). The twenty-third pair of chromosomes determines the sex of the individual, and these two chromosomes are thus termed the "sex chromosomes." Women have a pair of X chromosomes, whereas men have a single X chromosome, which they inherit from their mother, and a single Y chromosome, which they inherit from their father. The Y chromosome is dominant for maleness.
During "mitosis," the DNA double strand is unwound and split apart. Each individual strand is then duplicated. By making copies of each DNA strand, a parental cell can transmit a complete set of genetic information into each of its two daughter cells.
Gametes result from "meiosis," which differs from mitosis in two ways. First, allelic chromosomes are paired prior to their duplication. Second, there are two sets of divisions before the final product, the gamete, is created. In the first set of divisions after DNA duplication, allelic chromosomes, rather than chromatids, segregate into the daughter cells. In the second set of divisions, the chromatids separate and segregate into the gamete. Thus, one and only one copy of each allelic pair is contributed to the gamete. In this way, a "diploid" germ cell gives rise to a "haploid" sperm or egg that contains an assortment of one of each of the twenty-three pairs of allelic chromosomes in the parental cell. During fertilization, a sperm and an egg unite to create a zygote with a newly constituted complete set of forty-six chromosomes. These fundamental properties of DNA and cell division are the basis of Mendel's laws of unit inheritance, segregation, and independent assortment.
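To give a sense of scale for independent assortment, the sketch below counts how many different chromosome combinations a single parent could place in a gamete if each of the twenty-three pairs contributes one member at random. It deliberately ignores crossing over, so it is a lower bound on the real diversity; the labels "M" and "P" (maternal and paternal homolog) and the function names are assumptions made for the example.

```python
import random

CHROMOSOME_PAIRS = 23  # number of chromosome pairs in a human cell, per the text


def possible_gametes(pairs: int = CHROMOSOME_PAIRS) -> int:
    """Count distinct assortments when each pair contributes one of its two members."""
    return 2 ** pairs


def make_gamete(pairs: int = CHROMOSOME_PAIRS, rng=random) -> list[str]:
    """Choose the maternal ('M') or paternal ('P') homolog of each pair independently."""
    return [rng.choice("MP") for _ in range(pairs)]


if __name__ == "__main__":
    print(possible_gametes())        # 2 to the 23rd power
    print("".join(make_gamete()))    # one randomly assorted haploid set
```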
The central dogma of molecular genetics holds that each gene encodes one polypeptide, forming a monomeric protein. The portion of the gene that specifies the polypeptide sequence is termed "coding" DNA. Each human cell contains approximately 3.9 × 10^9 base pairs of DNA per haploid genome, which is enough to encode about 1 million polypeptides of average length. However, there are approximately 35,000 structural genes (possibly closer to 30,000) in humans; thus more than 90 percent of DNA does not encode peptide sequences. The DNA that does not code for protein, termed "noncoding" DNA, is often involved in the regulation of gene expression. Noncoding DNA can also play a structural role. Structural functions include providing structural stability for the chromosome (e.g., matrix-associated regions, or MARs), providing the specialized sequences that define the ends of the chromosome (telomeres), and providing a site to which the cellular cytoskeleton can be attached in order to allow the movement of chromosomes during meiosis and mitosis (centromeres). Approximately 10 percent of cellular DNA consists of a repetitive sequence that has been randomly inserted throughout the genome. Although the function of this repetitive DNA is unknown, its presence has proven useful for gene mapping studies.
Genetic information proceeds in a stepwise fashion from the sequence of a gene to the synthesis of a polypeptide. Located near the coding sequence of the gene are sequences, called DNA control regions, that identify the transcription start site (promoters), mark the tissue in which it will be expressed (enhancers), and control the use of batteries of genes during ontogeny (locus control regions). The regions of DNA that specify the sequence of a polypeptide chain, or structural genes, are organized into discrete units (exons) that are separated by noncoding sequences (introns). The first step in synthesizing a new protein occurs in the nucleus, where the sequence of the coding DNA is copied (transcribed) into ribonucleic acid (RNA), a less stable nucleic acid that can be rapidly degraded. The ends of the RNA are modified to help stabilize the final product and the introns are removed, or spliced out, generating messenger ribonucleic acid (mRNA). The mRNA is transported from the nucleus to the cytoplasm, where it is translated by ribosomes into polypeptide strands.
Ribosomes read the sequence of the mRNA in sequential groups of three nucleotides, or triplets, each termed a codon. There are sixty-four different combinations (e.g., AAA, UUU, CAC), all but three of which specify an amino acid. Each codon specifies a single amino acid, but amino acids can be encoded by more than one codon, so there is considerable degeneracy in the code. Translation begins when the mRNA is bound to the ribosome. Transfer RNA (tRNA), an adapter molecule, contains a complementary triplet anticodon at one end and an amino acid bound to the other end. The tRNA anticodon binds to the mRNA codon and helps stabilize the interaction with the ribosome. Each ribosome has two sites where the tRNA can bind. Binding of the downstream tRNA, which contains a sequence complementary to the next three-nucleotide codon on the mRNA, brings its amino acid next to the end of the growing polypeptide strand. Formation of a peptide bond allows the ribosome to shift down the mRNA, providing a site for the next amino acid and its adapter to bind. Step by step, the protein grows until the mRNA brings one of the three remaining codons into the ribosome. These codons do not have tRNA partners; they terminate translation and allow the release of the mRNA and its protein product from the ribosome.
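The codon arithmetic in this paragraph can be checked with a short Python sketch. It builds the standard genetic code from a compact, commonly used listing (bases in the order U, C, A, G at each position, with "*" marking a stop codon) and reads a message three bases at a time until a stop codon is reached. The helper name `translate` and the sample messages are assumptions for illustration, and the sketch simply starts reading at the first base rather than modeling how the ribosome finds its start site.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order for each codon position; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Read an mRNA codon by codon and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # no tRNA partner: translation terminates
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    stops = [c for c, aa in CODON_TABLE.items() if aa == "*"]
    print(len(CODON_TABLE), "codons,", len(stops), "of them stop codons")
    print(translate("AUGAAAUUUUGA"))  # one-letter amino acid codes
```

Counting how many codons map to the same letter in `CODON_TABLE` makes the degeneracy concrete: leucine (L), for instance, appears six times.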
Many genes are composed of a series of structural or functional domains, with each exon specifying part or all of the sequence of a single structural domain. Each domain can endow the protein with a different property. For example, a protein may have one or more extracellular domains that allow it to bind to a specific soluble ligand, a transmembrane domain that allows it to be anchored in the cell membrane, and one or more intracellular domains that allow it to signal inside the cell. These types of proteins are the product of mixing and matching different types of domains during evolution, a process that is facilitated by the exon/intron structure of the gene. By changing the extracellular domains while maintaining the rest of the molecule relatively intact, for example, a similar signal can be elicited by the binding of several different types of ligands. Conversely, the presence or absence of a transmembrane domain can allow the protein to be tethered to the cell or to exist as a soluble factor. The function of an unknown protein can often be guessed by analyzing its complement of domains.
At first glance, the linking of genes in chromosomal units and their transmission as a unit to daughter cells would seem to violate Mendel's laws of independent assortment and segregation, because effectively one might expect genes to be inherited as part of only 23 sets of genes. However, when allelic chromosomes are brought into close juxtaposition during the process of meiosis, breaks occur in the chromosomes and allow bridges, or chiasmata, to form between homologous portions of the chromosomes. This crossing over of DNA strands allows allelic chromosomes to recombine, forming patchwork or chimeric chromosomes that contain portions of each of the parental chromosomes. Although recombination can occur anywhere in the chromosome, only a limited number of chiasmata form during each meiosis. Two genes that are on opposite ends of the chromosome may thus behave as if they were on different chromosomes, whereas recombination is less likely between genes that are very close to each other in their primary sequence. The increased frequency of the joint inheritance of two genes that are closely physically linked on a chromosome is termed "linkage disequilibrium."
Distances between genes on a chromosome are quantified by either their physical distance from each other in millions of base pairs (megabases), or by their genetic distance, as measured by the frequency of recombination between the two genes per generation. One percent of genetic recombination is termed a "centimorgan," after the geneticist Thomas Hunt Morgan, whose studies of the common fruitfly, Drosophila, in the first half of the twentieth century helped elucidate the properties of recombination. As a rough guide, one centimorgan covers approximately one megabase of DNA. However, the relationship between linear and genetic distance is not absolute. The frequency of recombination, and thus the genetic distance between genes in specific regions of the genome, may differ depending on the sequence or the nonhistone proteins that cover the DNA. Recombination frequencies in selected regions of the genome may differ in male and female gametes, implying that segments of chromosomes can be handled differently by spermatogonia and oocytes. This disparity in how DNA is treated by male and female gametes can lead to differences in the function of alleles, depending on whether they have been inherited from the mother or the father, a process termed "imprinting."
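As a rough worked example of these units, the sketch below converts an observed recombination count into centimorgans and then applies the one-centimorgan-per-megabase rule of thumb quoted above. The offspring counts are invented for illustration, the function names are hypothetical, and the conversion to megabases inherits all the caveats just described (regional and sex-specific variation in recombination).

```python
MEGABASES_PER_CENTIMORGAN = 1.0  # rough guide from the text, not a constant of nature


def genetic_distance_cm(recombinant_offspring: int, total_offspring: int) -> float:
    """Recombination frequency expressed in centimorgans (1 cM = 1% recombination)."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    return 100.0 * recombinant_offspring / total_offspring


def rough_physical_distance_mb(centimorgans: float) -> float:
    """Very approximate physical distance implied by a genetic distance."""
    return centimorgans * MEGABASES_PER_CENTIMORGAN


if __name__ == "__main__":
    # Hypothetical cross: 12 recombinant offspring out of 200 scored.
    cm = genetic_distance_cm(12, 200)
    print(cm, "cM, roughly", rough_physical_distance_mb(cm), "Mb")
```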
A "mutation" is defined as a stable, heritable alteration in the DNA sequence that can be passed from a parental cell to at least one of its daughters. From the standpoint of evolution, mutations are required to generate the genetic diversity that is needed to permit species to adapt to a changing environment. The normal rate of mutation is approximately one base pair change per generation per 10^7 base pairs; thus, on average, each child differs from its parent by approximately 390 base pairs as a result of mutations in the gametes. Mutations in the nonreproductive cells of the body are termed "somatic" mutations. Although by definition these alterations are not transmitted to the gametes, the mutations are passed on to the daughter cells of the mutated parent. Somatic mutations in oncogenes, for example, foster the development of many cancers.
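The figure of roughly 390 new changes per generation follows from dividing the haploid genome size given earlier in this entry (about 3.9 billion base pairs) by the stated rate of one change per ten million base pairs. The Python lines below reproduce that arithmetic; the constant and function names are illustrative assumptions.

```python
HAPLOID_GENOME_BP = 3_900_000_000  # approximate haploid genome size used in this entry
BP_PER_MUTATION = 10_000_000       # one base pair change per 10**7 base pairs per generation


def expected_new_mutations(genome_bp: int = HAPLOID_GENOME_BP,
                           bp_per_mutation: int = BP_PER_MUTATION) -> float:
    """Expected number of base pair changes per haploid genome per generation."""
    return genome_bp / bp_per_mutation


if __name__ == "__main__":
    print(expected_new_mutations())
```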
Mutations can involve an entire human genome, as in triploidy, in which a third copy of the entire chromosomal complement occurs. Mutations may involve all or part of a single chromosome, including duplications, deletions, and translocations of a portion of one chromosome to another. At the other extreme, a mutation can be minute and involve a small deletion or insertion, or a replacement of only a single base pair (point mutation). Deletions or insertions that occur in a coding region can alter the reading frame distal to the mutation (frameshift mutations). Frameshift mutations frequently alter the protein sequence and can lead to premature peptide termination by generating a stop codon, one of the three triplet sequences that do not encode an amino acid. Point mutations in coding regions may be of three types: (1) a nonsense mutation (about 4% of base substitutions in coding regions), in which the base change generates one of the three termination codons; (2) a missense, or replacement, mutation (about 73% of base substitutions in coding regions), in which the base change results in substitution of one amino acid for another; and (3) a synonymous, or silent, mutation (about 23% of random base substitutions in coding regions), in which the base replacement does not lead to a change in the amino acid but only to a different codon for the same amino acid. Even synonymous mutations can have deleterious effects, however. A change in the coding sequence of a given gene may alter splicing patterns or diminish mRNA stability, reducing protein production.
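The three classes of point mutation can be told apart mechanically by comparing what a codon encodes before and after the base change. The Python sketch below does this with the standard genetic code written in DNA letters (bases ordered T, C, A, G; "*" marks a stop codon). The function names are hypothetical, and the tally treats every codon and every substitution as equally likely, which is a simplifying assumption; real genes differ in codon usage and mutation spectrum, so the shares it produces need not match the percentages quoted above exactly.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order for each codon position; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base change in a sense codon (position is 0, 1, or 2)."""
    if new_base == codon[position]:
        raise ValueError("new_base must differ from the original base")
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if after == before:
        return "synonymous"
    if after == "*":
        return "nonsense"
    return "missense"


def tally_all_substitutions() -> dict[str, int]:
    """Count each class over every single-base change in every sense codon."""
    counts = {"nonsense": 0, "missense": 0, "synonymous": 0}
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":
            continue  # only changes that start from a coding codon are counted
        for position in range(3):
            for new_base in BASES:
                if new_base != codon[position]:
                    counts[classify_substitution(codon, position, new_base)] += 1
    return counts


if __name__ == "__main__":
    counts = tally_all_substitutions()
    total = sum(counts.values())
    for kind, n in counts.items():
        print(f"{kind}: {n} of {total} ({100 * n / total:.1f}%)")
```

Under this uniform model the nonsense share comes out close to the 4 percent quoted in the text; the other two shares land in the same general range as the quoted figures.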
The consequences of a single-point mutation to the function of a given protein can vary greatly. Enzymes, for example, exhibit a hierarchy of resistance to mutation. Portions of the hydrophilic exterior may serve primarily to allow the protein to be soluble in an aqueous solution, hence changes in the amino acid sequence that preserve hydropathicity may have little or no effect on the function of the protein. The hydrophobic core provides structural stability for the molecule, and amino acid changes may result in an unstable protein product that is temperature sensitive (e.g., falling apart at high temperature). Finally, the catalytic site is exquisitely sensitive, and a single mutation may completely abolish function.
Large deletions may interrupt a coding region and cause an absence of one or more closely linked protein products. If the deletion removes a bridge between two coding regions, the result may be a fusion or hybrid protein containing the initial sequence of one protein and the terminal portion of the other. Such deletions can also result from unequal crossing-over between homologous genes. Finally, alterations of the DNA in the surrounding regions may lead to changes in RNA splicing, transcriptional efficiency, or control of tissue expression.
The Human Genome Project began in 1990 with the goals of developing genetic and physical maps and determining the complete DNA sequence of the human genome. The ultimate goal is to use this mapping and sequence information to isolate and study the structure and function of genes that can contribute to the development of disease. Knowledge of the genetic basis of susceptibility for specific diseases is likely to aid in disease prevention as well as therapy. Associated with these benefits, however, is the risk of discrimination against healthy at-risk individuals that may never develop a disorder. Thus, in addition to learning how to use this new knowledge, we must gain the wisdom to use genetic information appropriately.
Harry W. Schroeder, Jr.
A common feature of organisms is that offspring tend to look like their parents. For example, tall, brown-eyed parents tend to have tall, brown-eyed children. The mechanism by which parents pass on particular traits to their offspring is termed heredity. The focus of the following entry will be to explore the role of genes in heredity.
What Is a Gene?
Originally, most biologists believed in "blending inheritance," where the joining of a sperm from the father and an egg from the mother yields offspring which have characteristics that are a blend of the characteristics of the two parents. However, Austrian botanist Gregor Mendel's (1822-1884) pioneering work with inheritance in pea plants largely disproved this theory. Mendel showed that for many traits, when a pea plant with one trait (e.g., green pods) is bred to a pea plant with another trait (e.g., yellow pods), the offspring always look like one of the parental types, never a mixing of both (e.g., yellow-green pods are never seen). Mendel proposed that traits are inherited in a "particulate" manner. Parents transmit individual hereditary units to their offspring, and the particular combination of these units in an offspring controls how that offspring will look. These hereditary units are now known as genes. Thus the science of heredity is termed genetics and the overall genetic makeup of an organism is termed its genotype. The genotype determines the types of traits an organism will have, otherwise known as the organism's phenotype.
Most organisms are diploid, that is, they have two copies of every gene. Different forms of a particular gene are known as alleles. In the example above, the gene for pod color had two alleles, green and yellow. Diploid parents make haploid eggs and sperm, meaning those gametes have only one copy of each gene. Thus when the egg and sperm fuse, the resulting offspring is a diploid, having one copy of each gene from both parents. An offspring with two copies of the same allele for a particular gene is called a homozygote, while an offspring with two different alleles for a particular gene is called a heterozygote.
When an offspring receives different alleles of a particular gene from its parents (e.g., a yellow pod allele from its mother and a green pod allele from its father), one allele is typically dominant over the other. In our example, green pod is dominant over yellow pod so that an individual with both color alleles will always have green pods. The nondominant allele is known as the recessive allele.
Although uncommon, some pairs of alleles do not behave in a completely dominant or recessive manner. For example, the flowers of snapdragons typically come in two colors, red and white. Mating a red-flowered plant with a white-flowered plant yields pink-flowered offspring, which would be expected under the blending inheritance theory. To show that Mendel's laws still hold, a cross may be made between a pink-flowered plant and a white-flowered plant. A proponent of the blending inheritance theory would predict that all offspring from this cross would have light pink flowers, whereas in fact half the offspring have white flowers and the other half have the original pink. When the phenotype of the heterozygote is a combination of the phenotypes of homozygotes for those two alleles, the alleles are said to show incomplete dominance.
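The snapdragon crosses can be worked through as a small Punnett square. In the Python sketch below each parent passes on one of its two alleles with equal chance, and the offspring genotypes are counted. The allele symbols R (red) and W (white) and the function names are labels chosen for this example, not notation from the source.

```python
from collections import Counter
from itertools import product

# Hypothetical allele symbols: R = red allele, W = white allele.
SNAPDRAGON_PHENOTYPES = {"RR": "red", "RW": "pink", "WW": "white"}


def cross(parent1: str, parent2: str) -> Counter:
    """Count offspring genotypes from every pairing of one allele per parent."""
    offspring = Counter()
    for allele1, allele2 in product(parent1, parent2):
        genotype = "".join(sorted(allele1 + allele2))  # 'WR' and 'RW' are the same genotype
        offspring[genotype] += 1
    return offspring


def phenotype_counts(parent1: str, parent2: str, phenotypes: dict[str, str]) -> Counter:
    """Translate the genotype counts of a cross into phenotype counts."""
    counts = Counter()
    for genotype, n in cross(parent1, parent2).items():
        counts[phenotypes[genotype]] += n
    return counts


if __name__ == "__main__":
    print(phenotype_counts("RR", "WW", SNAPDRAGON_PHENOTYPES))  # red x white
    print(phenotype_counts("RW", "WW", SNAPDRAGON_PHENOTYPES))  # pink x white
```

The pink-by-white cross yields equal numbers of pink and white offspring and no light pink class, which is the particulate outcome described above rather than the blended one.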
Codominance occurs when a heterozygote expresses both of the homozygote phenotypes. Take for example the A and B blood groups of humans, which determine the type of antigen a blood cell will produce. If a homozygote for A (written AA) mates with a homozygote for B (BB), the offspring will be heterozygous (genotype AB), and will produce both A antigens and B antigens, not a blending between the two.
Other factors may affect whether a normally dominant allele expresses its phenotype. Siamese cats have a dominant black fur allele, but that allele only expresses its phenotype at colder temperatures. These cats tend to have dark ears, paws, and a dark tail but are light-colored in areas of the body closer to the body's warm core. Furthermore, other genes in the genome, such as genes that code for modifiers and suppressors, can affect how an allele at one particular gene is expressed.
DNA as the Genetic Material
Because genes control the structural and functional properties of organisms, it became increasingly important to biologists in the early twentieth century that they determine what type of molecules genes actually are. It was tempting to believe that genes were proteins, as it had been established that proteins are an extremely diverse group of molecules that perform a wide variety of specific functions within cells. However, several lines of study eventually led to the conclusion that genes are made of deoxyribonucleic acid, or DNA.
When a diploid organism makes new cells, the new cells are also diploid and are exact copies of the old cells. The mechanism by which cells replicate is termed mitosis. However, when adult organisms mate, they make haploid gametes (eggs and sperm) through a process known as meiosis. Scientists worked out the steps by which diploid cells make diploid copies (mitosis) and haploid copies (meiosis) in the late 1800s. The difference between mitosis and meiosis lies largely in the sorting in the cell nucleus of chromosomes, condensed strands of DNA packaged with various proteins. Interestingly enough, chromosomes seem to move from one generation to the next in a way that mirrors the movement of genes across generations. In adult cells there are two copies of every chromosome, as there are two copies of every gene. Meiosis yields gametes with one copy of each chromosome, and fertilization of the egg by a sperm restores the chromosome number to its original state, paralleling the fact that one copy of each gene from both parents are fused into their diploid offspring. Thus genes appeared to be associated with chromosomes. Furthermore, later studies showed that particular genes could be mapped to precise locations within chromosomes.
In the 1920s, a British physician named Frederick Griffith performed a number of experiments with the bacterium that causes pneumonia in humans, Streptococcus pneumoniae. Griffith worked with two strains of the bacteria, one that was virulent and caused disease, and another that was avirulent (not virulent). Griffith found that the avirulent bacteria, in the presence of extract from the virulent strain, could be transformed into a virulent form. This "transforming agent" was studied intensively by the American bacteriologist Oswald Avery and his colleagues over several years. They destroyed the various chemicals found in the virulent strain extract one at a time, so as to test the contribution of each chemical to virulence individually. In 1944 they concluded that DNA from the virulent strain extract was the transforming agent.
Further proof that DNA is the molecule of inheritance came in 1952 with the publication of a paper by Alfred Hershey and Martha Chase of the Carnegie Laboratory of Genetics. They studied the bacteriophage T2, a virus that infects bacteria such as Escherichia coli (E. coli). It was known that viruses were made almost entirely of protein and DNA, and that some viral component moved into the bacterial cells and caused the bacteria to use their cellular apparatus to make new viruses. Hershey and Chase were able to label the protein component of the virus and the DNA component of the virus in different ways so as to track which component was responsible for controlling the host cell. Their results confirmed that the viral DNA, not protein, was responsible for manipulating the bacterial host cells. It was finally apparent that the genetic material is made of DNA.
How Does the Genotype Determine the Phenotype?
George Beadle and Edward L. Tatum's work on the bread mold Neurospora crassa in the 1940s at the California Institute of Technology provided some of the first convincing evidence that the function of genes is to control the production of proteins. Neurospora can be grown in the lab on a medium made of a few simple nutrients. However, Neurospora mutants that required certain supplements in the medium to be able to grow were known to exist. In these mutants, enzymes (proteins that catalyze molecular reactions) that are necessary to the functioning of particular metabolic pathways do not perform properly. Beadle and Tatum irradiated Neurospora cells with x-rays to induce a wide variety of mutations that made the Neurospora unable to live on the minimal medium. Some of these mutations blocked different steps within the same metabolic pathway. Because the genes controlling different enzymatic steps from the same pathway were mapped to different chromosomal locations, it became clear that particular enzymes correspond to particular genes. In other words, each gene, which is made of DNA, is responsible for the production of one enzyme. It was later shown that genes can "code" for any kind of protein, including enzymes.
Biologists quickly focused on determining the structure of DNA to try to gain insight into the actual mechanism whereby genes control the production of proteins. DNA was found to be a double helix made primarily of four nucleotides: adenine (A), cytosine (C), guanine (G), and thymine (T). The structural makeup of DNA is shared among all living things, suggesting that the different forms of life have a single common ancestor. The process by which DNA specifies the type of protein to be made was labeled the "central dogma of molecular biology." Genes are first copied from DNA to RNA (ribonucleic acid) in a process termed transcription. This RNA, which is referred to as messenger RNA or mRNA, then specifies the formation of proteins in a process termed translation. Translation involves reading the mRNA as a series of codons, combinations of three nucleotides in a row. Each combination of three bases (e.g., ATG, TCA, …) encodes a particular amino acid; amino acids are the building blocks of proteins. So, despite the small number of nucleotide types that make up DNA, sequences of these nucleotides code for the wide variety of proteins found in organisms.
The Structure of Genes
In eukaryotes (organisms that possess membrane-bound organelles such as a nucleus), genes are typically made up of exons and introns. Exons are regions of the gene that code for protein (the codons), while introns are regions of the gene that are transcribed into mRNA but are spliced out before the translation stage. Introns are thought to have evolved to allow exon shuffling, the process whereby an exon from one allele of a gene in a heterozygote may "switch places" with the same exon from the second allele. This mixing and matching of exons in the two copies of a gene allows for rapid evolution of proteins. The exons are switched through a process known as recombination, the physical breaking and piecing together of homologous chromosomes. Having introns increases the probability that the chromosomal breakpoints during recombination fall outside coding DNA and so will not cause deleterious mutations.
There are several other types of noncoding regions within genes. For example, promoters are specific DNA sequences in front of the coding region which allow the RNA polymerase enzyme to bind and to start transcription of the gene. Other DNA sequences near the coding region of the gene allow for regulatory enzymes to bind and cause up-regulation (more or faster transcription) or down-regulation (less or slower transcription) of that gene. For example, if a host cell is being attacked by bacteria, enzymes in the host cell bind to and cause the up-regulation of genes coding for proteins that destroy bacterial cells.
Genes control the phenotype, the structural and functional properties of an organism. Since it is clear that phenotypes have evolved and diversified over the history of life, it stands to reason that genes controlling the phenotype have evolved as well. How do genes evolve?
The ultimate cause of evolution is the accumulation of mutations in the DNA of an organism. Mutations can be caused by a number of factors, including errors made by DNA polymerase during replication of the genome, by reactive molecules in the cell, and by external factors such as x-rays. Eukaryotes utilize many mechanisms, including repair enzymes, to fix mutations when they occur, but inevitably some mutations are not corrected and are then passed on to future generations. The overwhelming majority of mutations are harmful or neutral with respect to the fitness of the organism possessing them. However, when advantageous mutations occur, natural selection tends to increase their frequency in a population.
The most common types of mutations are point mutations, mutations that occur within a single gene. Point mutations can be broken up into a number of classes. Because the genetic code is "degenerate," that is, different codons may code for the same amino acid, some mutations are silent. Often a change in the third base of a codon (e.g., ACA to ACG) does not change the amino acid that is coded for. Silent mutations have virtually no effect on the fitness of an organism. Missense mutations are mutations that change a codon and change the amino acid that is coded for. A protein with one altered amino acid may be nonfunctional but will more likely just be less efficient in its job than the original. Nonsense mutations are mutations that change a regular codon into a stop codon, prematurely terminating translation of the mRNA. These mutations generally have severe effects on the ability of the protein to perform its required function. Frameshift mutations do not cause base substitutions but instead delete or add nucleotides into a sequence. Imagine a frameshift mutation that adds one nucleotide into a coding sequence. Because the mRNA message is read three nucleotides (one codon) at a time, the one base insertion will cause all the downstream codons to be one base off and to be read wrong. These mutations are extremely disruptive to the genes they occur in.
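The effect of a one-base insertion on the reading frame is easy to see by splitting a message into codons before and after the change. The short Python sketch below does that; the twelve-base message and the helper names are invented for illustration.

```python
def codons(sequence: str) -> list[str]:
    """Split a sequence into complete three-base codons, dropping any leftover bases."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def insert_base(sequence: str, index: int, base: str) -> str:
    """Return the sequence with one extra base inserted before the given index."""
    return sequence[:index] + base + sequence[index:]


if __name__ == "__main__":
    original = "AUGGCCAAAGGU"               # hypothetical mRNA fragment
    shifted = insert_base(original, 3, "U")  # one base added after the first codon
    print(codons(original))
    print(codons(shifted))
```

In this example the first codon is untouched, but every codon downstream of the insertion is read one base off, which is why such mutations tend to be far more disruptive than a single substitution.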
Other than point mutation, another way in which a gene might gain a new function is through gene duplication. Occasionally, parts of chromosomes or even whole chromosomes are duplicated. If a gene is duplicated, then one of the duplicates is free to evolve in any direction since the other will continue to fulfill its duties. Gene duplication allows genes to acquire novel functions and to create novel phenotypes. Gene duplication may be extremely important in an evolutionary sense; for example, it appears that the great diversification of vertebrates was accompanied by several genome duplication events.
Although most new point mutations have harmful effects on fitness, the majority of mutations that become "fixed," that is, that reach 100 percent frequency in a population, are neutral or advantageous. This is because natural selection tends to weed out harmful mutations or keep them at extremely low frequencies. Thus, when comparing the gene sequences from two closely related species, any differences in the DNA sequences can be attributed to the fixation of neutral or advantageous mutations that have arisen since those species diverged from their most recent common ancestor. There has been great debate in the scientific literature regarding what proportion of fixed differences between species were actually favored by natural selection. The Japanese geneticist Motoo Kimura, in his controversial 1983 book The Neutral Theory of Molecular Evolution, provided compelling evidence to suggest that much of the evolution that genes undergo over time is neutral, and that very few genetic differences between species were favored by selection.
see also Biological Evolution; Geneticist; Mendel, Gregor; Morphology.
Todd A. Schlenke
Adams, Mark D., et al. "The Genome Sequence of Drosophila melanogaster." Science 287 (2000):2185-2195.
Dawkins, Richard. The Selfish Gene. Oxford, U.K.: Oxford University Press, 1989.
Griffiths, Anthony J. F., Jeffrey H. Miller, David T. Suzuki, Richard C. Lewontin, and William M. Gelbart. An Introduction to Genetic Analysis, 6th ed. USA: W. H. Freeman and Company, 1996.
Johnson, George B. Biology: Visualizing Life. New York: Holt, Rinehart and Winston Inc., 1998.
Kimura, Motoo. The Neutral Theory of Molecular Evolution. Cambridge: Cambridge University Press, 1983.
Lewin, Benjamin. Genes IV. Oxford, U.K.: Oxford University Press, 1990.
Li, Wen-Hsiung, and Dan Graur. Fundamentals of Molecular Evolution. Sunderland, MA: Sinauer Associates, Inc., 1991.
Maynard Smith, John. Evolutionary Genetics. Oxford: Oxford University Press, 1989.
Stern, Curt, and Eva R. Sherwood, eds. The Origin of Genetics: A Mendel Source Book. San Francisco: W. H. Freeman, 1966.
A gene is a unit of genetic information that codes for a single biological function or product. Genes are found on chromosomes, which are made up of nucleic acid (deoxyribonucleic acid [DNA] for most organisms) and proteins.
|COMPARISON OF GENE NUMBERS AND DNA FROM DIFFERENT SPECIES|
|Organism||Number of genes||Number of base pairs (millions)|
|Caenorhabditis elegans (worm)||19,099||97|
|source: Bork, Peer, and Copley, Richard (2001). "The Draft Sequence: Filling in the Gaps." Nature 409:818–820.|
The information is contained in the sequence of the four nucleic acid components much like the way written information is contained in the sequence of letters in a sentence.
DNA is measured in base pairs (bp) since it occurs as a double helix. The sum total of all genetic information in an organism is its genome. Different organisms have different-sized genomes with different numbers of chromosomes, genes, and base pairs. Table 1 shows the values for several organisms, including the preliminary results for the human genome from the Human Genome Project (HGP) report. One surprising finding of the HGP was that only 30,000 to 40,000 genes were found. The human genome has almost 3 billion base pairs and many different gene products, so scientists were expecting over 100,000 genes. Results from the HGP suggest that about 75 percent of DNA is "nongene." This DNA is often referred to as "junk" or "selfish" DNA, but some portions do have important functions in maintaining the structure of chromosomes.
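To put the figures quoted above side by side, the following Python sketch computes the average amount of DNA per gene for the worm and for the human genome. It uses only the numbers given in the text and table; the human genome size is rounded to 3 billion base pairs, and the constant and function names are illustrative assumptions.

```python
# Figures as quoted in the text and table; the human values are approximate.
WORM_GENES = 19_099
WORM_BASE_PAIRS = 97_000_000
HUMAN_BASE_PAIRS = 3_000_000_000         # "almost 3 billion", rounded
HUMAN_GENE_ESTIMATES = (30_000, 40_000)  # low and high HGP estimates


def base_pairs_per_gene(base_pairs, genes):
    """Average genomic DNA per gene, counting coding and noncoding DNA alike."""
    return base_pairs / genes


worm_density = base_pairs_per_gene(WORM_BASE_PAIRS, WORM_GENES)
human_densities = [
    base_pairs_per_gene(HUMAN_BASE_PAIRS, genes) for genes in HUMAN_GENE_ESTIMATES
]

print(f"C. elegans: about {worm_density:,.0f} bp per gene")
for genes, density in zip(HUMAN_GENE_ESTIMATES, human_densities):
    print(f"Human, {genes:,} genes: about {density:,.0f} bp per gene")
```

On these rounded figures the human genome carries roughly 15 to 20 times as much DNA per gene as the worm genome, which is consistent with a large share of human DNA being "nongene."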
Genes "tell" a cell which molecules to synthesize based on the genetic code that those genes contain. The amount of code needed varies widely. Genes vary greatly in size, from those that code for small transfer RNA molecules (tRNAs) and have 73 base pairs, to those that code for very large proteins and have 250,000 base pairs (Maulik and Patel, p. 26). A eukaryotic gene for a large protein may be much larger than its coding region due to intervening noncoding sequences (introns). Such a gene typically has two or more coding regions (exons) separated by introns. The introns are "cut out" of the genetic information during expression and do not show up in the final gene product. Some genes have as much as 90 percent intron DNA. Sometimes the exons from the same gene are cut and pasted in different ways, creating two or more different gene products from that one gene. This added flexibility opens the door to even more debate about how genes and gene products are controlled. Although each gene codes for one or sometimes a few gene products (due to splicing variations), biological functions often require many genes working together or in sequence. The proper interplay of genes produces healthy cells. Some forms of cancer occur when certain genes, called oncogenes, become uncontrolled.
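The "cutting out" of introns and the joining of exons in different combinations can be sketched in Python as follows. The transcript and its exon coordinates are hypothetical: exons are written in uppercase and introns in lowercase purely for readability, and the positions are supplied by hand, whereas a real cell recognizes splice sites from signals in the RNA itself.

```python
def splice(transcript, exons):
    """Join the listed exon segments in order, discarding everything between them.

    `exons` is a list of (start, end) positions, with `end` exclusive.
    """
    return "".join(transcript[start:end] for start, end in exons)


# Hypothetical toy transcript: three exons separated by two introns.
transcript = "ATGGCC" + "gtaagtttcag" + "GATTAC" + "gtacgtccag" + "AAATGA"
exons = [(0, 6), (17, 23), (33, 39)]

# Usual product: all three exons joined.
full_product = splice(transcript, exons)

# Alternative product: the middle exon is skipped.
short_product = splice(transcript, [exons[0], exons[2]])

# Share of the transcript that is intron and never reaches the final product.
exon_length = sum(end - start for start, end in exons)
intron_fraction = 1 - exon_length / len(transcript)

print(full_product)
print(short_product)
print(f"Intron share of transcript: {intron_fraction:.0%}")
```

The same stretch of DNA yields two different products here depending on which exons are retained, which is the sense in which one gene can give rise to more than one gene product.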
see also Chromosome; Double Helix; Genetic Engineering; Genome; Nucleic Acids; Proteins.
Berg, Paul, and Singer, Maxine (1992). Dealing with Genes: The Language of Heredity. Mill Valley, CA: University Science Books.
International Human Genome Sequencing Consortium (2001). "Initial Sequencing and Analysis of the Human Genome." Nature 409:860–921.
Maulik, Sunil, and Patel, Salil (1997). Molecular Biotechnology: Therapeutic Applications and Strategies. New York: Wiley-Liss.
McCarty, Maclyn (1985). The Transforming Principle: Discovering that Genes Are Made of DNA. New York: Norton.
Singer, Maxine, and Berg, Paul (1991). Genes and Genomes: A Changing Perspective. Mill Valley, CA: University Science Books.