DNA Structure and Function, History

DNA was discovered in the nineteenth century, but its significance as the physical basis of inheritance was not understood until midway through the twentieth. The realization that it was the molecule of heredity led to intensive efforts to determine its three-dimensional structure, and to understand how it stores and transmits genetic information. The discovery of the structure of DNA and the elucidation of its function rank among the greatest achievements of science.

Discovery of DNA

Deoxyribonucleic acid (DNA) was first discovered in 1869 by Johann Friedrich Miescher (1844-1895), a young Swiss chemist studying in Tübingen, Germany. Miescher's interest in the biochemistry of the cell nucleus led him to collect used surgical bandages, from which he obtained pus, a rich source of white blood cells, which have very large nuclei. From these cells he purified a new compound, which he termed "nuclein." Miescher showed that nuclein was a large molecule, acidic, and rich in phosphorus. He continued to work with nuclein over the next two decades, turning for his source to salmon sperm, which have exceptionally large nuclei and were plentiful in the rivers near his laboratory. One of his students renamed the compound "nucleic acid."

In 1885 the German biologist Oskar Hertwig (1849-1922) suggested that nucleic acid might be the hereditary material, based on its presence in the nucleus and the growing certainty that the nucleus was the center of heredity. Despite this promising beginning, no further progress was made in understanding its true role until the 1940s.

The biochemistry of nucleic acid continued to be studied, however, and by 1900, scientists had learned that it was composed of three parts: a sugar, a phosphate, and a base (together termed a nucleotide). The five-carbon sugar is a ringed structure, and it forms an alternating chain with phosphate (a phosphorus atom surrounded by four oxygens). Also attached to the sugar is one of five different bases: adenine, cytosine, guanine, thymine, and uracil, usually abbreviated A, C, G, T, and U. The names of the bases are related to their historical origin: Guanine was isolated from bird guano, thymine from the thymus gland of calves, and adenine from calf pancreas ("adeno-" is a Greek root for gland); uracil is chemically related to urea; and "cyto-" means cell.

In the 1920s it was discovered that there were two types of sugars, ribose and deoxyribose, differing by the presence or absence of one oxygen atom. DNA was shown to incorporate the bases A, C, G, and T, while RNA (ribonucleic acid) incorporates A, C, G, and U. Much of this work was carried out by the German biochemist Albrecht Kossel (1853-1927) and the American Phoebus Aaron Levene (1869-1940).

Also by the 1920s, Thomas Hunt Morgan and his colleagues had shown that genes, whatever they were made of, were carried on chromosomes. Chromosomes were shown to contain both protein and DNA, so the question of which of these two substances composed the genes took center stage. The more complex chemical nature of proteins gave them the theoretical edge, an opinion given great and, in hindsight, unfortunate weight by Levene, a widely respected biochemist.

Levene proposed the tetranucleotide hypothesis, which held that the structure of DNA was a monotonous repetition of the four nucleotides in succession. Levene's evidence was that DNA, which he had isolated from a variety of sources, had roughly equal amounts of A, C, G, and T. While the actual proportions he found were not exact, Levene attributed the differences to experimental error, rather than biochemical reality. Such a simple and highly regular molecule as the one Levene proposed could not account for the diversity of life, however, and so DNA was assumed to play only a structural role in the chromosome.

DNA Is the Transforming Factor

DNA was not again taken seriously as the hereditary material until 1944, when Oswald Avery (1877-1955) published a landmark paper outlining his experiments with two strains of Pneumococcus bacteria. The S ("smooth") strain was able to cause disease in mice, while the R ("rough") strain was not. Under the microscope, the S strain had a smooth, glistening surface, due to a sugar capsule it secreted. The R strain lacked the capsule.

Fifteen years earlier, Frederick Griffith had shown that injecting R bacteria plus heat-killed S bacteria into mice would cause disease just as surely as injecting live S bacteria, a result that Griffith attributed to a "transforming principle." When Avery grew R bacteria in a dish with the heat-killed S bacteria, he saw that the R bacteria were transformed into S bacteria, capable of making capsules and causing disease, just as Griffith had observed. Avery's group purified the components of the S bacteria, and showed that DNA alone could cause the transformation, while protein could not.

Though not immediately accepted by all scientists, Avery's discovery triggered intense interest among biochemists and geneticists, who turned their attention to discovering how DNA could be the genetic molecule. As in every other branch of biochemistry, the structure was presumed to hold the key to the function, and so the hunt was on for the structure of DNA.

One key was the discovery by Erwin Chargaff (1905-2002) that, contrary to Levene's conclusions, the nucleotide proportions were not all the same. Instead, in 1950 Chargaff showed that they varied from species to species, although within a species they were constant between tissues. Further, and most tantalizingly, he discovered that the amounts of adenine and thymine were equal to one another, and the amounts of cytosine and guanine were equal to one another; in other words, A = T, C = G. Chargaff initially did not understand the significance of this discovery, however.

Model Building

At the same time, other groups were trying to solve the DNA structure puzzle by building three-dimensional models. The group that succeeded was that of Francis Crick (born 1916) and James Watson (born 1928), in Cambridge, England. Watson and Crick built their models using data from X-ray crystallography, a technique for measuring interatomic distances through analysis of the scatter patterns made by X rays bouncing off a pure crystal. Rosalind Franklin (1920-1958), working in London, had made the best X-ray pictures of DNA. Watson and Crick were given access to these (without Franklin's permission) at a critical time in their model-building endeavors, and saw features in them that Franklin had not yet discovered. Shortly after, Watson and Crick deduced the correct structure, and published their work in April 1953.

DNA is a double helix, in which two sugar-phosphate strands wind around each other, forming a structure that looks like a broad spiral staircase. The sugar molecule has a head and a tail, and each sugar-phosphate strand therefore has a direction, like a chain of arrows. The two helical strands point in opposite directions.

The bases project toward the inside like stair treads. In contrast to the monotony predicted by the tetranucleotide hypothesis, the bases along one strand can be in any sequence whatsoever. Critical to the stability of DNA is the hydrogen bonding between the bases across the interior. These weak chemical attractions form only when the base atoms are positioned just so. In particular, Watson and Crick discovered, they form only when adenine projects across to meet a thymine, and guanine a cytosine. Chargaff's ratios reflect this essential base pairing.
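The link between base pairing and Chargaff's ratios can be made concrete with a short sketch. The Python below is illustrative only: the function names and the example sequence are hypothetical, and the model assumes nothing beyond the two pairing rules and the opposite direction of the two strands described above.

```python
from collections import Counter

# Pairing rules as described in the text: A with T, G with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(strand):
    """Return the partner strand, written in its own head-to-tail direction.

    The two strands point in opposite directions, so the partner is the
    base-by-base complement of the input, read in reverse.
    """
    return "".join(PAIRS[base] for base in reversed(strand))


def duplex_base_counts(strand):
    """Count each base across both strands of the double helix."""
    return Counter(strand + partner_strand(strand))


strand = "ATGCGGATTC"  # hypothetical sequence, chosen arbitrarily
counts = duplex_base_counts(strand)
print(partner_strand(strand))
print("A == T:", counts["A"] == counts["T"])
print("G == C:", counts["G"] == counts["C"])
```

Whatever sequence is supplied, the counts of A and T across the two strands come out equal, as do those of G and C, because every base on one strand is matched by its partner on the other.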

Replication

Given the DNA structure, three questions immediately arose: How is DNA copied to allow faithful inheritance of genes, how does DNA store information, and how does DNA use that information to determine the properties of the cell? Watson and Crick addressed the theoretical underpinnings of each of these issues in their next paper, published in May 1953. Regarding copying, they wrote:

Now our model for deoxyribonucleic acid is, in effect, a pair of templates, each of which is complementary to the other. We imagine that prior to duplication the hydrogen bonds are broken, and the two chains unwind and separate. Each chain then acts as a template for the formation onto itself of a new companion chain, so that eventually we shall have two pairs of chains where we only had one before. Moreover, the sequence of the pairs of bases will have been duplicated exactly.

In outline, this is precisely correct, as confirmed in 1957 by Matthew Meselson (b. 1929) and Franklin Stahl (b. 1930). They grew bacteria in a medium containing a heavy isotope of nitrogen, so that both DNA chains would be labeled with it. They then transferred the bacteria to a medium containing ordinary, lighter nitrogen and measured the density of the DNA after each round of DNA copying. After one round, each DNA molecule had a density halfway between that of fully heavy and fully light DNA. According to the Watson-Crick prediction, this meant that one strand of each was completely new. After the second round, half the DNA molecules maintained this intermediate density, and half contained no heavy nitrogen at all, just as expected from the Watson-Crick model.

This process, in which one parental DNA strand is conserved, unchanged, and acts as a template to synthesize a new partner, is called semiconservative replication. The details of the copying process, called replication, are much more complex than this simple outline, though, and the entire process is still not fully understood in all its particulars. Central to it is DNA polymerase, a large multiprotein complex first discovered in the mid-1950s by Arthur Kornberg (born 1918). Kornberg shared the 1959 Nobel Prize with Severo Ochoa (1905-1993), who had discovered an enzyme that can synthesize RNA.
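The way the parental label is diluted over successive rounds of copying can be followed with a small simulation. This is a minimal sketch under simplifying assumptions, not a model of the actual experiment: each molecule is reduced to a pair of strands, a strand is either labeled (parental) or unlabeled (newly made), and the names `replicate` and `label_summary` are hypothetical.

```python
from collections import Counter


def replicate(molecules):
    """Carry out one round of semiconservative copying.

    Each molecule is a pair of strands; True marks a labeled (original)
    strand and False an unlabeled (newly made) one. Every existing strand
    is kept and paired with a new, unlabeled partner.
    """
    next_generation = []
    for strand_a, strand_b in molecules:
        next_generation.append((strand_a, False))
        next_generation.append((strand_b, False))
    return next_generation


def label_summary(molecules):
    """Count molecules by how many of their two strands are labeled."""
    return Counter(sum(pair) for pair in molecules)


population = [(True, True)]  # start with one fully labeled molecule
for generation in range(1, 3):
    population = replicate(population)
    print(generation, dict(label_summary(population)))
```

After the first round every molecule carries exactly one labeled strand; after the second, half the molecules carry one labeled strand and half carry none, which is the pattern described above.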

Coding

Regarding how DNA can act as a gene, storing information and directing activities of the cell, Watson and Crick wrote:

The phosphate-sugar backbone of our model is completely regular, but any sequence of the pairs of bases can fit into the structure. It follows that in a long molecule many different permutations are possible, and it therefore seems likely that the precise sequence of the bases is the code which carries the genetical information.

This too is correct, as a series of experiments showed.

Proteins are the workhorses of the cell, controlling the rates of all the reactions within and providing much of the cell's structure as well. Therefore, it was quickly realized, genes must control the production of proteins, and the genetic information carried in the sequence of bases in DNA is a code for the sequence of amino acids in proteins. Proteins are made of twenty amino acids, linked together in varying sequences. The sequence determines the shape and chemical properties of the protein, and so specifying protein sequence is the essential role of DNA.

Since there are four bases and twenty amino acids, a single nucleotide is not enough to specify one amino acid. Even two are not enough, because two nucleotides will only give rise to sixteen unique combinations (AA, AC, AG, and so on). Therefore, it was immediately obvious that each amino acid must be coded for by at least three nucleotides.
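The counting argument can be checked directly. The following Python sketch simply enumerates the possibilities; the function names are illustrative, and the only inputs are the four bases and twenty amino acids stated above.

```python
from itertools import product

BASES = "ACGT"
AMINO_ACIDS = 20


def codeword_count(length):
    """Number of distinct base sequences of the given length."""
    return len(BASES) ** length


def minimum_codon_length(needed=AMINO_ACIDS):
    """Smallest sequence length that yields at least `needed` codewords."""
    length = 1
    while codeword_count(length) < needed:
        length += 1
    return length


# Enumerate the two-base combinations explicitly: AA, AC, AG, and so on.
two_base_words = ["".join(pair) for pair in product(BASES, repeat=2)]
print(len(two_base_words), two_base_words[:3])

for length in (1, 2, 3):
    print(length, codeword_count(length))
print("minimum length:", minimum_codon_length())
```

One base gives 4 codewords and two give 16, both short of twenty; three give 64, which is more than enough.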

Stating this must be so was quite a bit easier than working out the details of how DNA and amino acids interacted to form a protein. Some researchers suggested a solution in which amino acids lined up directly on the surface of DNA; other alternatives were also proposed. A suggestion by Crick that there was some type of adapter between the two was confirmed with the discovery of transfer RNA. In fact, DNA and amino acids never do interact during protein synthesis—instead, an RNA copy of DNA is made (messenger RNA), which links with transfer RNAs that carry amino acids. The code itself was worked out between 1961 and 1967, by several different groups, including Marshall Nirenberg (born 1927), Har Gobind Khorana (born 1922), and Johann Matthaei, who developed pioneering cell-free systems that allowed researchers to work without the complexity and constraints involved with living organisms.
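The two-step flow from DNA to messenger RNA to a chain of amino acids can be sketched in a few lines of Python. This is a deliberately simplified illustration: the codon table lists only four of the sixty-four codons, the example sequence is hypothetical, and the sketch assumes the input is the DNA strand whose sequence matches the messenger RNA (with T in place of U). The dictionary lookup stands in for the adapter role played by transfer RNA.

```python
# Partial codon table for illustration only; the full code has 64 entries.
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe",
    "GGC": "Gly",
    "UAA": "Stop",
}


def transcribe(dna):
    """Make the messenger RNA copy: the same sequence, with U in place of T."""
    return dna.replace("T", "U")


def translate(mrna):
    """Read the message three bases at a time until a stop codon appears."""
    amino_acids = []
    for start in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[start:start + 3]]
        if residue == "Stop":
            break
        amino_acids.append(residue)
    return amino_acids


dna = "ATGTTTGGCTAA"  # hypothetical sequence built from the codons above
mrna = transcribe(dna)
print(mrna)
print(translate(mrna))
```

A codon missing from the partial table raises a `KeyError`, which is acceptable for a sketch but would need handling in anything more complete.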

These heroic discoveries marked the beginning of the molecular biology revolution. From them even deeper questions have arisen, about how gene expression is regulated, how genes control development, and how (and whether) genes can be modified to treat disease and improve human life.

see also Crick, Francis; DNA; Genetic Code; Nucleotide; Replication; Transcription; Watson, James.

Richard Robinson