DNA Sequencing

views updated May 18 2018

The genome of an organism is the sum total of its genetic information. The genome is not only a blueprint for the organism; it also contains historical notes on the organism's evolution. The ability to determine the sequence of deoxyribonucleic acid (DNA), and thus read the messages in the genome, is of immense biological importance because it both describes the organism in detail and indicates its evolutionary history.

DNA is a linear chain of four nucleotides: adenosine (A), thymidine (T), cytidine (C), and guanosine (G). The genetic information in DNA is encoded in the sequence of these nucleotides, much like the information in a word is encoded in a sequence of letters. The technique for determining the sequence of nucleotides in DNA is based on the same mechanism by which DNA is replicated in the cell. DNA is composed of two complementary strands in which the As of one strand are paired with the Ts of the complementary strand and the Cs of one strand are paired with the Gs of the complementary strand. When DNA is replicated, a new DNA strand (the primer strand) is extended by using the information in the complementary (template) strand. DNA has a direction (polarity): the growing end of a DNA strand is the 3′ end and the other end is the 5′ end. An enzyme, DNA polymerase, replicates DNA by adding nucleotides that complement the template strand to the 3′ end of the primer strand. (Figure 2.)
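The pairing and extension rules described above can be sketched in a few lines of Python. This is only an illustration: the function name `extend_primer` and the convention of writing the template 3′ to 5′ (so that it lines up position by position with the growing strand) are assumptions made for the example, not part of the original text.

```python
# Base-pairing rules from the text: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def extend_primer(template_3_to_5: str, primer_5_to_3: str) -> str:
    """Extend a primer at its 3' end until the template is fully copied.

    The template is written 3'->5' so that template position i sits
    opposite position i of the growing strand, which is written 5'->3'.
    """
    # The primer must pair with the start of the template.
    for i, base in enumerate(primer_5_to_3):
        if PAIR[template_3_to_5[i]] != base:
            raise ValueError("primer does not pair with the template")

    strand = primer_5_to_3
    # Add one complementary nucleotide at a time to the 3' end.
    while len(strand) < len(template_3_to_5):
        strand += PAIR[template_3_to_5[len(strand)]]
    return strand


print(extend_primer("TACGGCAT", "ATG"))  # ATGCCGTA
```

Each pass through the loop stands in for one polymerase step: the next template base is read and its complement is appended to the 3′ end of the primer strand.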

DNA polymerase has an absolute requirement for a hydroxyl group (OH) on the 3′ end of the primer strand. If the 3′ hydroxyl group is missing, no further nucleotides can be added to the primer strand. This termination of the elongation of the primer strand is the basis for determining the DNA sequence. If the DNA polymerase is presented with a mixture of nucleotides, some of which have 3′ OH groups and others of which have no 3′ OH group (and are bound to a colored dye), both types of nucleotides are added to the growing primer strand. When a nucleotide with no 3′ OH group is added to the primer strand, elongation is terminated with the colored dye at the 3′ end of the strand.

All the essential elements for determining the sequence of nucleotides in the primer DNA strand are now in place. A DNA synthesis reaction is set up in a test tube (in vitro), including DNA polymerase, a template DNA strand, a short uniform primer DNA strand, and a mixture of the four nucleotides (A, T, C, and G). The short primer DNA strands are synthesized chemically and are identical, so they all pair with the same specific sequence in the template DNA strand. Each of the nucleotides is present in two forms: the normal form with a 3′ hydroxyl group, and the terminating form with a colored dye and no 3′ hydroxyl group. Each of the four terminating nucleotides (A, T, C, and G) has a different colored dye attached.

The amount of normal nucleotides present in the reaction is much larger than the amount of terminating nucleotides, so DNA synthesis proceeds almost normally, and only occasionally is the elongation of a primer strand terminated by the incorporation of a dye-labeled nucleotide lacking a 3′ hydroxyl group. Eventually, however, all of the primer strands do incorporate a dye-labeled nucleotide and their elongation is terminated. Thus, at the end of the reaction there is a vast collection of primer strands of varying lengths, each terminated with a nucleotide that carries a colored dye specific to that terminal nucleotide.
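A small simulation can make the effect of the nucleotide ratio concrete. The sketch below is a toy model under stated assumptions: `p_terminate` (the chance that any one added nucleotide is a dye-labeled terminator) and all function names are hypothetical, and strands that reach the end of the template without a terminator are simply discarded, which is a simplification of the reaction described above.

```python
import random

PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesize_one(template_3_to_5, primer, p_terminate, rng):
    """Grow one primer strand; return (strand, dye), dye is None if never terminated."""
    strand = primer
    for pos in range(len(primer), len(template_3_to_5)):
        base = PAIR[template_3_to_5[pos]]
        strand += base
        # Occasionally the incorporated nucleotide is the dye-labeled
        # form without a 3' OH group, which stops elongation.
        if rng.random() < p_terminate:
            return strand, base
    return strand, None


def run_reaction(template_3_to_5, primer, n_strands=10_000, p_terminate=0.05, seed=0):
    """Simulate many primer strands; keep only the dye-terminated products."""
    rng = random.Random(seed)
    products = (
        synthesize_one(template_3_to_5, primer, p_terminate, rng)
        for _ in range(n_strands)
    )
    return [(strand, dye) for strand, dye in products if dye is not None]


fragments = run_reaction("TACGGCATTGCA", "ATG")
lengths = sorted({len(strand) for strand, _ in fragments})
print(lengths)  # with enough strands, every length past the primer is represented
```

Because termination is rare at each step, many strands reach the far positions of the template, so the product mixture contains terminated strands of every possible length.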

All of the primer strands start at the same point, specified by the sequence of the short uniform primer DNA. Thus, the length of the primer strand corresponds to the position of the terminal nucleotide in the DNA sequence relative to the starting position of the primer DNA strand. The color of the dye on the primer strand identifies the terminal nucleotide as an A, T, C, or G. Once the primer strands are arranged according to length, the DNA sequence will be indicated by the series of colors on progressively longer primer strands.
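The reasoning in this paragraph amounts to sorting by length and reading off the dye colors. A minimal sketch, assuming each fragment is summarized as a hypothetical `(length, dye_base)` pair:

```python
def read_sequence(fragments):
    """Reconstruct the sequence from (length, dye_base) pairs.

    All strands of one length end at the same template position, so
    they must carry the same dye; conflicting dyes indicate bad data.
    """
    by_length = {}
    for length, dye in fragments:
        if by_length.setdefault(length, dye) != dye:
            raise ValueError(f"conflicting dyes at length {length}")
    # Progressively longer strands give successive positions in the sequence.
    return "".join(by_length[n] for n in sorted(by_length))


observed = [(6, "G"), (4, "C"), (5, "C"), (4, "C"), (7, "T"), (8, "A")]
print(read_sequence(observed))  # CCGTA
```

The result is the sequence of the newly synthesized strand downstream of the primer, read 5′ to 3′.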

The DNA strands can be readily separated according to length by acrylamide gel electrophoresis (see Figure 1). The acrylamide gel is a loose matrix of fibers through which the DNA can migrate. The DNA molecules have a large negative charge and thus are pulled toward the plus electrode in an electric field. The whole collection of primer strand DNA molecules is placed in a well at the top of an acrylamide gel with the plus electrode at the bottom of the gel. When the electric field is applied the DNA molecules are drawn toward the plus electrode, with shorter molecules passing through the gel matrix more easily than longer molecules. Thus the smaller DNA molecules move the fastest.

After a fixed period of time, the DNA molecules are separated according to length, with the shortest molecules having moved furthest down the gel. All of the molecules of a given length will form a band and will have the same terminal nucleotide and thus the same color. The DNA sequence can be read from the colors of the bands. One reads the sequence of the DNA from the 5′ end, starting at the bottom of the gel, to the 3′ end at the top of the gel.
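The gel separation and bottom-to-top reading can be mimicked with a toy migration model. The formula in `migration_distance` (distance falling linearly with the logarithm of strand length) and the parameters `gel_length_cm` and `max_length` are assumptions chosen only so that shorter strands travel farther; they are not taken from the text and are not a calibrated physical model.

```python
import math


def migration_distance(length, gel_length_cm=30.0, max_length=1000):
    """Toy model: shorter strands travel farther from the well (in cm)."""
    return gel_length_cm * (1 - math.log(length) / math.log(max_length))


def read_gel(bands):
    """Read bands from the bottom of the gel (farthest from the well) to the top.

    bands: list of (distance_from_well_cm, dye_base) pairs.
    """
    ordered = sorted(bands, key=lambda band: band[0], reverse=True)
    return "".join(dye for _, dye in ordered)


# Strands of length 20..24 whose terminal dyes spell G, A, T, T, C.
bands = [(migration_distance(n), dye) for n, dye in zip(range(20, 25), "GATTC")]
bands.reverse()  # the input order does not matter
print(read_gel(bands))  # GATTC
```

Reading from the largest distance to the smallest is the same as reading from the shortest strand to the longest, which is the 5′ to 3′ order described above.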

In practice the whole process is automated; the bands are scanned with a laser as they pass a specific point in the gel. These scans produce profiles for each nucleotide, as shown in the lower portion of Figure 3. A computer program then determines the DNA sequence from these colored profiles, as shown in the upper portion of Figure 3. A single automated DNA sequencing instrument can determine more than 100,000 nucleotides of DNA sequence per day and a large sequencing facility can often produce over 10 million nucleotides of sequence per day. This high sequencing capacity has made it feasible to determine the complete DNA sequence of large genomes including the human genome.
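The step from scanned profiles to a sequence can be illustrated with a deliberately simple base caller. This is a sketch under stated assumptions, not the algorithm used by any real instrument: the trace format (one list of intensities per dye, sampled at the same scan points), the `min_signal` threshold, and the peak rule are all hypothetical.

```python
def call_bases(traces, min_signal=0.2):
    """Call one base per peak from four dye-intensity profiles.

    traces: dict mapping "A", "C", "G", "T" to equal-length lists of
    intensities recorded as the bands pass the laser.
    """
    bases = "ACGT"
    n = len(traces["A"])
    # Strongest dye channel, and its intensity, at every scan point.
    top = [max(bases, key=lambda b: traces[b][i]) for i in range(n)]
    height = [traces[top[i]][i] for i in range(n)]

    calls = []
    for i in range(1, n - 1):
        # A band shows up as a local maximum in the strongest channel.
        is_peak = height[i] > height[i - 1] and height[i] >= height[i + 1]
        if is_peak and height[i] >= min_signal:
            calls.append(top[i])
    return "".join(calls)


example = {
    "G": [0, 0.5, 1.0, 0.5, 0, 0, 0, 0, 0, 0, 0],
    "A": [0, 0, 0, 0, 0.5, 1.0, 0.5, 0, 0, 0, 0],
    "T": [0, 0, 0, 0, 0, 0, 0, 0.5, 1.0, 0.5, 0],
    "C": [0] * 11,
}
print(call_bases(example))  # GAT
```

Earlier scan points correspond to shorter strands, so the calls come out in 5′ to 3′ order. Real traces are noisy and have overlapping peaks, which is why production base callers are considerably more elaborate.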

see also DNA; Electrophoresis; Human Genome Project; Separation and Purification of Biomolecules

Clifford Brunk

DNA sequencing

views updated May 29 2018

DNA sequencing (gene sequencing) The process of elucidating the nucleotide sequence of a DNA fragment. Two techniques are used. The Maxam–Gilbert method (named after Allan Maxam and Walter Gilbert) involves cleaving the DNA with a restriction enzyme and labelling each of the resulting smaller fragments with 32P-phosphate at one end. The fragments are subjected to four different sets of reactions, each set specifically cleaving DNA at a particular base or bases. The cleaved fragments are separated by electrophoresis according to their chain length and identified by autoradiography. The base (nucleotide) sequence is deduced from the position of bands in each of the four lanes in the gel. The Sanger method (named after Frederick Sanger), also called the dideoxy method, involves synthesizing a new DNA strand using as template single-stranded DNA from the gene being sequenced. Synthesis of the new strand can be stopped at any of the four bases by adding the corresponding dideoxy (dd) derivative of the deoxyribonucleoside phosphates; for example, by adding ddATP the synthesis terminates at an adenosine; by adding ddGTP it terminates at a guanosine, etc. As in the first method, the fragments, which comprise radiolabelled nucleotides, are finally subjected to electrophoresis and autoradiography. A big advantage of the Sanger method is that it can easily be adapted to sequencing RNA, by making single-stranded DNA from the RNA template using the enzyme reverse transcriptase. This enables, for example, sequencing of ribosomal RNA for use in molecular systematics. Furthermore, by using fluorescent dyes as labels instead of radioisotopes, the Sanger method has been fully automated. After separation of the fragments, the products of all four reactions are detected by fluorescence spectroscopy and analysed by computer, which gives a printout of the base sequence. 
DNA sequencing is now employed on a major scale, for example in determining the nucleotide sequence of entire genomes (see Human Genome Project).
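The four-reaction form of the Sanger method described in this entry can be sketched as follows. The function names, and the idealization that each reaction yields every possible product length, are assumptions for illustration only.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def dideoxy_lane(template_3_to_5, primer, dd_base):
    """Lengths of the products when only dd_base is supplied as a terminator.

    Idealized: synthesis can stop at every position where the new
    strand contains dd_base, and nowhere else.
    """
    new_strand = primer + "".join(PAIR[b] for b in template_3_to_5[len(primer):])
    return {
        i + 1
        for i in range(len(primer), len(new_strand))
        if new_strand[i] == dd_base
    }


def read_four_lanes(lanes):
    """Deduce the sequence from band positions in the four lanes.

    lanes: dict mapping each base to the set of fragment lengths in its lane.
    """
    length_to_base = {}
    for base, lengths in lanes.items():
        for n in lengths:
            length_to_base[n] = base
    # Shortest fragment first: this is the 5'->3' order of the new strand.
    return "".join(length_to_base[n] for n in sorted(length_to_base))


template, primer = "TACGGCAT", "ATG"
lanes = {base: dideoxy_lane(template, primer, base) for base in "ACGT"}
print(lanes["C"])              # {4, 5}
print(read_four_lanes(lanes))  # CCGTA
```

Each lane reveals only the positions of one base; combining the four lanes by fragment length recovers the whole sequence, just as the bands are read across the four lanes of the gel.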