Cloning Genes
Gene cloning, or molecular cloning, has several different meanings to a molecular biologist. A clone is an exact copy, or replica, of something. In the literal sense, cloning a gene means to make many exact copies of a segment of a DNA molecule that encodes a gene. This is in marked contrast to cloning an entire organism—regenerating a genetically identical copy of the organism—which is technically much more difficult (with animals) and can involve ethical ramifications not associated with gene cloning. Molecular biologists exploit the replicative ability of cultured cells to clone genes.
Purposes of Gene Cloning
To study genes in the laboratory, it is necessary to have many copies on hand to use as samples for different experiments. Such experiments include Southern or Northern blots, in which genes labeled with radioactive or fluorescent chemicals are used as probes for detecting specific genes that may be present in complex mixtures of DNA.
Cloned genes also make it easier to study the proteins they encode. Because the genetic code of bacteria is essentially identical to that of eukaryotes, a cloned animal or plant gene that has been introduced into a bacterium can often direct the bacterium to produce its protein product, which can then be purified and used for biochemical experimentation. Cloned genes can also be used for DNA sequencing, which is the determination of the precise order of all the base pairs in the gene. All of these applications require many copies of the DNA molecule that is being studied.
Gene cloning also enables scientists to manipulate and study genes in isolation from the organism they came from. This allows researchers to conduct many experiments that would be impossible without cloned genes. For research on humans, this is clearly a major advantage, as direct experimentation on humans has many technical, financial, and ethical limitations.
Cloning Techniques
Cloning genes is now a technically straightforward process. Usually, cloning uses recombinant DNA techniques, which were developed in the early 1970s by Paul Berg, of Stanford University, and, independently, by Stanley Cohen and Herbert Boyer, of Stanford and the University of California, San Francisco, respectively. These researchers devised methods for excising genes from DNA at precise positions using restriction enzymes, and then using the enzyme DNA ligase to splice the resulting gene-containing fragment into a plasmid vector.
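The cut-and-splice idea can be sketched in a few lines of Python that treat DNA as plain text. EcoRI and its recognition site GAATTC (cut after the G) are real, but the sequences and the function names `digest` and `ligate_into_plasmid` are hypothetical. Only one strand is modeled, so the pairing of sticky ends is implied rather than simulated.

```python
from typing import List

ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1    # EcoRI cuts between G and AATTC on the strand shown


def digest(dna: str, site: str = ECORI_SITE, offset: int = ECORI_CUT_OFFSET) -> List[str]:
    """Cut a linear DNA string at every occurrence of the recognition site."""
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments


def ligate_into_plasmid(plasmid: str, insert: str,
                        site: str = ECORI_SITE, offset: int = ECORI_CUT_OFFSET) -> str:
    """Open a plasmid (written as a linear string) at its single site and splice in the insert."""
    if plasmid.count(site) != 1:
        raise ValueError("this sketch assumes exactly one recognition site in the plasmid")
    cut = plasmid.find(site) + offset
    return plasmid[:cut] + insert + plasmid[cut:]


# Hypothetical source DNA: a short "gene" flanked by two EcoRI sites.
source_dna = "CCCGAATTCATGAAATAGGAATTCCCC"
gene_fragment = digest(source_dna)[1]          # the middle fragment carries the gene
recombinant = ligate_into_plasmid("TTTTGAATTCAAAA", gene_fragment)
```

Because the fragment begins with AATTC and ends with G, splicing it into the opened plasmid regenerates an EcoRI site on each side of the insert, which is why the same enzyme can later cut the gene back out.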
Plasmids are small, circular DNA molecules that occur naturally in many species of bacteria. The plasmids naturally replicate and are passed on to future generations of bacterial cells. To replicate, all plasmids must contain a sequence, called an origin of replication, which directs the bacterial DNA polymerase to replicate the DNA molecule. In addition, recombinant plasmids contain one or more selectable markers. A selectable marker is a gene that confers on the bacterium harboring the plasmid the ability to survive under conditions in which bacteria lacking the plasmid would otherwise die. Usually, such genes encode enzymes that enable the bacterium to live and grow despite the presence of an antibiotic drug.
The recombinant plasmid is then introduced into a host cell, such as an Escherichia coli bacterium, by a process called transformation, and the cell is allowed to multiply and form a large population of cells. Each of these cells harbors many identical copies of the recombinant plasmid. The cells are then cultured in growth media containing the antibiotic to which the plasmid confers resistance. This ensures that only cells containing the recombinant plasmid will survive and replicate. A researcher then harvests the cells and can extract and purify many copies of the plasmid.
Another method to produce many copies of a DNA molecule, which is even simpler than traditional recombinant cloning methods, is the polymerase chain reaction (PCR). PCR amplifies the DNA in a reaction tube without the need for a plasmid to be grown in bacteria.
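To give a feel for why PCR yields so many copies so quickly, here is a minimal sketch of the arithmetic. It assumes the idealized case in which every cycle doubles the number of target molecules. The function name `pcr_copies` and the `efficiency` parameter are hypothetical, and real reactions fall short of perfect doubling and eventually plateau.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Estimate copy number after a number of PCR cycles.

    efficiency = 1.0 means perfect doubling each cycle (an idealization);
    efficiency = 0.0 means no amplification at all.
    """
    if initial_copies < 0 or cycles < 0:
        raise ValueError("initial_copies and cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# A single template molecule after 30 ideal cycles: about a billion copies.
ideal = pcr_copies(1, 30)
```

Under these assumptions, thirty cycles turn one molecule into 2 to the 30th power copies, which is why no bacterial culture is needed to obtain a usable quantity of DNA.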
Importance for Medicine and Industry
The ability to clone a gene is not only valuable for conducting biological research. Many important pharmaceutical drugs and industrial enzymes are produced from cloned genes. For example, insulin, clotting factors, human growth hormone, cytokines (cell growth stimulants), and several anticancer drugs in use are produced from cloned genes.
Before the advent of gene cloning, these proteins had to be purified from their natural tissue sources, a difficult, expensive, and inefficient process. Using recombinant methods, biomedical companies can prepare these important proteins more easily and inexpensively than they previously could. In addition, in many cases the product is more effective and more highly purified. For example, before the hormone insulin, which many diabetes patients must inject, became available as a recombinant human protein, it was purified from pig and cow pancreases. However, pig and cow insulins have slightly different amino acid sequences from the human hormone. This sometimes led to immune reactions in patients. The recombinant human version of the hormone is identical to the natural human version, so it is much less likely to provoke an immune reaction.
Gene cloning is also used to produce many of the molecular tools used to study genes. Even restriction enzymes, DNA ligase, DNA polymerases, and many of the other enzymes used for recombinant DNA methods are themselves, in most cases, produced from cloned genes, as are enzymes used in many other industrial processes.
Genomic Versus cDNA Clones
A gene can take varying forms, and so can gene clones. The protein-coding regions of most eukaryotic genes are interrupted by noncoding sequences called introns, which are ultimately excluded from the mature messenger RNA (mRNA) after the gene is transcribed. In addition to the protein-coding sequences, all genes contain "upstream" and "downstream" regulatory sequences that control when, in which tissues, and under what circumstances the gene is transcribed. A clone containing the entire region of a gene as it exists on the chromosome, including introns and nontranscribed regulatory sequences, is called a genomic clone because it is derived directly from genomic, or chromosomal, DNA.
It is also possible to clone a gene directly from its messenger RNA transcript, from which all introns have been removed. This type of clone, called a complementary DNA or cDNA clone, includes only the protein-coding sequences and upstream and downstream sequences that do not code for amino acids but that may control how the mRNA transcript gets translated to protein.
To prepare cDNA, a researcher starts with mRNA and then makes a complementary single-stranded DNA copy using the enzyme reverse transcriptase. Reverse transcriptase, an enzyme produced by retroviruses, is a DNA polymerase that synthesizes DNA from an RNA template. After the mRNA strand is digested away by another enzyme, called RNase H, DNA polymerase can synthesize a second DNA strand by using the newly made first-strand cDNA as a template.
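The two synthesis steps can be mimicked with simple base-pairing rules. The sketch below is only an illustration: the mRNA sequence and the function names `first_strand_cdna` and `second_strand` are hypothetical, and all strands are written 5' to 3'.

```python
# Base-pairing rules: RNA template to DNA copy, and DNA to DNA.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Mimic reverse transcription: DNA complementary to the mRNA, written 5' to 3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def second_strand(first_strand: str) -> str:
    """Mimic second-strand synthesis: DNA complementary to the first strand, 5' to 3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(first_strand))


mrna = "AUGGCCUAA"                     # hypothetical, very short transcript
first = first_strand_cdna(mrna)        # antisense DNA strand
second = second_strand(first)          # sense DNA strand
```

The second strand has the same sequence as the original mRNA, with T in place of U, which is why a cDNA clone can be read directly as the coding sequence.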
Because cDNAs lack introns, the protein-coding region in a cDNA molecule is contained in a single uninterrupted sequence, called an open reading frame, or ORF. This makes cDNA clones extremely useful for predicting the amino acid sequence of the protein that a gene encodes. It also makes it possible to direct protein synthesis from a eukaryotic cDNA clone in a bacterium, which cannot splice introns. With introns still present in a cloned gene, the bacterium will misinterpret the intron sequences as protein-encoding sequences. The resulting incorrect messenger RNA will encode a protein with an incorrect amino acid sequence.
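A short sketch can show both points: reading an ORF from a cDNA, and what goes wrong when an intron is left in place. The codon table below is the standard genetic code. The exon and intron sequences and the function name `translate` are made up for illustration, and real translation involves signals that this sketch ignores.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate from the first ATG up to the first in-frame stop codon (or the end)."""
    start = dna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":          # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical gene with two exons and one intron.
exon1, intron, exon2 = "ATGGCC", "GTAAGTTGAACAG", "AAAGGGTAA"
cdna = exon1 + exon2                   # intron removed, as in a cDNA clone
genomic = exon1 + intron + exon2       # intron retained, as in a genomic clone

protein_from_cdna = translate(cdna)
protein_from_genomic = translate(genomic)
```

In this example the cDNA gives the intended four-residue peptide, while the unspliced sequence is read straight through the intron, producing different amino acids and stopping early at a stop codon that happens to lie inside the intron.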
"Gene Cloning" Usually Means "Gene Identification"
When researchers report in a scientific journal that they have "cloned a gene," they are not referring to the rather mundane process of amplifying copies of a DNA molecule. What they are really talking about is the molecular identification of a previously unknown gene, and determination of its precise position on a chromosome. There are many different methods that can be used to identify a gene. Two of the most common approaches are discussed below.
A gene can be defined in several ways. In fact, the concept of the gene is undergoing a re-evaluation as scientists are analyzing the complete genomes of more and more organisms and finding that many sequences encode more than one protein product. Gregor Mendel identified genes—for example, he identified the factor that made peas either yellow or green—long before he or anyone else knew that genes were encoded on segments of the DNA that made up chromosomes. Studying genetics in the fruit fly, Drosophila melanogaster, Morgan and Sturtevant demonstrated that genes are entities that reside at measurable locations, or loci, on chromosomes, although they did not yet understand the biochemical nature of genes.
Modern geneticists often use the same methods as Mendel and Morgan to identify genes by physical traits, or phenotypes, that mutations in them can cause in an organism. But today we can go even further. Using a broad range of molecular biology techniques, including gene cloning, researchers can now determine the precise DNA coding sequence that corresponds to a particular phenotype. This capability is tremendously powerful, because discovering the gene responsible for a trait can help humankind understand the cellular and biochemical processes underlying the trait. For example, geneticists have learned a great deal about the basis of cancer by identifying genes that, when mutated, contribute to cancer. By studying these genes, researchers now know that many of them control when cells divide (e.g., proto-oncogenes and tumor suppressor genes) or when they die (e.g., the apoptosis genes). Under some circumstances, when such genes are damaged by mutation, cells divide when they shouldn't, or don't die when they should, leading to cancer.
Positional Cloning
Positional cloning starts with the classical methods of genetically mapping a particular phenotype to a region of a chromosome, developed in the early twentieth century by Thomas Hunt Morgan, Alfred Sturtevant, and their colleagues. A detailed discussion of genetic mapping is beyond the scope of this section, but, in general, it is based on conducting genetic crosses between individuals with two different mutant traits and analyzing how often the traits occur together in the progeny of subsequent generations.
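The core calculation behind this kind of mapping can be sketched briefly. The example assumes the usual convention that one map unit (centimorgan) corresponds to 1 percent recombinant progeny, an approximation that holds best for closely linked genes. The progeny counts and the function name `map_distance_cm` are hypothetical.

```python
def map_distance_cm(recombinant_progeny: int, total_progeny: int) -> float:
    """Estimate genetic distance in map units as the percentage of recombinant progeny."""
    if total_progeny <= 0:
        raise ValueError("total_progeny must be positive")
    if not 0 <= recombinant_progeny <= total_progeny:
        raise ValueError("recombinant_progeny must be between 0 and total_progeny")
    return 100.0 * recombinant_progeny / total_progeny


# Hypothetical cross: 170 of 1,000 progeny show a new combination of the two traits.
distance = map_distance_cm(170, 1000)
```

The less often two traits are separated in the progeny, the smaller the estimated distance, and the closer together the two genes are placed on the genetic map.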
Genetic mapping provides a general idea of where a gene is located on a particular chromosome, but it does not identify the precise DNA sequence that encodes the gene. The next step is to locate the gene on what is called the physical map of the chromosome. A physical map is a high-resolution map of all the DNA sequences that make up a chromosome. One type of physical map is a restriction map, which depicts the order of DNA fragments produced when a large DNA molecule is cut with restriction endonucleases (restriction enzymes).
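As a small illustration of what a restriction map records, the sketch below converts a list of cut positions on a linear DNA molecule into the ordered fragment sizes that a digest would produce. The molecule length, the cut positions, and the function name `fragment_lengths` are hypothetical.

```python
from typing import List


def fragment_lengths(total_length: int, cut_positions: List[int]) -> List[int]:
    """Return fragment sizes, in map order, for a linear molecule cut at the given positions."""
    cuts = sorted(set(cut_positions))
    if any(cut <= 0 or cut >= total_length for cut in cuts):
        raise ValueError("cut positions must lie strictly inside the molecule")
    boundaries = [0] + cuts + [total_length]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


# Hypothetical 10,000 base pair molecule with three restriction sites.
sizes = fragment_lengths(10_000, [2_000, 4_500, 9_000])
```

Building a real restriction map runs in the opposite direction: the fragment sizes are measured, and their order along the chromosome has to be worked out from overlapping digests.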
Restriction maps have been made for the complete genomes of several model genetic organisms, such as the fruit fly (Drosophila melanogaster) and the roundworm (Caenorhabditis elegans). For these organisms, individual large DNA fragments (on the order of forty to one hundred thousand base pairs from the whole genome) have been cloned in bacterial plasmid vectors to make a "library" of the genome. Each fragment is mapped to a known position, but the identity of the gene or genes it contains is originally unknown. To identify the genes, a cloned fragment is introduced into a mutant fly or roundworm.
To pinpoint the location of a particular gene, a researcher selects one or several plasmid clones from the physical map that lie in the general vicinity of the region on the genetic map where the gene is thought to be, and introduces them into a mutant that is defective in the gene of interest. If the introduced DNA corrects the mutant's defect, that DNA probably contains a normal copy of the defective gene. But these large clones usually contain several genes. By further "trimming" the DNA into smaller subfragments and testing the ability of each subfragment to rescue mutants, the researcher can eventually home in on the gene. As further confirmation that this gene is the cause of the mutant phenotype, the researcher can isolate the corresponding gene from the mutant and determine its DNA sequence to see if it contains a mutation (a DNA sequence alteration) relative to the normal gene sequence.
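The final comparison step can be sketched as a position-by-position check of the two sequences. This is a deliberately simple illustration that detects only base substitutions in sequences of equal length; insertions and deletions would require a proper alignment. The sequences and the function name `find_substitutions` are hypothetical.

```python
from typing import List, Tuple


def find_substitutions(normal: str, mutant: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, normal base, mutant base) wherever the sequences differ."""
    if len(normal) != len(mutant):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (index + 1, normal_base, mutant_base)
        for index, (normal_base, mutant_base) in enumerate(zip(normal, mutant))
        if normal_base != mutant_base
    ]


normal_gene = "ATGGCCAAAGGGTAA"        # hypothetical normal sequence
mutant_gene = "ATGGCCTAAGGGTAA"        # hypothetical sequence from the mutant
differences = find_substitutions(normal_gene, mutant_gene)
```

Here the single change at position 7 turns the codon AAA into the stop codon TAA, the kind of alteration that would plausibly account for a loss-of-function phenotype.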
Expression Cloning
In some cases, a researcher becomes interested in studying a gene not because mutations in it cause an interesting phenotype but because the protein it encodes has interesting properties. A prominent example is beta-amyloid protein, which accumulates in the brains of Alzheimer's disease patients.
Expression cloning is a method of isolating a gene by looking for the protein it encodes. If the protein of interest is an enzyme, it can be found by testing for its biochemical activity. A very common method for identifying a particular protein is by using antibodies, or immunoglobulins, that bind specifically to that protein. Expression cloning usually uses a cDNA library, in which protein-coding sequences are uninterrupted by introns. Each cDNA is inserted into an "expression vector," which contains all the necessary signals for the DNA to be transcribed into mRNA. The mRNA can then be translated into protein. Thus the host cell harboring the clone will produce the gene's protein product, and the protein can then be detected by biochemical or immunologic methods. Once the cell making the protein is found, the cDNA can be re-isolated and the gene sequenced by standard means.
Gene cloning techniques continue to advance rapidly, aided by the Human Genome Project and bioinformatics. It is likely that positional cloning will take on a secondary role, and that bioinformatics and proteomics methods will begin to contribute more, as more progress in these fields is made.
see also Bioinformatics; Blotting; Chromosomes, Artificial; Cloning Organisms; Cloning: Ethical Issues; DNA Libraries; Gene; Gene Discovery; Human Genome Project; Linkage and Recombination; Marker Systems; Morgan, Thomas Hunt; Plasmid; Polymerase Chain Reaction; Recombinant DNA; Restriction Enzymes; Reverse Transcriptase; RNA Processing; Sequencing DNA; Transformation.
Paul J. Muhlrad