Genomics is the study of the genome, or deoxyribonucleic acid (DNA), of an organism and associated technologies. Genomics evolved from a series of experimental and conceptual advances that allowed researchers to decipher the DNA sequences of whole genomes from virtually any organism, including humans. Single experiments can scrutinize many genes and compare genomes of different species.
Before the development of genomics, DNA studies in humans were mostly relegated to observing highly condensed chromosomes under a microscope. Many human studies were done in medical centers where families with visible chromosomal mutations came for evaluation. While it was possible in a few cases to determine on which chromosome a gene was located or whether two genes were located on the same chromosome, without molecular techniques these were difficult tasks at best.
Tools of Genomics
Genomics developed from advances in recombinant DNA technology, which, in turn, developed from earlier progress in biochemistry and genetics. Recombinant DNA technology is the set of tools that make it possible for researchers to study and manipulate DNA, ribonucleic acid (RNA), and protein from any source, both outside of the cells (in vitro) and inside of the cells (in vivo) of the well-studied model organisms.
Relatively few techniques are used to study DNA. The basic methods that underlie genomic technologies include DNA sequencing, polymerase chain reaction (PCR), electrophoresis, cloning, and hybridization. The fact that all DNA molecules can be manipulated using a few basic techniques is a major advantage of working with DNA. In contrast, proteins are much more difficult to manipulate, and each must be approached individually.
DNA sequencing determines the order of bases in a segment of DNA, a gene, a chromosome, or an entire genome. PCR can increase the number of copies (even a millionfold) of a single gene or fragment of DNA in vitro within hours. Electrophoresis separates DNA by size in the presence of an electrical field. This is a simple technique used to follow changes in DNA size through different recombinant manipulations.
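The millionfold amplification mentioned above follows from simple doubling arithmetic. The sketch below is an illustration only: it assumes an idealized reaction in which each PCR cycle multiplies the number of copies by a fixed factor (perfect doubling by default), and the function names and the `efficiency` parameter are hypothetical, not taken from the text.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies after `cycles` rounds of PCR.

    efficiency=1.0 is the idealized case in which every cycle exactly
    doubles the DNA; real reactions fall short of this (assumption).
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_for_fold(fold, efficiency=1.0):
    """Smallest whole number of cycles that reaches at least `fold` amplification."""
    cycles = 0
    copies = 1.0
    while copies < fold:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(cycles_for_fold(1_000_000))  # cycles needed for a millionfold increase
    print(pcr_copies(1, 20))           # copies from one molecule after 20 cycles
```

Under the perfect-doubling assumption, twenty cycles are enough to pass the millionfold mark, which is why the amplification can be completed within hours.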
The process of isolating a piece of DNA for recombinant DNA studies is called cloning. Cloning increases the number of copies of a single gene or fragment of DNA in vivo. Large amounts of DNA are needed for sequencing and manipulation experiments. The purified, unknown DNA is combined with another well-characterized piece of DNA called a cloning vector. The vector DNA has the DNA sequences needed to form an artificial minichromosome. A cell that contains the cloned DNA is called a clone. The clone is used to produce more copies of the DNA of interest and to produce protein encoded by the DNA. In some experiments, a modified DNA segment is returned to the original organism for further studies.
Cloning can be used to isolate and scrutinize a small part of the genome, such as a gene. For example, consider the cloning of an oncogene, which is a gene that, when overexpressed, causes cancer. DNA fragments from human cancer cells were introduced into the cells of a normal mouse. Some of the cells received a piece of human DNA that caused them to develop into cancerous cells. Recombinant DNA techniques were used to identify which piece of human DNA was responsible for converting the normal cells to cancerous cells. DNA methods enabled researchers to isolate the specific human gene that causes the cancer.
Hybridization, which is also known as renaturation or annealing, is the coming together of two complementary, single strands of DNA to form double-stranded DNA. Denaturation is the reverse process, which separates double-stranded DNA into two single strands (see Fig. 1). Double-stranded DNA is denatured when it is incubated at a high temperature. In hybridization experiments, single-stranded DNA of unknown sequence (test DNA) is hybridized to single-stranded DNA of known sequence (probe DNA). Hybridization takes place only when the test DNA contains a complementary DNA sequence.
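The complementarity rule behind hybridization can be sketched in a few lines of Python. This is a simplified model, not a description of laboratory practice: it assumes both strands are written 5' to 3', requires a perfect match over the full probe length, and ignores temperature and salt conditions. The names `reverse_complement` and `hybridizes` are illustrative.

```python
# Watson-Crick base pairing: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the strand that would pair with `seq`, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def hybridizes(test_dna, probe):
    """True if single-stranded `test_dna` contains a site that is
    perfectly complementary to `probe` (idealized assumption)."""
    return reverse_complement(probe) in test_dna


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(hybridizes("TTGCATCC", "ATGC"))
    print(hybridizes("TTTTTTTT", "ATGC"))
```

In this toy model the probe binds the first test strand, which contains the complementary site, and fails to bind the second, which does not.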
Hybridization experiments can be used to sequence the test DNA a "word" at a time. A word is equal to the length of the probe sequence, which is usually greater than eight bases. This technique makes it possible to obtain information more rapidly than from ordinary sequencing methods.
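The idea of reading a sequence one "word" at a time can be illustrated with a toy reconstruction. The sketch below makes strong simplifying assumptions that are not in the text: every word of length k in the test DNA is detected exactly once, no overlap of length k-1 is repeated, and words are compared on the same strand rather than through their complements. The word length of 3 is chosen only for readability; as noted above, real probes are usually longer than eight bases. The function names are hypothetical.

```python
def spectrum(seq, k):
    """All words of length k found in `seq` (the set of probes that would bind)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reconstruct(words):
    """Chain words that overlap by k-1 bases back into one sequence.

    Works only for the idealized case of unique, unambiguous overlaps;
    raises ValueError otherwise.
    """
    words = set(words)
    suffixes = {w[1:] for w in words}
    # The first word is the only one whose prefix is not another word's suffix.
    starts = [w for w in words if w[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("ambiguous or circular spectrum")
    current = starts[0]
    result = current
    remaining = words - {current}
    while remaining:
        candidates = [w for w in remaining if w[:-1] == current[1:]]
        if len(candidates) != 1:
            raise ValueError("ambiguous overlap")
        current = candidates[0]
        result += current[-1]  # each new word contributes one new base
        remaining.remove(current)
    return result


if __name__ == "__main__":
    words = spectrum("ATGCGTAC", 3)
    print(sorted(words))
    print(reconstruct(words))
```

Repeated words or overlaps make the reconstruction ambiguous, which is one reason longer probes are preferred.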
DNA hybridization experiments are also used to study the expression of thousands of genes at the same time. In expression studies, messenger RNA (mRNA) from a cell is hybridized to an array (display) of complementary single-stranded DNA probe sequences unique for each test mRNA. Test mRNA hybridizes to DNA probes in the same manner as test DNA. The amount of test mRNA hybridized to each probe is measured to determine the amount of complementary mRNA present. Since the hybridization is done to the entire array simultaneously, information about all test mRNA sequences is obtained in a single experiment. The DNA probes arrayed on a microchip are recorded and the hybridization results are analyzed automatically using computers.
DNA array experiments identify and analyze mRNAs solely on the basis of their sequence. No information is needed about the proteins encoded by the mRNA or their function. However, ultimately, information about the proteins and their functions is the real goal of these experiments. The array experiment is used to identify important mRNAs, which indicate which genes a particular cell is expressing. This may be related to a cell specialization (for example, brain versus heart), or to a disease state. This "whole genome" approach can lead to the discovery of new genes and identify unexpected functions of known genes.
Finding Disease Genes
Positional cloning experiments isolate genes responsible for a specific genetic disease, such as cystic fibrosis, by first identifying their location (position) in the genome (which chromosome they are on). Genetic mapping techniques locate a gene or a DNA sequence among the chromosomes by identifying which regions of the genome are inherited in the same manner as the trait of interest, such as a disease. Once the chromosomal region is found, molecular and computational methods are used to identify all genes in the region and to pick up "candidate" genes for further testing. Some of the candidate genes are identified by determining whether their encoded proteins fit the trait or disease of interest. For example, the isolation and determination of the DNA sequence of the cystic fibrosis gene made it possible to identify its product, a large protein molecule that regulates the transport of chloride across the cell membrane. Researchers were able to explain the clinical symptoms when they discovered that the protein was nonfunctional.
The power of focusing on genetic causes of disease is due to several factors. First, positional cloning is guaranteed to produce results provided enough money is available to do the experiment and large families inheriting the trait of interest are known. It is very rare in research to be sure of success. Second, the identification of a gene responsible for a disease uncovers pathways and genes that are unknown and may play a role in noninherited forms of the same or a similar disease.
The first stage of genomic research is coming to an end as the entire human DNA sequence is known. An understanding of that blueprint will likely take many more years of research. That understanding will require sequencing many genomes to determine how variations in DNA sequences affect protein and cell function. Further, each gene must be understood in the context of the entire repertoire of thirty-five thousand human genes. Because of advances in genomics, scientists are no longer forced to study single genes out of context. Consequently, experiments and the information gathered in each experiment are becoming more complex. New computational tools are needed to understand the massive amount of information that is now being generated. This has developed into the field of bioinformatics.
The DNA sequences of any two individuals can differ in at least one million places. In medicine, variations in the DNA sequences will be used to develop individualized drug treatments. This new field, known as pharmacogenomics, is in its infancy, but already DNA profiles are being used to subtype different cancers, enabling physicians to prescribe the drugs most likely to be effective for a particular patient in the course of treatment.
Genomic studies are not confined to humans, but are used to learn about all organisms. The lessons that nature will provide from all these studies will have an impact on fields far beyond biology and medicine. The tools and knowledge needed to accomplish these studies cross the traditional boundaries of scientific disciplines. The solutions to future scientific problems will require an immense amount of collaboration and will need to take advantage of the talents and knowledge of a large number of individuals.
see also Bioinformatics; Clone; DNA Sequencing; Human Genome Project; Hybridization; Model Organisms: Cell Biology and Genetics; Oncogenes and Cancer Cells; Polymerase Chain Reaction; Recombinant DNA
Cassandra L. Smith and Linda G. Tolstoi
Genomics is a recent scientific discipline that strives to define and characterize the complete genetic makeup of an organism. Its primary approaches are to determine the entire sequence and structure of an organism's DNA (its genome) and then to determine how that DNA is arranged into genes. This second goal is accomplished by determining the structure and relative abundance of all messenger RNAs (mRNAs), the middlemen in genetics that encode individual proteins.
From Microorganisms to Human DNA
For many years, genomics has been focused on microorganisms, which have relatively small genomes. However, more recently the field has been energized by the advent of more industrialized, higher-throughput sequencing technologies. By 2001 more than seventy organisms had been completely sequenced, and a working draft of the human genome had been produced. Vigorous efforts have now been initiated to map the mouse genome, and one company already claims to have completed the sequence. From the description of the structure of the genetic material by James Watson and Francis Crick in 1953, it will have taken only about fifty years to determine the complete genetic codes of humans and most of the model organisms that are important in biological research.
|Latin Name|Common Name|Genome Size|
|Eukaryotes (haploid genome)|
|Oryza sativa|Rice|420,000 Kb|
|Homo sapiens|Human|3,200,000 Kb|
|Arabidopsis thaliana|Mustard cress|115,428 Kb|
|Drosophila melanogaster|Fruit fly|137,000 Kb|
|Caenorhabditis elegans|Roundworm|97,000 Kb|
|Saccharomyces cerevisiae|Yeast|12,069 Kb|
|Haemophilus influenzae|-|1,830 Kb|
|Escherichia coli|Human colon bacterium|4,639 Kb|
|Helicobacter pylori|Stomach ulcer bacterium|1,667 Kb|
|Yersinia pestis|Plague|4,653 Kb|
|Halobacterium|Salt-tolerant archaean|2,014 Kb|
|Methanobacterium thermoautotrophicum|Methane-producing archaean|1,751 Kb|
Kb = one thousand base pairs
Of what value is the knowledge of these genomes? How are they being used within the scientific community? The first fully sequenced genomes included the fruit fly, a worm, and a number of bacteria and yeast. One of the first analyses performed was simply to compare the sequences between organisms, in order to identify what is shared and what is different. This allows a very specific comparison of organisms that will help refine phylogenetic relationships. This kind of information is also very valuable for asking questions about how organisms have evolved, how they adapt to different circumstances, and what gene products contribute to their survival in various environmental conditions.
Genomics has brought us to the threshold of a new era in controlling infectious diseases. These studies will likely lead to the development of new disease prevention and treatment strategies for plants, animals, and humans alike. For instance, understanding pathogen genes, their expression, and their interaction will lead to new antibiotics, antiviral agents, and "designer" immunizations. These new DNA-based immunizations are by-products of genomic research and will undoubtedly eventually replace the traditional vaccines made from whole, inactivated microorganisms. This is highly relevant to domesticated animals, where viruses still kill billions of dollars worth of livestock every year.
Understanding the genomes of plants and animals has additional benefits. Gene mapping should allow us to understand the basis for disease resistance, disease susceptibility, weight gain, and determinants of nutritional value. The use of genomic information provides the opportunity to select optimal environments for the healthy growth of plants and animals, to develop disease-resistant strains, and to achieve improved nutritional value such as with the "golden" rice. Success in these species may well provide important insights needed to improve the health of humans.
The Human Genome Project and Future Research
The Human Genome Project reached a major milestone in 2001, with two separate publications of working drafts of the human genome. Although much knowledge has been generated, the sequence is not complete. Neither the actual number of genes nor all their structures have been determined. However, several major lessons have been learned. First, the number of genes is estimated to be between 30,000 and 70,000, fewer than previously thought. In addition, it is clear that a very large proportion of our genes are highly similar to those in other organisms, such as the fruit fly and the microscopic worm, C. elegans. The observation that we can build humans with between 30,000 and 70,000 genes and a fruit fly with 15,000 genes suggests that we owe much of the complexity of humans to the fine regulation of genes and not their absolute number.
Genomics has also forced biologists to begin to look at the function of genes in an industrialized mode. This new field of functional genomics takes advantage of a number of new technologies. Since many fly and worm genes are so similar to human genes (homologs), these animals can be used as model systems to study gene function. In these model systems it is possible to mutate (or alter) the structure of every single gene, enabling researchers to determine each gene's function and how several of the genes interact in complex metabolic pathways. Similar efforts using systematic gene mutations are also underway to create DNA "libraries" of two vertebrates, mice and zebrafish, whose genes are surprisingly similar to humans. Once these genomes are fully sequenced and characterized, it will be possible to create animals with disorders that are more precisely like those of humans, allowing for a better understanding of complex diseases and determination of novel and effective therapies.
Genomics allows for the comparison of sequences between individuals, too. These studies can be used as a basis for the understanding and diagnosis of disease, especially of the complex disorders not governed by single genes. Knowledge of the entire human sequence is also the basis of the fields of pharmacogenetics and pharmacogenomics. Pharmacogenomics seeks a broader understanding of how genes influence drug response and toxicity, and the discovery of new disease pathways that can be targeted with tailor-made drugs. Pharmacogenetics is the study of the genetic factors involved in the differential response between patients to the same medicine. Polymorphisms, nucleotide changes that occur in more than 1 percent of the population, are the basis for our individuality but also account for our differential susceptibility to disease and the variable outcome of treatments. Through a variety of research efforts, more than one million polymorphisms have been identified in the human genome. The study of these variants, which occur once every 500 to 1,000 nucleotides in the human genome, should enable pharmacogenetics to define the optimal treatment regimens for subsets of the population, allowing a wider range of patients to be treated and more effective outcomes to be produced with any given drug.
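The figures in this paragraph can be checked with a short calculation. The sketch below is a back-of-the-envelope illustration, not a statistical model: it takes the human genome size of 3,200,000 Kb from the table earlier in this article, assumes variants are spread evenly at one per 500 to 1,000 nucleotides, and applies the 1 percent frequency cutoff as a simple threshold. All names are hypothetical.

```python
# Human haploid genome size from the table above: 3,200,000 Kb.
GENOME_SIZE_BP = 3_200_000 * 1_000


def expected_variant_sites(genome_bp, spacing_bp):
    """Number of variant sites if one occurs every `spacing_bp` nucleotides
    (assumes perfectly even spacing, which is a simplification)."""
    return genome_bp // spacing_bp


def is_polymorphism(allele_frequency, threshold=0.01):
    """A nucleotide change counts as a polymorphism when it occurs in more
    than 1 percent of the population."""
    return allele_frequency > threshold


if __name__ == "__main__":
    print(expected_variant_sites(GENOME_SIZE_BP, 1_000))  # sparse end of the range
    print(expected_variant_sites(GENOME_SIZE_BP, 500))    # dense end of the range
    print(is_polymorphism(0.05), is_polymorphism(0.001))
```

Under these assumptions the genome would hold roughly three to six million variant sites, which is consistent with more than one million having already been identified.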
see also Agricultural Biotechnology; DNA Libraries; Genomic Medicine; Genomics Industry; High-throughput Screening; Human Genome Project; Model Organisms; Pharmacogenetics and Pharmacogenomics.
Kenneth W. Culver and Mark A. Labow
The term genomics was coined in 1987 by Victor A. McKusick and Frank H. Ruddle as the title for a new journal of that name. McKusick and Ruddle derived it from genome, a concept that had been circulating in biology since the early 1920s. The roots of genome are the Greek genos (class, kind, race) and the suffix -ome (as used in rhizome and chromosome). Genome is defined as the entire sequence of DNA found in the nucleus of every cell.
In the 1980s and 1990s, genomics primarily referred to large-scale projects to map the genes and sequence the DNA of organisms. As it turned out, a bacteriophage called phi-x174 was the first organism whose complete DNA sequence was revealed. The sequencing of its 5,375 nucleotides was accomplished in 1977 by Frederick Sanger and his colleagues at the University of Cambridge. In the ensuing two decades, a series of further viral genome-sequencing projects were undertaken. In July 1995 Robert David Fleischmann reported the completion of the sequencing of the first genome of a nonviral organism (H. influenzae).
Encouraged by the chancellor of the University of California, Santa Cruz, Robert Sinsheimer, and the U.S. Office of Health and Environmental Research, biologists began in the mid-1980s to evaluate the possibility of mapping the genes and sequencing the DNA of the human genome. As part of its mission to assess the health effects of radiation, the U.S. Department of Energy established in 1987 three human genome research centers at Los Alamos, New Mexico; Livermore, California; and the Lawrence Berkeley National Laboratory in Berkeley, California. The initiative to first map and then sequence the complete human genome was formally launched in 1990 as a joint program of the Department of Energy and the National Institutes of Health. In France, the Centre d’Étude du Polymorphisme Humain conducted a successful gene-mapping project funded by the Muscular Dystrophy Association. In the United Kingdom, the Wellcome Trust supported human genome research. Germany and Japan soon joined the international efforts in what was called a “race” to map the genes and sequence the 3.1 billion base pairs of the human genome.
In the wake of a successful initial mapping of the human genome, the Human Genome Organization decided by the mid-1990s to decode the DNA of model organisms before sequencing the human genome. In 1997 the complete DNA sequence of the yeast genome was published; a year later the ninety-seven million base pairs of the worm C. elegans followed; in early 2000 an advanced draft of the genome of the fruit fly Drosophila was announced.
Using different approaches, the Human Genome Sequencing Consortium and a team led by Craig Venter at Celera Genomics separately published in February 2001 their preliminary findings, estimating the number of genes in the human genome at 30,000 to 40,000. A later reanalysis reduced the number to approximately 20,000 to 25,000. The completion of the project in April 2003 has led to the identification of millions of sites on the genome where individuals differ.
The challenges in completing gene-mapping and DNA-sequencing projects were primarily technical, organizational, and financial rather than scientific. The problem scientists face today is how to use genomic information to gain biological understanding. Accordingly, genomics has increasingly given way to postgenomic studies focusing on the functions of genes and the complex interactions between cells, systems of cells, multicellular organisms, populations of organisms, and their environment. The three terms functional genomics (the study of genetic function), proteomics (the study of the proteins expressed by a genome), and transcriptomics (the study of RNA transcripts) indicate that the epistemic status of the genome has shifted from an object of analysis to a tool of research. In the emerging world of postgenomics, sequences are used as giant reference tools.
The social relevance of genomics lies primarily in the agricultural and biomedical utilization of genetic information. Knowledge gained from genetic and genomic research has enabled biomedicine to envision the organism at a molecular scale. New diagnostic tests based on a molecular understanding of life reveal susceptibility to a broadening range of diseases. The concept of genetic risk factors has led to a redrawing of the line between the normal and the pathological. Additionally, various patient groups have formed around specific diseases, inflecting new styles of collective thought, action, and passion that entail a redefinition of both the biological and the social. As a consequence, a new moral landscape has emerged that contrasts with the ethical discourse of the public sphere.
The challenge of contemporary molecular biology is to proceed from the generation of genomic information to the assessment of hypothetical propositions in experimental settings. For social scientists, it will be of paramount importance to continue observing and analyzing the unexpected emergence of objects and the unpredictable reconfiguration of forms as they assemble into an ever-shifting understanding of life.
JULI BERWALD
Genomics is the study of genes and their function in relation to the environment. In contrast to genetics, which focuses on genes and inheritance, the goal of genomics is to understand genes, their products and how, when, and why these products are synthesized.
The genome of every organism is the collection of the genetic information contained in the DNA (deoxyribonucleic acid). DNA is a molecule consisting of long strands of four different molecules called nucleotides: adenine, cytosine, guanine and thymine or A, C, G and T, as they appear in published sequences. The strands of DNA are paired so that A on one strand always corresponds to T on the opposite strand and similarly, C always corresponds to G. These paired strands of DNA are further twisted into the conformation of a double helix. A functional unit of DNA is called a gene. In a gene, the sequence of A, C, G, and T on a strand of DNA specifies the sequence of amino acids that make up a protein. In order for a specific protein to be synthesized, the DNA in a gene is first transcribed to messenger RNA (ribonucleic acid), which is similar to DNA, but single stranded. The messenger RNA is then translated into a sequence of amino acids. In this process, three nucleotides of DNA, for example CGT, are transcribed into three nucleotides of messenger RNA, in this case GCA, which code for one amino acid, in this case alanine. Proteins and products of proteins are fundamentally responsible for all cellular behavior. Protein function is altered by changes in the sequence of amino acids. Genomics investigates how variations in genes affect protein structure and function throughout the life of a cell.
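The CGT to GCA to alanine example above can be traced step by step in code. This is a minimal sketch under stated assumptions: the DNA passed in is the template strand, it is complemented base by base (so the direction of reading is ignored), and the codon table is deliberately partial, listing only a handful of entries from the standard genetic code. The names `transcribe` and `translate` are illustrative.

```python
# Base pairing used during transcription: RNA carries U where DNA would carry T.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

# Partial codon table (standard genetic code); only a few entries are listed.
CODON_TABLE = {
    "GCA": "Ala", "GCC": "Ala", "GCG": "Ala", "GCU": "Ala",
    "AUG": "Met",
    "UGG": "Trp",
}


def transcribe(template_dna):
    """Build the messenger RNA that pairs with the template DNA strand."""
    return "".join(DNA_TO_RNA[base] for base in template_dna)


def translate(mrna):
    """Read the mRNA three nucleotides at a time; unknown codons become '?'."""
    usable = len(mrna) - len(mrna) % 3  # ignore an incomplete trailing codon
    return [CODON_TABLE.get(mrna[i:i + 3], "?") for i in range(0, usable, 3)]


if __name__ == "__main__":
    mrna = transcribe("CGT")
    print(mrna, translate(mrna))
```

Changing a single base in the DNA changes the codon and may change the amino acid, which is the link between sequence variation and altered protein function described above.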
The field of genomics. Although it is a young and evolving field, genomics generally includes at least three key research areas: bioinformatics, proteomics and structural genomics. Masses of DNA sequence data have accumulated through projects like the Human Genome Project and the Mouse Genome Project, and the genomes of over 40 microorganisms have been sequenced. Not all DNA is made up of genes. In humans, for example, only about 3% of the DNA is actually genes. Some of the remaining, non-coding DNA is used by enzymes as markers indicating the beginning and ends of genes. Some of it, the so-called junk DNA, may not have any function at all. Using statistical tools and data-mining techniques, the field of bioinformatics attempts to identify genes in the DNA and to determine the relationships among genes in different individuals. Although the DNA in organisms is essentially constant throughout their lives, the kinds and amounts of proteins that are synthesized at any instant are subject to much variation. The field of proteomics investigates which proteins are expressed at what stages in an organism's life and exactly how and why these proteins are expressed. Translating a sequence of DNA to its corresponding amino acid sequence is only the beginning of understanding the function of a protein. Many amino acid chains are modified after they are synthesized, and protein structure changes depending on environmental conditions, e.g., heat, pH or association with other molecules. The study of structural genomics attempts to unravel the molecular structures that result from a sequence of DNA.
Applications of genomics. One of the most promising applications of genomics is improving the ability to fight diseases. Many diseases, such as sickle cell anemia, cystic fibrosis and Huntington's disease, are caused by abnormalities in the sequence of DNA that codes for a specific protein or proteins. Genomics will be able to help in both the diagnosis and the treatment of these diseases. It is estimated that only about 500 molecules are actually targeted by drugs currently available. Genomics will hopefully lead to an increase in the number of drug targets used in pharmaceuticals. It may also provide information on the genetic basis for side effects and the effectiveness of treatments that can be used to tailor prescriptions for individuals. Two specific types of gene therapies have been advanced. Somatic cell therapy involves the insertion of therapeutic genes into specific cells in the body. This will hopefully allow those cells to synthesize proteins that they are unable to produce or to turn off genes that are overexpressed. Germ line therapy involves the insertion of normal genes into an egg cell, with the hope that the normal gene will be incorporated into the genome of the offspring and that a genetic disease will not be inherited.
In addition to their importance in medicine, bacteria, viruses and fungi play key roles in agriculture. Because their genomes are small, the genomes of at least 40 species of microorganisms have been sequenced. Understanding the genomics of these organisms has the potential to improve crop yields, decrease damage done by pest species and increase the nutritional value of food. As part of their metabolism, some microorganisms have the ability to break down harmful products and to produce energy as a product. Understanding the gene products involved in these transformations may lead to industrial uses, with the potential for solving different types of environmental problems and providing new energy sources.
Military uses of genomics. Identifying the genes and gene products in the organisms that lead to disease in humans will lead to the development of treatments for these diseases. Characterizing genes responsible for diseases will likely lead to the development of new antibiotics and other drugs used to treat diseases caused by biological warfare. It can also reveal methods for combating drug resistance and preventing the use of this phenomenon by opponents. Genomics should also provide new techniques for identifying biological agents on the battlefield. One of the most promising technologies is the biochip or DNA chip, which is a microarray of molecular probes on a silicon chip that specifically bind to the DNA of biological threats. Once bound, the DNA is then detected using a fluorescent signal. These arrays identify genes that are active in cells, and indicate if a particular immune response is occurring. In the case of a biological attack, this can provide quick, detailed information about the course of the infection to medical personnel.