Proteins are made up of 20 different amino acids, and the extensive differences in their structures (from the fluid haemoglobin of our blood to the tough keratin of our skin) reflect differences in the number and order of amino acids in their constituent peptide chains. The information responsible for selecting and arranging the amino acids in these peptide chains is encoded by the order of the four bases that make up each strand of the DNA helix: adenine (A), guanine (G), cytosine (C), and thymine (T). Each amino acid is represented by a three-letter ‘word’, spelled out by a particular sequence of three bases in the gene, and most amino acids can be spelled by more than one such word. Even more remarkable, the code is the same throughout almost all living organisms.
In recent years it has become feasible to isolate human DNA, cut it into pieces, and insert these into bacterial cells, where they are copied as the cells grow and multiply. In this way, it has been possible to prepare ‘libraries’ containing most of a person's genes (the ‘genome’), to isolate individual genes, and to examine their structure and function.
The human genome
Normal human cells (except eggs and sperm) have 46 chromosomes, arranged in pairs, one member of each pair coming from each parent. Twenty-two pairs are called autosomes and the other pair are the sex chromosomes, designated X and Y. Females have two X chromosomes (one from the mother, the other from the father) while males have one maternal X and one paternal Y sex chromosome. ‘Germ cells’ (sperm and eggs) are unusual in having only 23 unpaired chromosomes, each of which is created by a process in which maternal and paternal chromosome pairs become closely wound round each other, and exchange segments by ‘crossing over’, in the germinal cells that give rise to the eggs and sperm. The closer together a pair of genes are on the same chromosome, the less chance there is that a crossover will fall between them. Hence the number of crossovers between two genes is a measure of the distance between them. Indeed, many years ago it was realized that if two different genes are on the same chromosome, and particularly if they are close together, they will tend to be inherited together. They are then said to be linked.
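The idea that crossover counts measure distance can be sketched in a few lines. The example below assumes the standard convention that 1% recombinant offspring corresponds to one map unit (centimorgan); the function name and the offspring counts are hypothetical, and the linear relationship holds only for genes that are fairly close together.

```python
def map_distance_cm(recombinant_offspring, total_offspring):
    """Estimate the genetic distance between two linked genes.

    The fraction of offspring in which the two genes were separated by a
    crossover is converted to centimorgans (1% recombinants = 1 cM).
    This is only a reasonable approximation for short distances.
    """
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    if not 0 <= recombinant_offspring <= total_offspring:
        raise ValueError("recombinant_offspring must lie between 0 and total")
    return 100 * recombinant_offspring / total_offspring


# Hypothetical family data: 12 recombinants among 200 offspring.
distance = map_distance_cm(12, 200)
print(f"Estimated distance: {distance} cM")
```

Genes that are never separated give a distance of zero, which is the limiting case of complete linkage.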
Through studies of individual families, and later by generating genetic markers by analysing DNA itself, geneticists have been able to obtain a ‘linkage map’ of the human chromosomes, assigning many different genes to particular chromosomes. In essence, this looks like a road map in which the towns (genes) are clearly marked, although it tells us nothing about the state of the roads (DNA) in between. Early in the new millennium, as part of the Human Genome Project, this work will be extended, and the complete sequence of all the bases that make up the 30 000–40 000 genes of the human genome will be worked out.
Mutations and human diversity
Occasionally, during cell division, when new strands of DNA are synthesized, a different base is inserted, by mistake, into a gene. This is called a mutation. Without this slight imperfection in the mechanism of DNA replication we would still be swimming round in the primeval soup. For, while many mutations, or polymorphisms, are neutral (that is, the slight changes in the proteins produced have no effect on the function of the organism), others may be beneficial. Although it is still a topic of some controversy, there is general agreement that this is the way in which Darwinian evolution has occurred; the occurrence of mutations occasionally leads to changes in organisms that enable them to adapt better to their current environments, or to new ones. While this may not be the only mechanism for the gradual emergence of different species, there is compelling evidence that it has been a major force behind human evolution. The existence of polymorphisms, and our new-found ability to analyse DNA with ease, are providing major insights into the origins of human beings, the ways in which different races have evolved, and, indeed, the whole basis of human diversity. As well as the DNA that resides in the nuclei of our cells, our mitochondria, the chemical dynamos that energize the cell, have their own DNA. Since mitochondrial DNA is all derived from our mothers, it is of particular value in tracing our evolutionary past in a direct line.
Harmful mutations
Unfortunately, not all mutations are simply neutral or even beneficial; occasionally they cause disease, because they affect some vital process in the body. Inherited diseases due to single defective genes usually follow Mendelian patterns of inheritance. Some are said to be ‘dominant’ because they occur with the inheritance of a single defective gene on only one of the pair of chromosomes. Others are ‘recessive’: it is necessary to inherit the mutant gene from both parents to have the disease. ‘Carriers’, or ‘heterozygotes’, are individuals who have a recessive disease gene on only one of the pair of chromosomes, and who therefore are not affected by the disease but can pass it on to their children if they mate with a partner who is also a carrier. Yet other diseases are sex-linked; that is, they are carried on the X chromosome and, if recessive, are not expressed in females (who have two X chromosomes) but may be transmitted to a son to whom the mother passes down an X chromosome bearing the mutant gene.
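The recessive case can be made concrete with a small enumeration. The sketch below assumes simple Mendelian segregation, in which each parent passes on either of their two copies of a gene with equal probability; the genotype notation ("A" for the normal gene, "a" for the mutant one) and the function name are illustrative choices, not taken from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Return the probability of each child genotype for a single gene.

    Each parent is written as two letters, e.g. "Aa" for a carrier.
    Every combination of one letter from each parent is equally likely.
    """
    # Sort the pair so that "aA" and "Aa" count as the same genotype.
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Two carriers of a recessive disease gene.
for genotype, probability in offspring_genotypes("Aa", "Aa").items():
    print(genotype, probability)
```

For two carriers the enumeration gives one chance in four of an affected child ("aa"), one in two of a carrier ("Aa"), and one in four of a child with two normal copies ("AA").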
Although there are over 4000 single-gene (monogenic) disorders, most are very rare because they produce such severe disorders that they usually prevent reproduction, and therefore cannot disseminate within a population. However, a few monogenic diseases — the inherited anaemias, sickle cell anaemia and thalassaemia, for example — are extremely common. This is because carriers are made more resistant to malaria by the presence of the mutation in their cells. Therefore they tend to survive slightly longer than non-affected individuals in malarious countries, and hence have more children. In this way, the frequency of the disease increases until it reaches an equilibrium, at which it is balanced by the loss from the community of severely affected homozygotes (those who have received the defective gene from both parents). This kind of interaction may explain why cystic fibrosis, a disease that affects the lung and bowel, is so common in European populations: the mutation may have made carriers resistant to one or more of the severe infections that swept through Europe in the past, possibly cholera.
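The equilibrium described above can be illustrated with the textbook one-gene selection model. The sketch below is an assumption-laden toy, not a description of any real population: `s` is the fitness cost to non-carriers (for example from malaria), `t` is the cost to affected homozygotes, carriers have fitness 1, and the numerical values are invented for illustration.

```python
def next_allele_frequency(q, s, t):
    """Advance the mutant allele frequency q by one generation.

    Assumed relative fitnesses: non-carriers 1 - s, carriers 1,
    affected homozygotes 1 - t. Mating is assumed to be random.
    """
    p = 1.0 - q
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    # Mutant copies come from carriers (half of 2pq) and from homozygotes.
    return (p * q + q * q * (1 - t)) / mean_fitness


def simulate(q0, s, t, generations):
    """Iterate the model for a number of generations, starting from q0."""
    q = q0
    for _ in range(generations):
        q = next_allele_frequency(q, s, t)
    return q


def predicted_equilibrium(s, t):
    """Frequency at which the gain to carriers balances the loss of homozygotes."""
    return s / (s + t)


# Hypothetical selection strengths, chosen only to show the behaviour.
print(simulate(0.01, s=0.1, t=0.8, generations=1000))
print(predicted_equilibrium(0.1, 0.8))
```

Starting from a rare mutation, the simulated frequency rises and then settles at the predicted balance point rather than continuing to increase.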
Many of the common diseases of Western society — heart attacks, stroke, diabetes, and dementia, for example — are the result of complex interactions between our genetic make-up and the environment. These diseases do not run predictably through families like single-gene disorders, but there may be a familial tendency, which probably reflects the action of several different genes combined with the effects of lifestyle. Other common diseases, cancer in particular, also reflect interactions between our genes and environments.
Many cancers result from the acquisition of mutations in a family of genes called oncogenes, which normally serve important housekeeping functions for our cells. They tell the cells how and when to divide, identify those with damaged DNA and either ensure that it is repaired or programme the potentially harmful cells to die, and regulate how they interact with their fellow cells. Cancer appears to be due to the acquisition of mutations in these genes; the frequency with which this occurs may be related both to exposure to environmental mutagens such as tobacco smoke, and to chemical agents that are by-products of our metabolism.
Other genetic defects are responsible for congenital malformation and for mental retardation. Sometimes these conditions result from major chromosomal abnormalities, involving either their numbers or structure. However, many malformations of this type can be traced to the action of single mutant genes — an observation that is providing clues about the mechanisms that control human development.
Manipulating the human genome
Our new-found ability to analyse human DNA, which has led to the discovery of the mutations that cause many monogenic diseases, means that they can be identified in carriers and appropriate counselling and advice can be given. It is also now possible to diagnose most of these diseases prenatally by chorionic villus sampling: removing and genetically analysing a tiny amount of tissue from round the fetus between 9 and 13 weeks of pregnancy. This allows prenatal diagnosis, and gives the mother the option of termination of a pregnancy if the fetus has a particularly severe condition. Soon it may be feasible to correct genetic disease or alter the genetic machinery of cells in a way that may be used to treat cancer or other acquired diseases. The treatment of single-gene disorders can, in principle, be approached by either germ-cell gene therapy (in which a ‘good’ gene is injected into a fertilized egg and therefore distributes among all the cells of the body, including future germ cells) or somatic-cell therapy (in which the gene is inserted into a particular cell population in the body of one individual, such as the stem cells of the bone marrow). Although germ-cell gene therapy offers the prospect of eliminating disease from a family for all future generations, it is currently not permitted because of the possible risk of making damaging errors to the genes. On the other hand, somatic-cell gene therapy poses no major ethical problems beyond those of any form of tissue or organ transplantation. Human genes can also be inserted into cultured cells or whole animals (a process called transgenesis) in order for the recipient cells or animals to produce molecules of therapeutic value, such as insulin to treat diabetes or clotting factors to treat haemophilia.
Biological determinism and the future
The remarkable developments in human genetics over the last half of the twentieth century led to the notion that most of people's mental achievements, personality traits, and behaviour can be explained by their genetic make-up, shaped in the past by Darwinian evolution. Much of this thinking ignores the important role of the environment in making us what we are. But because it carries such conviction, and because it is assumed that we will gradually learn how to manipulate our genetic make-up, this view is causing considerable concern about the possible resurgence of the eugenics movement, a philosophy for the ‘betterment’ of mankind by selective breeding, which stimulated the study of human genetics at the end of the nineteenth century. Although it will be a long time before we know the relative role of nature and nurture in shaping human beings, there is little doubt that our ability to modify the human genome will increase dramatically during the new millennium, and society will be faced with the dilemma of deciding how far it wishes to shape its destiny in this way.
D. J. Weatherall
See also evolution, human; gene therapy; heredity.
The word "genome" means the totality of all the genetic information present in the cells of an organism. Most of this genetic information is contained in chromosomes. A chromosome consists of deoxyribonucleic acid (DNA) molecules wound up into a compact bundle. Humans have 46 chromosomes in their cells, organized into 23 corresponding pairs.*
*If all the DNA in a single human cell were unwound and stretched out, it would be over two meters long.
The DNA in the chromosomes of an organism consists of a double-stranded molecule formed from four basic units that are repeated many times. The four basic units are called nucleotides. Each nucleotide is made of a sugar, a phosphate group, and a nucleotide base. The sugars and phosphate groups of successive nucleotides link together to form a backbone for one strand of the DNA molecule, leaving the nucleotide bases projecting. The four kinds of nucleotide bases found in DNA are adenine, cytosine, guanine, and thymine. A fifth nucleotide base, uracil, is found in ribonucleic acid (RNA) in place of thymine.
The nucleotide bases of DNA are usually designated as A, C, G, and T (or U) respectively. In the double-stranded DNA molecule, each nucleotide base forms a bond with a nucleotide base on the other strand of the molecule, such that A always pairs with T, and C with G. These two strands then wrap around each other, forming a structure known as a double helix. The two strands are antiparallel: they run in opposite directions, so the sequence of one strand, read from its own starting end, is the base-by-base complement of the other strand read in reverse.
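The pairing rule and the antiparallel arrangement can be sketched as a short function that derives the partner strand from a given one. The function name and the example sequence are illustrative only.

```python
# Watson-Crick pairing: A pairs with T, and C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand):
    """Return the partner strand, read from its own starting end.

    Because the strands run in opposite directions, the partner is the
    complement of the input taken in reverse order.
    """
    return "".join(COMPLEMENT[base] for base in reversed(strand))


print(reverse_complement("ATGC"))
```

Applying the function twice returns the original sequence, which reflects the fact that each strand fully determines the other.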
A gene is a unique sequence of bases that occupies a particular position on a chromosome. A single strand of DNA contains hundreds of individual genes. However, there are several different types of genes included in a DNA molecule. Structural genes code for particular amino acid sequences. These are the genes that contain instructions on how to build proteins. Operator genes control the structural genes and regulate their output. Regulatory genes may produce repressor proteins that turn the operator genes on or off or they may act like punctuation marks, signaling the beginning and end of coding sequences. Suppressor genes may suppress the actions of other genes and can reverse the effects of a harmful mutation. Kinetic genes regulate the chromosomes themselves. The 30,000 or so genes in every human cell code for more than one million different proteins, including albumin and hemoglobin, protein hormones like insulin, and the countless enzymes that keep us alive. Among these enzymes are the ones that synthesize brain chemicals like dopamine and serotonin and steroid hormones like testosterone and estrogen, which are not themselves proteins.
Mathematics of the DNA Code
Structural genes responsible for the production of the amino acid sequences used to build proteins are the best understood of the different kinds of genes. To produce a protein, a section of DNA becomes "unzipped," exposing the nucleotide bases. A molecule of messenger RNA (mRNA) is synthesized by pairing up nucleotide bases. Each group of three nucleotide bases on the RNA molecule, called a codon, codes for a particular amino acid. Since there are four choices (A, C, G, and U) for each of the three nucleotide bases, each codon can be one of 64 different "words" (4 × 4 × 4 = 64). However, there are only 20 different amino acids, so several different codons produce the same amino acid. For example, UUU and UUC both produce the amino acid phenylalanine. There are also three codons, UAA, UGA, and UAG, that act as "stop" signals ending the protein chain.
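The counting argument and the reading of codons can be sketched as follows. The lookup table is deliberately partial: it holds only the codons named in the text plus the start codon AUG (methionine), which is added here as an assumption from the standard genetic code. The function name and the example message are hypothetical.

```python
from itertools import product

BASES = "ACGU"

# Every possible three-letter word over a four-letter alphabet.
ALL_CODONS = ["".join(letters) for letters in product(BASES, repeat=3)]
print(len(ALL_CODONS))  # 4 * 4 * 4 possible codons

# Partial codon table: a full table would cover all 64 codons.
PARTIAL_TABLE = {
    "AUG": "Met",   # start codon (assumption, not named in the text)
    "UUU": "Phe",
    "UUC": "Phe",   # a second spelling of the same amino acid
    "UAA": "Stop",
    "UGA": "Stop",
    "UAG": "Stop",
}


def translate(mrna):
    """Read an mRNA string three bases at a time until a stop codon.

    Raises KeyError for any codon missing from the partial table.
    """
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = PARTIAL_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


print(translate("AUGUUUUUCUAA"))
```

The two phenylalanine codons in the example message produce the same amino acid, which is the redundancy that lets 64 words cover only 20 amino acids and three stop signals.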
A protein contains around 100 amino acids, so a structural gene must contain at least 300 nucleotide bases. Dividing the roughly 3 billion nucleotide units of human DNA by 300, it might then be expected that the human genetic code would contain 10 million genes. However, DNA also contains stop and start codons, other regulatory sequences, some sequences (called introns) that are removed before protein synthesis begins, duplicate genes, non-coding sequences, redundant genes, and other non-functioning bits of DNA. As a result, there are probably only around 30,000 individual genes among the 3 billion nucleotide units of human chromosomes.
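The arithmetic behind this estimate can be written out directly. The sketch below uses only the round figures given in the text; the variable names are illustrative, and the final fraction is a consequence of those round figures rather than a measured value.

```python
GENOME_BASES = 3_000_000_000      # nucleotide units in human chromosomes
AMINO_ACIDS_PER_PROTEIN = 100     # the article's typical protein length
BASES_PER_CODON = 3
ESTIMATED_GENES = 30_000          # the article's estimate of the gene count

# Minimum gene length: one codon per amino acid.
bases_per_gene = AMINO_ACIDS_PER_PROTEIN * BASES_PER_CODON

# Naive upper bound if every base belonged to such a gene.
naive_gene_count = GENOME_BASES // bases_per_gene

# Share of the genome that 30,000 genes of this minimum length would occupy.
coding_fraction = ESTIMATED_GENES * bases_per_gene / GENOME_BASES

print(bases_per_gene, naive_gene_count, f"{coding_fraction:.1%}")
```

On these figures the naive count exceeds the estimated gene count by a factor of several hundred, which is the gap the list of non-coding material is meant to explain.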
The U.S. Human Genome Project (HGP) is a joint effort of the Department of Energy (DOE) and the National Institutes of Health (NIH). The goal of the Human Genome Project is to decipher human heredity through the creation of maps for each of the 23 human chromosomes. The first step in this process is to determine the actual DNA code. In June 2000, then-U.S. President Clinton, leaders of the Human Genome Project, and officers of Celera Genomics (a private biotechnology firm) jointly announced that the rough draft of the human genetic code was ready for publication. The February 16, 2001 issue of Science published articles related to the work of Celera Genomics, and the February 15, 2001 issue of Nature published articles relating to the work of the Human Genome Project.
As of July 30, 2001, only the two smallest human chromosomes, 21 and 22, were completely sequenced to the level of accuracy specified by the protocols established by the Human Genome Project. At the time this article was written, 47.1 percent of human DNA had been mapped to final standards and 51.4 percent had been mapped to preliminary standards for a total of 98.5 percent mapped.
The final publication of the human genome map is expected by 2003. However, this final draft will include only the human genome code. Still to be determined are the exact number and locations of genes, how genes are regulated, how the DNA sequence is organized, how chromosomes are organized, which parts of the DNA are redundant or noncoding, how gene expression is coordinated, how genetic information is conserved, and many other concepts essential to understanding the human genetic code.
see also Human Body.
Elliot Richmond and Marilyn K. Simon
THE IMPORTANCE OF MAPPING THE HUMAN GENOME
Many genetic diseases can result from a misspelling in the DNA sequence. Since each DNA codon codes for a particular amino acid, a change in one nucleotide base can result in a different protein being produced. If that protein is essential to health, a genetic disease results. For example, normal hemoglobin differs from the hemoglobin of sickle-cell anemia by only one amino acid out of hundreds.
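A single-base misspelling of this kind can be sketched in code. The example assumes the commonly cited sickle-cell substitution, in which the codon GAG (glutamic acid) becomes GUG (valine); that detail, the three-codon fragment, and the function names are illustrative additions that do not come from the text, and the lookup table covers only the codons used here.

```python
# Minimal codon table for this example only (standard genetic code).
CODON_TABLE = {"CCU": "Pro", "GAG": "Glu", "GUG": "Val"}


def point_mutation(mrna, position, new_base):
    """Return a copy of the sequence with one base replaced (0-based index)."""
    if not 0 <= position < len(mrna):
        raise IndexError("position outside the sequence")
    return mrna[:position] + new_base + mrna[position + 1:]


def translate_fragment(mrna):
    """Translate a fragment codon by codon using the minimal table."""
    return [CODON_TABLE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3)]


normal = "CCUGAGGAG"                      # illustrative fragment
mutant = point_mutation(normal, 4, "U")   # A -> U in the middle codon

print(translate_fragment(normal))
print(translate_fragment(mutant))
```

Only one letter of the message changes, yet the middle amino acid of the fragment is different, which is the sense in which a one-base misspelling yields a different protein.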
The genetic maps being developed by the Human Genome Project are available on the Internet and are updated frequently, making accurate genetic information accessible to every researcher and opening up great potential for improving human health through research.