Contributions of Molecular Genetics to Phylogenetics

Introduction

Traditional studies of evolutionary relationships among living organisms (phylogenetics) relied predominantly on comparisons of morphological characters. However, phylogenetic studies have increasingly benefited from additional inputs from molecular genetics. Reconstruction of evolutionary relationships among mammals provides a prime example of such benefits, and the broad outlines of the phylogenetic tree of mammals have now been convincingly established. For instance, it has proved possible to identify four major clusters (superorders) of placental mammals: (1) Afrotheria: elephants, manatees, hyraxes, tenrecs, golden moles, elephant shrews, and the aardvark; (2) Xenarthra: sloths, anteaters, and armadillos; (3) Euarchontoglires: rodents, lagomorphs, primates, dermopterans, and tree shrews; (4) Laurasiatheria: artiodactyls (including cetaceans), perissodactyls, carnivores, pangolins, bats, and most insectivores ("eulipotyphlans": hedgehogs, shrews, and moles).

All phylogenetic reconstructions based on molecular data depend on studies of the genetic material DNA or of proteins, whose synthesis is governed by individual DNA sequences. The primary components of the double-stranded DNA molecule are nucleotide bases, sugar groups, and phosphate groups. The sugar and phosphate groups form the backbone of each strand. There are four nucleotide bases (adenine, cytosine, guanine, and thymine), and specific chemical bonds between pairs of these (adenine with thymine; cytosine with guanine) hold the two strands together in DNA's double helix structure. These specific bonds between pairs of bases also ensure that, if the two strands are separated, each can serve as a template on which the missing partner strand is faithfully replicated. Because the basic unit in the double-stranded DNA molecule is hence a bonded pair of bases (one base in each strand), the length of a DNA sequence is measured in base pairs (bp). The sequence of nucleotide bases in the DNA molecule provides the basis for protein synthesis through the genetic code, with a group of three bases ("triplet") in the DNA sequence corresponding to one amino acid in the protein sequence. Protein sequences hence depend directly upon DNA sequences, and twenty different amino acids are combined into chains of specific composition through translation of sequences of nucleotide bases in DNA, assisted by two kinds of RNA (messenger RNA and transfer RNAs). However, the original simple concept of "one gene, one protein" has needed modification. One major reason for this is that a DNA sequence corresponding to a particular protein sequence often contains non-coding regions (introns) between the coding regions (exons). Only the exons are ultimately reflected in the amino acid sequence of the corresponding protein. Furthermore, the products of separate DNA sequences can be joined together to produce a single protein.
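
The relationships described above (base pairing, removal of introns, and the triplet code) can be illustrated with a short Python sketch. It is a simplified illustration rather than a model of the cell: splicing is applied directly to the DNA string, although in the cell it acts on the RNA copy, and the toy gene and its exon coordinates are invented for the example. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, listed with bases in the conventional order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Pairing rules: adenine with thymine, cytosine with guanine.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand):
    """Return the partner strand, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def splice(gene, exons):
    """Keep only the exon segments, given as (start, end) slices."""
    return "".join(gene[start:end] for start, end in exons)


def translate(coding_dna):
    """Translate triplets into one-letter amino acid codes up to a stop codon."""
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy gene: exon 1, a 12-bp intron, then exon 2.
gene = "ATGGCC" + "GTAAGTTTTCAG" + "AAATGGTAA"
exons = [(0, 6), (18, 27)]

coding = splice(gene, exons)
print(coding)             # ATGGCCAAATGGTAA
print(translate(coding))  # MAKW
print(complement_strand("ATGC"))  # GCAT
```

Only the two exon segments contribute to the four-residue product; the 12 intron bases leave no trace in the protein sequence.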

Both DNA sequences and protein sequences are particularly suitable for phylogenetic reconstruction because they consist of relatively simple components arranged in linear series (nucleotide bases and amino acids, respectively) that can be easily compared between species. Initially, comparisons between species were based on laborious step-by-step determination of the amino acid sequences of proteins, as there was no straightforward technique for studying DNA sequences themselves. The first phylogenetic trees derived from molecular genetics were therefore based on amino acid sequences of proteins, and relatively few species were included in comparisons because of the time-consuming procedure involved in protein sequencing. At first, it was also technically very difficult to determine DNA sequences. However, a major breakthrough came with development of the capacity for generating large quantities of individual DNA sequences through amplification using the polymerase chain reaction (PCR). This opened the way to relatively straightforward and rapid direct determination of DNA sequences, and heralded the transition from studies of gene products (proteins) to studies of the genes themselves (DNA sequences). In fact, because it became easier and faster to determine DNA sequences directly, a protein sequence is now commonly inferred from the DNA sequence of the corresponding gene rather than from sequencing of the protein.

It is important to note that there are two different sets of genetic material (genomes) in mammalian cells, as in animals generally. The primary genome is contained in the chromosomes in the nucleus (nuclear DNA), but each mitochondrion in the cell cytoplasm also contains a number of copies of a separate small genome (mitochondrial DNA). Because many mitochondria are present in each cell, there are numerous copies of the mitochondrial genome, whereas there is only one nuclear genome per cell. However, the basic structure of DNA is the same for nuclear DNA (nDNA) and mitochondrial DNA (mtDNA), with chains of nucleotide bases, although mtDNA is organized in a ring whereas nDNA exists as linear sequences within chromosomes. The mitochondrion is a respiratory power plant found in organisms with a cell nucleus (eukaryotes). It is, in fact, derived from a free-living bacterium that took up residence in the cell cytoplasm in an ancestral eukaryote more than a billion years ago, in an arrangement that was of mutual benefit (symbiosis). Originally, the mitochondrial genome contained many more genes than are now present in mammals, whose mitochondria retain only a small number of protein-coding genes that are all connected with respiration.

Reconstruction of phylogenetic trees

Regardless of whether morphological or molecular data are analyzed, reconstruction of phylogenetic relationships between species depends on interpretation of shared similarities. In principle, it is relatively easy to survey similarities between species for individual characters. This task is particularly straightforward with molecular data because the individual components at defined positions in sequences of DNA (nucleotide bases) or proteins (amino acids) are relatively simple and directly comparable. Furthermore, because the primary process underlying evolutionary change is point mutation (random replacement of one nucleotide base by another in a DNA sequence), comparison of DNA sequences directly reveals basic evolutionary steps. Most changes in protein-coding DNA sequences lead to changes in the corresponding protein sequences, but there is some degree of redundancy in the genetic code, because up to six different triplet sequences of nucleotides can correspond to a single amino acid. For this reason, about 25% of the possible point mutations in coding DNA are "silent" and do not lead to a change in the protein sequence. Such redundancy applies particularly to the third base position in DNA triplets.
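
The figure of roughly one quarter silent mutations can be checked by enumerating every possible single-base change in the standard genetic code. The sketch below does this in Python under a deliberately crude assumption, namely that all codons and all substitutions are equally likely; real genes have unequal codon usage and mutation rates, so the observed proportion will differ somewhat.

```python
from collections import Counter
from itertools import product

# Standard genetic code, bases in the conventional order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_point_mutations():
    """Count all single-base changes in amino-acid-coding triplets.

    Returns the total number of changes and the number of silent
    changes at each of the three triplet positions.
    """
    total = 0
    silent_by_position = [0, 0, 0]
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":  # skip stop codons as starting points
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                if CODON_TABLE[mutant] == amino_acid:
                    silent_by_position[pos] += 1
    return total, silent_by_position


total, silent = count_point_mutations()
print(total)                # 549
print(silent)               # [8, 0, 126]
print(sum(silent) / total)  # about 0.244

# Number of triplets coding for each amino acid; the maximum is six.
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
print(max(degeneracy.values()))  # 6
```

Of the 134 silent changes, 126 fall at the third position, which is why third positions tend to evolve fastest.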

Analysis of similarities between species to construct phylogenetic trees is more difficult than it seems at first sight. In the first place, similarities can arise independently through convergent evolution at any time after the separation between two lineages. For instance, rodent-like incisor teeth have developed several times independently during the evolution of mammals. But reconstruction of the relationships between species depends on exclusion of convergent similarities and identification of homologous similarities that have been inherited through descent from a common ancestor. In the case of morphological characters, it is often possible to identify convergent similarities directly because development of similar characters is typically driven by similar functional requirements. For example, rodent-like incisors develop in response to selection pressure for gnawing behavior. For complex morphological characters, convergent similarity is typically only superficial because it merely needs to meet a particular functional requirement. Hence, detailed examination of such characters commonly reveals fundamental differences. With incisors, for instance, a rodent-like pattern can develop without altering the structure of enamel that characterizes a particular group of mammals. With molecular characters, by contrast, each type of nucleotide base or amino acid shows complete chemical identity, so it is impossible to determine from direct examination whether convergent evolution has occurred. Instead, convergence in molecular evolution is recognizable only from the phylogenetic tree after it has been generated on the assumption that the tree requiring the smallest total amount of change (the most parsimonious solution) is the correct one. Because there are so few possibilities for evolutionary change at the molecular level (4 nucleotide bases; 20 amino acids), convergent evolution is very common. As a rule, about half of the similarities between species recorded in any tree that is generated must have arisen independently through convergent evolution. Convergence is therefore a major problem with any tree derived from molecular data, particularly because functional aspects of changes in nucleotide bases and amino acids are rarely considered (thus excluding any possibility of identifying functional convergence). Moreover, precisely because there are so few possibilities for change in DNA base sequences, repeated point mutation at a given site will mask previous changes and can easily lead to chance return to the original condition. Although it is now standard practice to make a global correction for repeated mutation at a given site in molecular trees, it is virtually impossible to reconstruct the mutational history of individual sites if repeated change has occurred.
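
The idea of choosing the tree that requires the least change, and only then reading off which similarities must be convergent, can be sketched in Python. The example uses Fitch's small-parsimony counting method, which is one standard way of scoring a tree and is not necessarily the procedure used in any particular study. The four species (A to D) and their five-site alignment are invented for illustration.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum number of changes).

    `tree` is a nested tuple of species names; `states` maps each
    species to its base at one alignment site.
    """
    if isinstance(tree, str):  # a leaf
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:  # no change needed at this node
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


def site_costs(tree, alignment):
    """Minimum number of changes at every site of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return [
        fitch(tree, {sp: seq[i] for sp, seq in alignment.items()})[1]
        for i in range(n_sites)
    ]


# Hypothetical alignment, invented for illustration only.
alignment = {"A": "AAGTC", "B": "AAGTT", "C": "GCGAC", "D": "GCTAT"}

# The three possible groupings of four species.
trees = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}

lengths = {name: sum(site_costs(t, alignment)) for name, t in trees.items()}
print(lengths)  # {'AB|CD': 6, 'AC|BD': 8, 'AD|BC': 9}

best = min(lengths, key=lengths.get)
# A site with k distinct bases needs at least k - 1 changes on any tree;
# anything beyond that on the best tree is convergence or reversal.
for i, cost in enumerate(site_costs(trees[best], alignment)):
    k = len({seq[i] for seq in alignment.values()})
    print(i, cost, cost - (k - 1))
```

On the shortest tree, the last site needs two changes although it shows only two different bases. The similarity it suggests between A and C (and between B and D) is therefore interpreted as convergent, and that interpretation is available only once the tree has been chosen.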

In fact, there is a further problem in interpreting similarities for the reconstruction of phylogenetic trees. Even if it is possible to exclude certain cases of convergent evolution, as is often true with complex morphological similarities, an important distinction remains with respect to inherited homologous similarities. For any group of species considered, a particular set of features will be present in the initial common ancestor. If such a primitive feature is retained as a homologous similarity in any descendants, it reveals nothing about branching relationships within the tree. The only homologous features that provide information about branching within a tree are novel features that arise at some point and are subsequently retained by descendants as shared derived similarities. This crucial distinction between primitive and derived homologous similarities is particularly relevant if there are marked differences between lineages in the rate of evolutionary change. For instance, members of two slowly evolving lineages can retain many primitive similarities and would thus be grouped together on grounds of overall homologous similarity if no special attempt were made to identify derived similarities. It was once believed that rates of change are reasonably constant at the molecular level, thus reducing the need to distinguish between primitive and derived homologous similarities, but the availability of large molecular data sets has revealed that there can be major differences in rates between lineages.
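
A small numerical sketch may help to show how unequal rates can mislead a comparison based on overall similarity. The Python example below uses three hypothetical species and ten invented characters, coded "0" for the primitive state and "1" for the derived state. It assumes that the primitive state of each character is already known, which in practice has to be inferred, for instance by comparison with more distantly related species.

```python
from itertools import combinations

# Hypothetical character matrix, invented for illustration.
# Y and Z share two derived features; Z has evolved rapidly and
# carries five further novelties of its own; X and Y have changed little.
characters = {
    "X": "0000000100",
    "Y": "1100000000",
    "Z": "1111111000",
}


def overall_similarity(a, b):
    """Number of characters with the same state, primitive or derived."""
    return sum(x == y for x, y in zip(a, b))


def shared_derived(a, b):
    """Number of characters in which both species show the derived state."""
    return sum(x == y == "1" for x, y in zip(a, b))


pairs = list(combinations(characters, 2))
by_overall = {p: overall_similarity(characters[p[0]], characters[p[1]]) for p in pairs}
by_derived = {p: shared_derived(characters[p[0]], characters[p[1]]) for p in pairs}

print(by_overall)  # {('X', 'Y'): 7, ('X', 'Z'): 2, ('Y', 'Z'): 5}
print(by_derived)  # {('X', 'Y'): 0, ('X', 'Z'): 0, ('Y', 'Z'): 2}
print(max(by_overall, key=by_overall.get))  # ('X', 'Y')
print(max(by_derived, key=by_derived.get))  # ('Y', 'Z')
```

Overall similarity pairs the two slowly evolving species X and Y, although every one of their shared states is a retained primitive feature. Counting only shared derived states recovers the pairing of Y with Z.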

In conclusion, the increasing availability of molecular data has provided a major benefit for the reconstruction of phylogenetic trees. The large numbers of directly comparable characters included in molecular data sets provide a highly informative basis for quantitative comparisons. On the other hand, because the methods used do not explicitly tackle the crucial distinction between convergent, primitive, and derived similarities, the results are subject to error. Accordingly, if there is a conflict between a tree based on molecular data and one based on morphological data, it should not be automatically assumed that the latter is necessarily incorrect. After all, there is quite often a similar conflict between trees based on two different molecular data sets. The safest procedure is therefore to take a balanced approach that gives due consideration to both morphological and molecular evidence. Combined studies that do precisely this with comprehensive data sets are becoming increasingly common.

Mitochondrial DNA

The ring-shaped, double-stranded mtDNA molecule has the same basic structure in all mammals. It is approximately 16,500 bp in length and contains coding sequences for 13 protein-coding genes, 2 ribosomal RNA molecules (12S and 16S), and 22 transfer RNA molecules, together with a non-coding control region (D-loop). In contrast to nuclear genes, there are no introns in mtDNA. Furthermore, mtDNA differs from nDNA in another crucial respect that simplifies analysis of its evolution. In mammals, mtDNA is exclusively or almost exclusively inherited maternally (i.e., from the mother), and there is no recombination of genes when the mitochondrion divides.

Phylogenetic reconstructions may be based on part of mtDNA (e.g. using an individual gene, such as cytochrome b) or on the entire molecule, and many complete mtDNA sequences are now available for analysis. Overall, mtDNA tends to accumulate changes more rapidly than nDNA (about five times faster overall), and for this reason it is more suitable for analyses of relatively recent changes in the evolutionary tree of mammals. Because rapidly evolving DNA sequences become saturated with changes at an earlier stage, they are unsuitable for probing early parts of the tree. However, there are differences in rate of evolution between individual parts of the mtDNA molecule, so it is possible to select regions that are suitable for particular stages of mammalian evolution. Mitochondrial DNA sequences can be crudely divided into those that evolve relatively rapidly, hence being useful for comparisons of quite closely related species (e.g. control region, ATPase gene) and those that evolve relatively slowly, thus being useful for comparisons of more distantly related species (e.g. ribosomal genes, tRNA genes, cytochrome b gene).

Two examples illustrate the range of questions that can be addressed. Golden moles are of presumed African origin. This implies that there was an extensive African radiation from a single common ancestor that gave rise to ecologically divergent adaptive types. DNA studies suggest that the base of this radiation occurred during Africa's isolation in the Cretaceous period before land connections were developed with Europe in the early Cenozoic era. In another study, scientists examined the mtDNA of 654 domestic dogs, looking for variations. They were trying to determine whether dogs were domesticated in one or several places, and to identify the place and time at which domestication occurred. Their results show that the domestic dog population originated from at least five female wolf lines. They went on to argue that, while the archaeological record cannot define the number of geographical origins or their locations, their own data indicate a single origin of domestic dogs in East Asia some 15,000 to 40,000 years ago.
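
The saturation of rapidly evolving sequences mentioned above can be made concrete with a short Python sketch. It assumes the Jukes-Cantor model, the simplest model of base substitution, in which every base is equally likely to change into any other; this is an illustrative choice, and published analyses generally use more elaborate models. Under that assumption, the expected proportion of sites that differ between two sequences levels off at 75% however much change has actually occurred.

```python
import math


def expected_difference(true_distance):
    """Expected proportion of differing sites under the Jukes-Cantor model.

    `true_distance` is the actual number of substitutions per site
    separating two sequences, including repeated hits at the same site.
    """
    return 0.75 * (1.0 - math.exp(-4.0 * true_distance / 3.0))


def jc69_distance(observed_difference):
    """Correct an observed proportion of differences for repeated mutation."""
    if not 0.0 <= observed_difference < 0.75:
        raise ValueError("sequences are saturated; distance cannot be estimated")
    return -0.75 * math.log(1.0 - 4.0 * observed_difference / 3.0)


# Observed difference lags further and further behind the true amount of change.
for d in (0.05, 0.25, 0.5, 1.0, 2.0, 3.0):
    p = expected_difference(d)
    print(f"true {d:4.2f}  observed {p:5.3f}  corrected {jc69_distance(p):4.2f}")
```

At small distances the observed and true values are almost identical. At large distances the observed value creeps toward 0.75 and tiny sampling errors translate into huge errors in the corrected distance, which is why fast-evolving regions are uninformative about deep branches.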

Nuclear DNA

Surprisingly, only a small fraction of the nuclear DNA (nDNA) contained in chromosomes consists of gene sequences that code for production of proteins. It is estimated that less than 5% of human DNA consists of genes that code for approximately 30,000 different proteins (later estimates put the number of human protein-coding genes closer to 20,000). Much of the rest (95%) has no well-established function and is often labeled "junk DNA". A large part of this DNA consists of repetitive sequences that in some cases are present as many thousands of copies. Repetitive sequences that have been copied and inserted into the genome in this way are known as "retroposons", but their function remains essentially unknown.

Reconstruction of phylogenetic relationships using DNA sequences that code for protein sequences (or using the protein sequences themselves) hence involves only a small part of the nuclear genome. Nevertheless, there are many different nuclear genes available for analysis, and the sequence data set for mammals is increasing rapidly. Certainly, the potential total sequence information that can be obtained from the 30,000 protein-coding genes in the nuclear genome is vastly greater than that provided by the 13 protein-coding genes in the mitochondrial genome. As a general rule, the reliability of phylogenetic trees generated with molecular data increases both with the number of species included in comparisons and with the number of DNA sequences analyzed. However, there are some unresolved problems with the methods currently employed for reconstruction of trees using molecular data. Furthermore, there are practical limits to the quantity of data that can be effectively analyzed, so various short-cuts are necessary.

In addition to protein-coding DNA sequences, some categories of retroposons (inserted sequences) are becoming increasingly useful as tools for reconstructing phylogenetic relationships. This is particularly true of the inserted sequences known as short interspersed nuclear elements (SINEs) and long interspersed nuclear elements (LINEs).

Because SINEs and LINEs apparently arise at random and occur widely throughout the genome, they are almost ideal derived characters. First, given the vast array of DNA sequences in the nuclear genome, the probability of convergent evolution in the insertion of a SINE or LINE is exceedingly small. It is highly improbable that one of these sequences will be inserted at exactly the same site in the genome in two separate lineages. Second, because each insertion is a unique event that is unlikely to be reversed, SINEs and LINEs provide excellent markers for the recognition of groups of related organisms descended from ancestors possessing specific insertions. A very good example of the use of such evidence comes from discussion of the relationships between cetaceans (dolphins and whales) and artiodactyls (even-toed hoofed mammals). It has been accepted for some time that cetaceans are in some way related to artiodactyls. However, accumulating evidence from DNA sequences (both mtDNA and nDNA) indicated that cetaceans are, in fact, specifically related to hippopotamuses and thus nested within the artiodactyl group. This interpretation conflicts with the standard interpretation of the morphological evidence, according to which cetaceans constitute the sister-group to all artiodactyls. Certain fossil forms (mesonychians) that were regarded as relatives of whales and dolphins lacked a characteristic double-pulley adaptation of the ankle joint that is found in all artiodactyls (and was most probably present in their common ancestor). It had therefore been concluded that cetaceans branched away before the emergence of ancestral artiodactyls. This conflict of evidence was convincingly resolved by the discovery that cetaceans and hippopotamuses share a number of SINEs that are not found in any other mammals. Subsequently, early whale fossils possessing the typical artiodactyl ankle joint were discovered. Hence, it is now well established that cetaceans and hippopotamuses are sister-groups and that mesonychians are not direct relatives of the cetaceans after all.
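
The logic of using insertions as markers can be sketched in a few lines of Python. Each insertion is treated as a derived character that defines the group of species carrying it, and insertions are mutually consistent when the groups they define are either nested or entirely separate. The locus names and the pattern of presence and absence below are hypothetical placeholders chosen to mirror the cetacean example; they are not real survey data.

```python
# Hypothetical presence data: locus name -> species carrying the insertion.
# Species examined but lacking every insertion (for example cow, pig, camel)
# simply do not appear in any set.
insertions = {
    "locus_A": {"whale", "dolphin", "hippopotamus"},
    "locus_B": {"whale", "dolphin", "hippopotamus"},
    "locus_C": {"whale", "dolphin"},
}


def compatible(group_a, group_b):
    """Two groups fit on one tree if they are nested or do not overlap."""
    return group_a <= group_b or group_b <= group_a or group_a.isdisjoint(group_b)


def supported_groups(insertions):
    """Map each group of species to the loci that support it."""
    support = {}
    for locus, species in insertions.items():
        support.setdefault(frozenset(species), []).append(locus)
    return support


groups = supported_groups(insertions)
for species, loci in groups.items():
    print(sorted(species), "supported by", loci)

# If insertions are unique and irreversible, every pair should be compatible.
all_compatible = all(
    compatible(a, b) for a in groups for b in groups
)
print(all_compatible)  # True
```

A conflict, in which two insertions define overlapping but non-nested groups, would signal either a rare parallel insertion or a problem with the data, and would need to be examined case by case.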

Gene duplication

Studies of the evolution of DNA and protein sequences have generally concentrated on changes arising through point mutations of individual nucleotide bases. Indeed, molecular evolution has often been portrayed essentially as the progressive accumulation of point mutations in genes. However, evolution of DNA can also take place in other ways. One of the most important changes that can occur is tandem duplication of genes. This arises through slippage when DNA is replicated during cell division. Once a gene has been duplicated, the way is open for divergent evolution of the original and its copy. Indeed, over long periods of evolutionary time, gene duplication can occur repeatedly, such that quite large families of genes can result. A prime example is provided by globin genes, which are thought to have arisen from an original single gene through repeated duplication. The hemoglobin molecule, which plays a vital role in respiration, consists of four globin chains. In the blood of adult humans, the hemoglobin molecule contains two alpha-chains and two beta-chains. Sequence comparisons indicate that the beta-chain arose from the alpha-chain through an ancient duplication. There are also special hemoglobins that are temporarily present during the embryonic and fetal stages. Embryonic hemoglobin contains two epsilon-chains, while fetal hemoglobin contains two gamma-chains. Both the epsilon-chains and the gamma-chains arose from the beta-chain through relatively recent duplications that took place during the evolutionary radiation of the placental mammals. This illustrates how gene duplication can provide an alternative route for the evolution of new functional properties of genes.

During the long history of evolution of living organisms with a cell nucleus containing chromosomes (eukaryotes), there have also been cases where the entire set of chromosomes has been multiplied (polyploidy), for example through doubling of their number. Once sex chromosomes became established, as is the case with all living mammals (XX for females and XY for males), such doubling of the entire set of chromosomes became virtually impossible. Doubling of a male set of chromosomes (to XXYY) would result in the presence of two X chromosomes, thus disrupting the normal process of sex determination in which males have only a single X chromosome. However, at an earlier stage of evolution, prior to the development of typical mammalian sex chromosomes, multiplication of chromosome sets would still have been possible. Although the evidence is controversial, there is a strong possibility that two successive duplications of the entire chromosome set took place during vertebrate evolution leading up to the emergence of the mammals. For example, two successive doublings of an original set of 12 chromosomes could have led to a set of 48 chromosomes, which is the modal condition found in placental mammals. Although duplications of an entire chromosome set can, of course, be subsequently masked by secondary modifications of individual chromosomes, quadrupling of the chromosomes prior to the emergence of the ancestral placental mammals should still be reflected in the presence of four copies of many individual genes. This does, indeed, seem to be the case for many sets of genes, such as the homeobox genes that play an important part in development.

Duplication of individual genes or entire chromosomes in fact poses an additional problem for reconstruction of phylogenetic trees using molecular data sets. When a tree is based on nucleotide sequences for any individual gene, care must be taken to ensure that it is really the same gene that is being compared between species. If there are multiple copies of a particular gene in the genome, there is always the danger that comparisons between species might involve different copies. A striking example of this danger is provided by mitochondrial genes. Although gene duplication has never been recorded within the mitochondrial genome, individual mitochondrial genes have been repeatedly copied into the nuclear genome, where they generally remain functionless. Inadvertent inclusion of such redundant nuclear copies in comparisons of mitochondrial genes between species has led to serious errors in interpretation. For instance, a supposed mitochondrial gene sequence reported for a dinosaur turned out to be an aberrant nuclear copy of that sequence in human DNA.

The molecular clock

In addition to permitting reconstruction of relationships among species to generate a phylogenetic tree, molecular genetics can also yield valuable information with respect to the timescale for that tree. With the very first reconstructions conducted using molecular data, it was observed that the rate of change in amino acid sequences of particular proteins (and hence in the DNA sequences of the genes responsible) seemed to be relatively constant along different lineages. This led on to the notion of the "molecular clock", according to which the degree of difference between DNA sequences or amino acid sequences in any two species can provide an indication of the time elapsed since their separation. However, it should be noted that accumulating evidence has indicated that the rate of molecular change is in fact quite variable. In the first place, it was obvious from the outset that some genes evolve faster than others, and it was then shown that the overall rate of change in mtDNA is considerably greater than that in nDNA. Moreover, it also became clear that rates of change differ markedly even within individual genes. Some of this variation in rate of change within genes is to be expected. For example, "silent mutations" (notably in third base positions) and mutations within non-coding regions of genes (introns) are inherently likely to accumulate faster because they do not lead to changes in amino acid sequences and are hence not subject to natural selection. In sum, it is now widely recognized that the concept of the "molecular clock" must be used with caution and that there may be quite marked differences between lineages in the rates of molecular change. Methods have therefore been developed to identify differences in rates of change between lineages and to apply the notion of "local clocks".

It should be noted that molecular data cannot directly yield information on elapsed time and that phylogenetic trees produced with such data always require calibration using information from the fossil record. Once a tree that is characterized by relatively uniform rates of change has been calibrated with at least one date from the fossil record, it is possible to convert genetic distances into time differences. However, conversion of genetic distances into time differences requires that genetic change should be linearly related to time. This generally seems to be the case once a global correction has been made for repeated mutation at a given site. Unfortunately, even if rates of molecular change along lineages are approximately linear (as is required for reliable application of a clock model), calibration dates derived from paleontological evidence introduce an additional source of error. The problem is that the fossil record can only yield a minimum date for the time of emergence of a particular lineage, by taking the age of the earliest known member of that lineage. The lineage may have existed for some considerable period of time prior to the earliest known fossil representative. Clearly, the size of the gap between the actual date of emergence of a lineage and the age of the earliest known representative of that lineage will vary according to the quality of the fossil record. If the fossil record is relatively well documented, as is probably the case with large-bodied hoofed mammals, the earliest known fossil representative may be quite close to the time of origin. In other cases, however, use of the age of the earliest known fossil to calibrate a phylogenetic tree may lead to considerable underestimation of dates of divergence. For instance, the earliest known undoubted primates are about 55 million years old, but statistical modeling indicates that we have so far discovered less than 5% of extinct primate species. Correction for the numerous gaps in the primate fossil record indicates that the common ancestor of living primates existed about 85 million years ago (mya), rather than 60–65 mya as is commonly assumed. Molecular phylogenetic trees have also been accurately determined for the common chimpanzee, pygmy chimpanzee, gorilla, and orangutan.
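
The arithmetic of clock calibration, and the way an underestimated fossil date carries through to every other date on the tree, can be shown with a brief Python sketch. It assumes a strict clock with a single rate on all lineages, which the preceding discussion shows is often unrealistic. The two genetic distances are invented numbers; only the calibration ages of 55 and 85 million years are taken from the primate example above.

```python
def calibrate_rate(genetic_distance, divergence_time_my):
    """Rate of change per site per million years along one lineage.

    The distance between two living species has accumulated along both
    lineages since their split, hence the factor of two.
    """
    return genetic_distance / (2.0 * divergence_time_my)


def divergence_time(genetic_distance, rate):
    """Convert a (corrected) genetic distance into millions of years."""
    return genetic_distance / (2.0 * rate)


# Hypothetical corrected distances (substitutions per site).
calibration_distance = 0.22  # between the two species used for calibration
query_distance = 0.10        # between two species whose split is to be dated

# Calibration using the earliest known fossil (a minimum age).
rate_fossil = calibrate_rate(calibration_distance, 55.0)
# Calibration using an age corrected for gaps in the fossil record.
rate_corrected = calibrate_rate(calibration_distance, 85.0)

t_fossil = divergence_time(query_distance, rate_fossil)
t_corrected = divergence_time(query_distance, rate_corrected)

print(round(t_fossil, 1))     # 25.0
print(round(t_corrected, 1))  # 38.6
print(round(t_corrected / t_fossil, 3), round(85.0 / 55.0, 3))
```

Every date inferred from the tree scales in direct proportion to the calibration date, so a calibration point that is too young makes all estimated divergences too young by the same factor.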

Overall molecular trees for mammals

Large-scale combined studies of nDNA and mtDNA have yielded phylogenetic trees for mammals that generally fit the conclusions derived from traditional morphological comparisons, but also show some differences in detail. For instance, molecular data have generally confirmed that the monotremes branched away first in the mammalian tree and that there was a subsequent division between marsupials and placentals. Interestingly, however, comparisons of complete mtDNA sequences have suggested that the monotremes and marsupials form a group separate from the placentals (Marsupionta). As this aberrant result conflicts with other molecular evidence as well as with a well-established body of morphological evidence, it probably reflects an artifact of some kind. Indeed, it is noteworthy that the main points of conflict between different molecular trees involve relatively deep branches in the mammalian tree, which are precisely the branches that have posed the greatest challenges in morphological studies. Nevertheless, there is a gathering consensus from broad-based molecular studies that there are four major groups of placental mammals (Afrotheria, Xenarthra, Euarchontoglires and Laurasiatheria). As there are a number of consistent novel features of these groups, some modifications of conclusions based on morphological studies are undoubtedly required. For instance, the existence of the assemblage "Afrotheria" had not been identified from morphological comparisons and was first revealed by molecular studies. Moreover, it would seem that the order Insectivora is not only an artificial grouping of relatively primitive mammals (as has long been suspected) but in fact includes widely separate lineages that belong either in Afrotheria (tenrecs, golden moles) or in Laurasiatheria (hedgehogs, shrews, and moles).

Overall phylogenetic trees for mammals based on molecular data have been calibrated in a variety of ways, and a fairly consistent picture has emerged. This picture conflicts with the long-accepted interpretation, according to which the evolutionary radiation of modern mammals did not begin until the dinosaurs died out at the end of the Cretaceous, 65 mya. Instead, it would seem that the four major groups of placental mammals began to diverge over 100 mya and that many (if not all) modern orders of placental mammals had become established by the end of the Cretaceous. For instance, numerous lines of evidence indicate that primates diverged from other placental mammals about 90 mya. This revised interpretation of the timing of mammalian evolution is significant not only because it indicates that dinosaurs and early relatives of modern placental mammals were contemporaries, but also because it suggests that continental drift may have played a major part in the early evolution of mammals. On the new interpretation, the early evolution of both placental mammals and marsupials took place at a time when the southern supercontinent Gondwana was undergoing active subdivision. As one outcome of this process, it seems that the endemic group Afrotheria became isolated in Africa.


Robert D. Martin, PhD