The rate of DNA evolution: Effects of body size and temperature on the molecular clock

Communicated by Murray Gell-Mann, Santa Fe Institute, Santa Fe, NM, November 9, 2004 (received for review March 26, 2004)
Abstract
Observations that rates of molecular evolution vary widely within and among lineages have cast doubts on the existence of a single “molecular clock.” Differences in the timing of evolutionary events estimated from genetic and fossil evidence have raised further questions about the accuracy of molecular clocks. Here, we present a model of nucleotide substitution that combines theory on metabolic rate with the now-classic neutral theory of molecular evolution. The model quantitatively predicts rate heterogeneity and may reconcile differences in molecular and fossil-estimated dates of evolutionary events. Model predictions are supported by extensive data from mitochondrial and nuclear genomes. By accounting for the effects of body size and temperature on metabolic rate, this model explains heterogeneity in rates of nucleotide substitution in different genes, taxa, and thermal environments. This model also suggests that there is indeed a single molecular clock, as originally proposed by Zuckerkandl and Pauling [Zuckerkandl, E. & Pauling, L. (1965) in Evolving Genes and Proteins, eds. Bryson, V. & Vogel, H. J. (Academic, New York), pp. 97–166], but that it “ticks” at a constant substitution rate per unit of mass-specific metabolic energy rather than per unit of time. This model therefore links energy flux and genetic change. More generally, the model suggests that body size and temperature combine to control the overall rate of evolution through their effects on metabolism.
Completion of the modern evolutionary synthesis will require better understanding of the molecular processes of evolutionary change. The speed of molecular evolution can be measured as the rate of genetic divergence of descendants from a common ancestor, so the rate of molecular evolution can be quantified in terms of the changes in the nucleotide sequences that make up the genome. Observations that rates of molecular evolution vary widely within and among lineages have raised doubts about the existence of a single “molecular clock,” as originally proposed by Zuckerkandl and Pauling (1). The accuracy of molecular clocks is further called into question because molecular estimates of divergence time often disagree with the fossil record (2, 3). Understanding the factors responsible for rate heterogeneity is key to resolving differences between molecular and fossil-based estimates of important evolutionary events [e.g., Cambrian explosion (4, 5) and proliferation of modern mammalian orders (2)]. More generally, understanding rate heterogeneity may yield insight into the factors affecting overall rates of evolution.
Variations in rates of nucleotide substitution have been correlated with body size, metabolic rate (6), generation time (7), and environmental temperature (8, 9). Differences also have been observed between endotherms and ectotherms (6, 10). This rate heterogeneity most often is attributed to one of two causes, metabolic rate or generation time. According to the metabolic rate hypothesis, most mutations are caused by genetic damage from free radicals produced as byproducts of metabolism, so mutation rates should be related to cellular or mass-specific metabolic rates (6). According to the generation time hypothesis, most mutations are caused by errors in DNA replication during cell division, so mutation rates should be related to the number of divisions in germ cell lines and hence to generation times (7). Distinguishing between these hypotheses has been difficult because free radical production and generation time both vary with metabolic rate (6, 11), which in turn varies with body size and temperature (12).
Here, we propose a model that predicts heterogeneity in rates of molecular evolution by combining principles of allometry and biochemical kinetics with Kimura's neutral theory of evolution. The model quantifies the relationship between rates of energy flux and genetic change based explicitly on the effects of body size and temperature on metabolic rate. Although the model does not distinguish between the metabolic rate and generation time hypotheses, it accounts for much of the observed rate heterogeneity across a wide range of taxa in diverse environments. Recalibrating the molecular clocks by using metabolic rate reconciles some fossil and molecular-based estimates of divergence.
The Model
Metabolic rate is the rate at which energy and materials are taken up from the environment and used for maintenance, growth, and reproduction. It ultimately governs most biological rate processes, including the two generally thought to control mutation rate: free radical production rate and generation time (6, 7, 12, 13). Metabolic rate likely affects other processes, such as DNA repair and environmentally induced mutagenesis, that influence rates of nucleotide substitution (6). Mass-specific metabolic rate, B, varies with body size, M, and temperature, T, as

B = b_{o} M^{-1/4} e^{-E/kT}    [1]

where b_{o} is a coefficient independent of body size and temperature (12). The body size term, M^{-1/4}, has its origins in the fractal-like geometry of biological exchange surfaces and distribution networks (14). The Boltzmann or Arrhenius factor, e^{-E/kT}, underlies the temperature dependence of metabolic rate, where E is an average activation energy for the biochemical reactions of metabolism (≈0.65 eV) (12), k is Boltzmann's constant (8.62 × 10^{-5} eV·K^{-1}), and T is absolute temperature in kelvins. Eq. 1 explains most of the variation in the metabolic rates of plants, animals, and microbes (12).
When combined with assumptions of the neutral theory (15), Eq. 1 also can be used to characterize rates of molecular evolution. The first assumption is that molecular evolution is caused primarily by neutral mutations that randomly drift to fixation in a population, resulting in nucleotide substitutions (15). This assumption is consistent with theory and data demonstrating that deleterious mutations have only a negligible chance of becoming fixed in a population because of purifying selection (16), and that favorable mutations occur very rarely (17). Under this assumption, the rate of nucleotide substitution per generation is equal to the neutral mutation rate per generation and is independent of population size (15). The second assumption is that point mutations, and therefore substitutions, occur at a rate proportional to B. This idea assumes that most mutations are caused by some combination of free radical damage, replication errors, and other processes that ultimately are consequences of metabolism. Together, these two assumptions imply that the nucleotide substitution rate, α, defined as the number of substitutions per site per unit time, varies with body size and temperature as

α = fνB = fν b_{o} M^{-1/4} e^{-E/kT}    [2]

where f is the proportion of point mutations that are selectively neutral, and ν is the number of point mutations per site per unit of metabolic energy expended by a unit mass of tissue (g·mutations·site^{-1}·J^{-1}). Thus, the product fν is the neutral mutation rate per unit of mass-specific metabolic energy and, following Kimura's neutral theory, the substitution rate (see Appendix 1, which is published as supporting information on the PNAS web site). If the body size and temperature dependence of substitution rate is controlled by B, then fν is predicted to be a constant independent of M and T. Consequently, Eq. 2 predicts the existence of a molecular clock that “ticks” at a constant rate per unit of mass-specific metabolic energy flux rather than per unit of time.
On average, a certain quantity of metabolic energy transformation within a given mass of tissue causes a substitution in a given gene regardless of body size, temperature, or taxon. Eq. 2 therefore predicts a 100,000-fold increase in substitution rates across the biological size range (from ≈10^{8} g whales to ≈10^{-12} g microbes) and a 34-fold increase in substitution rates across the biological temperature range (≈0–40°C).
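The two fold-changes quoted above follow directly from the M^{-1/4} and e^{-E/kT} terms. The short sketch below reproduces the arithmetic; the function names are illustrative only, and the conversion 0°C = 273.15 K is an assumption (the text does not state which offset was used).

```python
import math

BOLTZMANN_EV_PER_K = 8.62e-5  # k, value quoted in the text
ACTIVATION_ENERGY_EV = 0.65   # E, average value quoted in the text


def size_fold_change(large_mass_g, small_mass_g):
    """Rate ratio (small organism / large organism) implied by M^(-1/4)."""
    return (large_mass_g / small_mass_g) ** 0.25


def temperature_fold_change(cold_c, warm_c,
                            E=ACTIVATION_ENERGY_EV, k=BOLTZMANN_EV_PER_K):
    """Rate ratio (warm / cold) implied by exp(-E/kT), temperatures in deg C."""
    cold_k = cold_c + 273.15  # assumed Celsius-to-kelvin offset
    warm_k = warm_c + 273.15
    return math.exp((E / k) * (1.0 / cold_k - 1.0 / warm_k))


if __name__ == "__main__":
    print(f"size range:        {size_fold_change(1e8, 1e-12):.3g}-fold")
    print(f"temperature range: {temperature_fold_change(0.0, 40.0):.3g}-fold")
```

Because both effects enter Eq. 2 multiplicatively, the two ratios can be multiplied to compare organisms that differ in both size and temperature.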
Rearranging terms in Eq. 2 and taking logarithms yields:

ln(α M^{1/4}) = -E(1/kT) + C    [3]

or

ln(α e^{E/kT}) = -(1/4) ln M + C    [4]

where C = ln(fν b_{o}).
Model Predictions
Eqs. 3 and 4 correct for mass and temperature, respectively, and lead to three explicit predictions. The first prediction is that the logarithms of mass-corrected substitution rates should be linear functions of 1/kT with slopes of −E ≈ −0.65 eV (Eq. 3), reflecting the kinetics of aerobic metabolism. The second prediction is that the logarithms of temperature-corrected substitution rates should be linear functions of ln M with slopes of approximately −1/4 (Eq. 4), reflecting the allometric scaling of mass-specific metabolic rate (14). Finally, if these first two predictions hold, then the third prediction is that, for a given gene, the number of substitutions per site per unit mass-specific metabolic energy, fν, should be approximately invariant across taxa.
Methods
Calculation of Substitution Rates. Estimated rates of substitution, α, were compiled from multiple published sources for mitochondrial and nuclear genomes (respectively, Appendixes 2 and 3, which are published as supporting information on the PNAS web site). Together, these data represent several major taxonomic groups (e.g., invertebrates, fish, amphibians, reptiles, birds, and mammals), which span 10 orders of magnitude in body size and the biological temperature range 0°C to 40°C. Sequence divergence, D, was estimated by using direct sequencing methods for all sequences considered here except for the entire mitochondrial genome, where the restriction fragment length polymorphism technique was used. For the mitochondrial genome, estimates of sequence divergence were from four different coding regions (12S rRNA, 16S rRNA, cytochrome b, and whole genome). For the nuclear genome, estimates of sequence divergence were from two published sources based on rates of silent substitution in coding regions. In the first source, divergence estimates were calculated for 11 pairs of primates based on globin gene data (6). In the second source, estimates were obtained for 23 pairs of mammalian taxa that encompass 17,208 protein-coding DNA sequences from 5,669 nuclear genes and 326 species (18) (Appendix 3).
Times of divergence, τ, in millions of years (My), were independently estimated by using paleontological data (e.g., fossil records and geological events), and varied by ≈2 orders of magnitude for the mitochondrial data (0.43–38 My) and 1 order of magnitude for the nuclear data (5.5–56.5 My) (Appendixes 2 and 3). Substitution rates were then calculated as α = D/(2τ), the average for the two lineages over time τ (Appendix 4, which is published as supporting information on the PNAS web site). Although not all sources used the same mathematical model to estimate D in mitochondrial genomes, variation caused by differences in methodology is small (19) compared with the predicted effects of body size and temperature.
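As a minimal illustration of the rate calculation, the sketch below divides a pairwise divergence by twice the divergence time, because the divergence accumulates along both descendant lineages. The function name and the example values are hypothetical and are not taken from the appendixes.

```python
def substitution_rate(divergence, divergence_time_my):
    """Average per-lineage substitution rate, alpha = D / (2 * tau).

    divergence:          pairwise sequence divergence D between two taxa
    divergence_time_my:  time since their common ancestor, tau, in My
    """
    if divergence_time_my <= 0:
        raise ValueError("divergence time must be positive")
    return divergence / (2.0 * divergence_time_my)


if __name__ == "__main__":
    # Hypothetical pair: 10% pairwise divergence accumulated over 5 My.
    alpha = substitution_rate(0.10, 5.0)
    print(f"alpha = {alpha:.3f} substitutions per site per My per lineage")
```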
Body Size and Temperature Estimates. The formula for estimating substitution rate (α = D/2τ) is an average for two descendant lineages that may differ in body mass. To account for differences in substitution rates caused by differences in body mass between the two lineages, we take the “quarter-power average,” which controls for the greater influence of the smaller, more rapidly evolving lineage on the calculated substitution rate (Appendix 4). Body temperatures of endothermic birds and mammals were estimated from the literature and varied between ≈35°C and 40°C. Body temperatures of ectotherms were estimated as the mean annual ambient temperature where the organisms presently occur, or in the case of some fishes, the temperature of the preferred habitat. This estimation assumes that extant ectotherms are approximately in thermal equilibrium with their environment, and that they occur in a similar thermal environment as their ancestors.
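The exact definition of the quarter-power average is given in Appendix 4, which is not reproduced here. One form consistent with the description is the mass whose M^{-1/4} equals the mean of the two lineages' M^{-1/4} values. The sketch below uses that form as an assumption, not as the authors' stated formula.

```python
def quarter_power_average(mass_a_g, mass_b_g):
    """Effective mass for a pair of lineages (assumed form, see lead-in).

    Averages M^(-1/4) over the two lineages and converts the mean back to a
    mass, so the smaller (faster-evolving) lineage is weighted more heavily
    than it would be in an arithmetic mean of the masses.
    """
    if mass_a_g <= 0 or mass_b_g <= 0:
        raise ValueError("masses must be positive")
    mean_quarter_power = 0.5 * (mass_a_g ** -0.25 + mass_b_g ** -0.25)
    return mean_quarter_power ** -4


if __name__ == "__main__":
    # Hypothetical pair: a 1 g lineage and a 16 g lineage.
    print(quarter_power_average(1.0, 16.0))   # effective mass, in g
    print((1.0 + 16.0) / 2.0)                 # arithmetic mean, for contrast
```

Under this assumed form, the effective mass always lies between the two masses and falls below their arithmetic mean whenever the masses differ.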
Assessing Effects of Body Mass and Temperature. Our methods for estimating body size and temperature likely introduce substantial error into these predictor variables. This violates the assumptions of type I regression (20). We therefore used type II regression to assess the quantitative effects of body size and temperature on substitution rates. However, before fitting type II regression models, we first determined whether body size and temperature had significant independent effects on substitution rates. This process was necessary because, for the data considered here, the largest animals all are endotherms, resulting in a positive correlation between body size and temperature. We therefore fitted type I multiple regression models for all of the data shown below. For the mitochondrial data that include both ectotherms and endotherms, we fitted a model of the form ln(α) = β ln(M) − E(1/kT) + C, and for the mammalian nuclear data, we fitted a model of the form ln(α) = β ln(M) + C. This procedure simultaneously estimates the allometric scaling exponent, β, and activation energy, E, of substitution rates. Multiple regression analyses indicated significant independent effects for body size and temperature (P < 0.05) for all data except the cytochrome b data shown in Figs. 1C and 2C. We note, however, that, on average, the type I regression coefficients were smaller in magnitude than the predicted values of −0.25 for β (, n = 6) and 0.65 eV for E (Ē = 0.40 eV, n = 4) (see Appendix 4).
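The difference between the two regression types can be illustrated with a small sketch. It compares an ordinary least-squares (type I) slope with a reduced major axis slope, which is one common form of type II regression. The text does not say which type II method was used, so the choice of reduced major axis is an assumption, and the data below are synthetic.

```python
import math
import statistics


def ols_slope(x, y):
    """Type I (ordinary least-squares) slope of y on x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def rma_slope(x, y):
    """Reduced major axis slope: sign of the correlation times sd(y)/sd(x)."""
    sign = math.copysign(1.0, ols_slope(x, y))
    return sign * statistics.stdev(y) / statistics.stdev(x)


if __name__ == "__main__":
    # Synthetic example: ln(mass) versus ln(rate) with a true slope of -1/4
    # plus fixed scatter. These are not data from the paper.
    ln_mass = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
    scatter = [0.3, -0.2, 0.1, -0.3, 0.2, -0.1]
    ln_rate = [-0.25 * m + 26.0 + e for m, e in zip(ln_mass, scatter)]
    print("type I slope: ", ols_slope(ln_mass, ln_rate))
    print("type II slope:", rma_slope(ln_mass, ln_rate))
```

The least-squares slope equals the reduced major axis slope multiplied by the absolute value of the correlation coefficient, so the type I estimate can never be larger in magnitude than the type II estimate.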
Results and Discussion
Data support each of the model's three predictions. First, the logarithm of mass-corrected substitution rate is a linear function of inverse absolute temperature for the four molecular clocks from the mitochondrial genome (Fig. 1). Temperature accounts for 25–80% of the variation in mass-corrected substitution rates among diverse organisms, including endotherms (body temperatures of ≈35–40°C) and ectotherms from a broad range of thermal environments (≈0–30°C). The type II regression slopes of these lines all are close to the predicted value of −0.65 eV based on the kinetics of metabolism (Table 1). Thus, contrary to some recent reports (21), our results, which incorporated a wide range of body temperatures, indicate that nucleotide substitution rates are strongly temperature-dependent. Second, log-log plots of temperature-corrected substitution rates versus body mass all are well fitted by straight lines (r^{2} = 0.23–0.74) for these four clocks, and the slopes all are close to the predicted value of −1/4 (Fig. 2 and Table 1). Substitution rates therefore show the same M^{-1/4} allometric scaling as mass-specific metabolic rate B. Third, both endotherms and ectotherms (vertebrates and invertebrates) fall on the same lines in these relationships, supporting the prediction that fν is approximately invariant across taxa for a given gene. Building on previous work showing correlations of substitution rate to body size (6), these results show that all animals cluster around a single line that is predicted by our model. Note that the model quantifies the combined effects of body size and temperature. Analyses that consider these variables separately, like much of the previous literature, explain much less of the observed variation in substitution rates (Table 2).
Still further support for the predicted mass dependence of molecular evolution (prediction 2) comes from analysis of two data sets on rates of silent substitutions in coding sequences of mammalian nuclear genomes. For globin data in primates, a log-log plot of substitution rate versus body mass gives a straight line with a slope close to the predicted value of −1/4 (−0.27, 95% confidence interval: −0.20 to −0.34; r^{2} = 0.85, Fig. 3A). For a broader assortment of mammals and sequences (Appendix 3), a log-log plot also gives a straight line with a slope close to the predicted value of −1/4 (−0.21, 95% confidence interval: −0.18 to −0.23; r^{2} = 0.77, Fig. 3B). As predicted, both lines show very similar intercepts (24.79 and 24.81). Thus, it appears that mammalian nuclear genomes have slopes for the mass dependence of substitution rates that are similar to those observed in mitochondrial genomes for a broader range of taxa (Fig. 2), but intercepts that are slightly lower. We note, however, that in Figs. 1–3, observed values deviated by as much as 2.7-fold from the predicted values (Table 1). This residual variation likely indicates the importance of factors other than body size and temperature that affect measured substitution rates. Yet, these deviations of up to 2.7-fold are small compared with the ≈100-fold variation explained by our model.
The fact that the model predicts empirically observed substitution rates supports the hypothesis that there is a direct relationship between the rate of energy transformation in metabolism and the rate of nucleotide substitution. The number of substitutions per site per unit of mass-specific metabolic energy, fν, can be calculated from the y-intercepts (C) in Figs. 1–3: fν = e^{C}/b_{o} (Eqs. 3 and 4). Taking the fitted intercept of C ≈ 26 for mtDNA (Table 1), and b_{o} = 1.5 × 10^{8} W·g^{-3/4} (12), we obtain fν ≈ 4 × 10^{-13} g·substitutions·site^{-1}·J^{-1}. Thus, ≈2.4 × 10^{12} J of energy must be fluxed per g of tissue to induce one substitution per site in the mitochondrial genome.
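The calculation of fν from the intercept can be retraced numerically. The units of α behind the fitted intercept are not stated in this section, so the sketch below assumes that α is expressed as percent sequence change per site per My. Under that assumption the arithmetic lands on the quoted order of magnitude. The function and constant names are illustrative only.

```python
import math

SECONDS_PER_MY = 365.25 * 24 * 3600 * 1e6  # seconds in one million years


def substitutions_per_joule(intercept_c, b0_w_per_g34, alpha_in_percent=True):
    """Return (f*nu, 1/(f*nu)) from f*nu = exp(C) / b_o.

    intercept_c:      fitted intercept C = ln(f * nu * b_o)
    b0_w_per_g34:     normalization constant b_o, in W g^(-3/4)
    alpha_in_percent: ASSUMPTION that alpha is in percent per site per My
    """
    per_my = math.exp(intercept_c) / b0_w_per_g34
    if alpha_in_percent:
        per_my /= 100.0                      # percent to fraction of sites
    f_nu = per_my / SECONDS_PER_MY           # g substitutions site^-1 J^-1
    return f_nu, 1.0 / f_nu                  # second value: J per g per substitution


if __name__ == "__main__":
    f_nu, joules_per_substitution = substitutions_per_joule(26.0, 1.5e8)
    print(f"f*nu            = {f_nu:.2e} g substitutions per site per J")
    print(f"energy per site = {joules_per_substitution:.2e} J per g of tissue")
```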
Differences in the fitted intercepts, and therefore fν, among genes, genomes, and types of substitutions may reflect the influence of other factors in addition to body size and temperature. For example, f is known to vary from near 1 for synonymous codon sites and noncoding regions to near 0 for nonsynonymous sites, and ν differs between mitochondrial and nuclear genomes (19). The model could be fine-tuned to incorporate these and other possible sources of variation. In Table 1, the calculated intercepts for overall rates of substitution for mtDNA, rRNA, and cytochrome b are all ≈26. The intercept for cytochrome b transversions is lower (24.61), as are those for silent nuclear substitutions (24.79 and 24.81). These differences are consistent with current theory and data finding lower rates of transversions than transitions and lower overall rates of substitution in nuclear than in mitochondrial genomes (19).
We illustrate some of the evolutionary implications of this model with three examples. First, Fig. 4 shows estimates of a proposed molecular clock for mammalian divergence times (18), some of which differ substantially from fossil-based estimates. Molecular and fossil-based estimates are in close agreement for humans and chimpanzees (Homo and Pan, 5.5 My) because the clock calibrated in ref. 18 was disproportionately influenced by the preponderance of data for these and other similarly large mammals. However, their clock-estimated divergence date for Hystricognath rodents predates the fossil estimate by >2-fold (115 My vs. 56.5 My), and for the much smaller rodent genera Mus and Rattus by >3-fold (41 My vs. 12.5 My). Our model largely reconciles these discrepancies by incorporating the effects of body size and obtaining a date close to the fossil estimate (Fig. 4; we corrected only for mass, because these mammals have similar body temperatures). The procedure of taking the quarter-power average corrects for the greater influence of smaller taxa on rates of divergence because of their higher mass-specific metabolic rates (see Appendix 4).
Second, our model suggests how differences in body size might explain the “hominoid slowdown hypothesis,” which proposed that rates of molecular evolution have slowed in hominoids since their split from Old World monkeys (22). Based on differences in average body mass between extant hominoids (50 kg) and Old World monkeys (7 kg), our model predicts a ≈0.6-fold slowdown [= (7 kg/50 kg)^{1/4}], close to the estimated 0.7 (22).
Third, our model suggests that differences in temperature may account for the nearly 4-fold discrepancy between a molecular and a geological estimate of the age of notothenioid Antarctic fishes (11 My vs. 38 My) (23). Assuming that the temperate-zone ectotherms used to calibrate the clock occurred at ≈15°C, whereas the notothenioid fishes occurred at ≈0°C, our model appears to reconcile this discrepancy (e^{-E/k(273+15)}/e^{-E/k(273+0)} ≈ 4). These three examples illustrate how calibrating molecular clocks for body size and temperature may provide insights into evolutionary history. Metabolic rates of plants and microbes show similar body size and temperature dependence as animals (12). We expect that the theory developed here should be applicable to these organisms. This expectation is supported by a recent study showing the temperature dependence of mutation rates in plants (9).
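The second and third examples both reduce to a ratio of predicted rates between a focal lineage and a reference lineage. The sketch below expresses that ratio with a single hypothetical helper built on Eq. 2. The constants fν and b_{o} cancel in the ratio, and the kelvin conversion follows the 273 + T form used in the text. The 37°C body temperature and the 100 g fish mass are placeholders that cancel out of the respective ratios.

```python
import math

BOLTZMANN_EV_PER_K = 8.62e-5  # k, value quoted in the text
ACTIVATION_ENERGY_EV = 0.65   # E, average value quoted in the text


def relative_rate(mass_g, temp_c):
    """Substitution rate up to the constant f*nu*b_o: M^(-1/4) * exp(-E/kT)."""
    temp_k = 273.0 + temp_c
    boltzmann = math.exp(-ACTIVATION_ENERGY_EV / (BOLTZMANN_EV_PER_K * temp_k))
    return mass_g ** -0.25 * boltzmann


def rate_ratio(focal, reference):
    """Predicted rate of the focal lineage divided by that of the reference.

    Each argument is a (mass in g, temperature in deg C) pair.
    """
    return relative_rate(*focal) / relative_rate(*reference)


if __name__ == "__main__":
    # Hominoids (50 kg) versus Old World monkeys (7 kg), same body temperature.
    print("hominoid slowdown:", rate_ratio((50e3, 37.0), (7e3, 37.0)))
    # Calibration taxa at 15 deg C versus notothenioids at 0 deg C, same mass.
    print("calibration / notothenioid:", rate_ratio((100.0, 15.0), (100.0, 0.0)))
```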
These results also may have broader implications for understanding the factors controlling the overall rate of evolution. The central role of metabolic rate in controlling biological rate processes implies that metabolic processes also govern evolutionary rates at higher levels of biological organization where the neutral molecular theory does not apply. So, for example, the rate and direction of phenotypic evolution ultimately depend on the somewhat unpredictable action of natural selection. However, the overall rate of evolution ultimately is constrained by the turnover rate of individuals in populations, as reflected in generation time, and the genomic variation among individuals, as reflected in mutation rate (16, 24). Both of these rates are proportional to metabolic rate, so Eq. 1 also may predict the effects of body size and temperature on overall rates of genotypic and phenotypic change. Such predictions would be consistent with general macroevolutionary patterns showing that most higher taxonomic groups originate in the tropics where temperatures are high (25), speciation rates decrease with decreasing temperature from the equator to the poles (26, 27), biodiversity is highest in the tropics (28), and smaller organisms evolve faster and are more diverse than larger organisms (29).
Acknowledgments
We thank F. Allendorf, E. Charnov, H. Olff, V. Savage, T. Turner, and W. Woodruff for comments or discussions that improved this manuscript and S. Kumar for providing us with his data. J.F.G., G.B.W., and J.H.B. acknowledge support of the Thaw Charitable Trust and a Packard Interdisciplinary Science Grant. G.B.W., A.P.A., and J.H.B. were supported by the National Science Foundation. G.B.W. acknowledges the hospitality of the Mathematics Department at Imperial College, London, and the support of the Engineering and Physical Sciences Research Council.
Footnotes

† To whom correspondence should be addressed. E-mail: gillooly@unm.edu.

Author contributions: J.F.G., A.P.A., G.B.W., and J.H.B. performed research; J.F.G., A.P.A., and G.B.W. contributed new reagents/analytic tools; J.F.G. and A.P.A. analyzed data; and J.F.G., A.P.A., G.B.W., and J.H.B. wrote the paper.

Abbreviation: My, millions of years.
 Copyright © 2005, The National Academy of Sciences
References

Zuckerkandl, E. & Pauling, L. (1965) in Evolving Genes and Proteins, eds. Bryson, V. & Vogel, H. J. (Academic, New York), pp. 97–166.

Alroy, J. (1999) Syst. Biol. 48, 107–118. pmid:12078635

Wray, G. A., Levinton, J. S. & Shapiro, L. H. (1996) Science 274, 568–573.

Ayala, F. J., Rzhetsky, A. & Ayala, F. J. (1998) Proc. Natl. Acad. Sci. USA 95, 606–611. pmid:9435239

Martin, A. P. & Palumbi, S. R. (1993) Proc. Natl. Acad. Sci. USA 90, 4087–4091. pmid:8483925

Bleiweiss, R. (1998) Proc. Natl. Acad. Sci. USA 95, 612–616. pmid:9435240

Rand, D. M. (1994) Trends Ecol. Evol. 9, 125–131.

Savage, V. M., Gillooly, J. F., Brown, J. H., West, G. B. & Charnov, E. L. (2004) Am. Nat. 163, E429–E441.

Gillooly, J. F., Brown, J. H., West, G. B., Savage, V. M. & Charnov, E. L. (2001) Science 293, 2248–2251. pmid:11567137

West, G. B., Brown, J. H. & Enquist, B. J. (1997) Science 276, 122–126. pmid:9082983

Fisher, R. A. (1930) The Genetical Theory of Natural Selection (Clarendon, Oxford).

Dobzhansky, T. (1951) Genetics and the Origin of Species (Columbia Univ. Press, New York).

Kumar, S. & Subramanian, S. (2002) Proc. Natl. Acad. Sci. USA 99, 803–808. pmid:11792858

Li, W. H. (1997) Molecular Evolution (Sinauer, Sunderland, MA).

Eastman, J. T. & McCune, A. R. (2000) J. Fish Biol. 57, 84–102.

Kimura, M. (1983) The Neutral Theory of Molecular Evolution (Cambridge Univ. Press, Cambridge, U.K.).

Flessa, K. W. & Jablonski, D. (1996) in Evolutionary Paleobiology, eds. Jablonski, D., Erwin, D. H. & Lipps, J. H. (Univ. of Chicago Press, Chicago), pp. 376–397.

Stehli, F. G., Douglas, D. G. & Newell, N. D. (1969) Science 164, 947–949.

Allen, A. P., Brown, J. H. & Gillooly, J. F. (2002) Science 297, 1545–1548. pmid:12202828