John R Speakman
1 From the Department of Zoology, University of Aberdeen, Scotland, United Kingdom.
2 Reprints not available. Address correspondence to JR Speakman, Department of Zoology, University of Aberdeen, Aberdeen AB24 2TZ, Scotland, United Kingdom. E-mail: J.Speakman@abdn.ac.uk.
Am J Clin Nutr 1998;68(suppl):932S–8S. Printed in USA. © 1998 American Society for Clinical Nutrition

ABSTRACT Scientists have been measuring energy expenditure by using gas exchange for the past 200 y. This technique is based on earlier work in the 1660s. Gas exchange in respirometers provides accurate and repeatable measures of resting metabolic rate. However, it is impossible to duplicate in a respirometry chamber the diversity of human behaviors that influence energy expenditure. The doubly labeled water technique is an isotope-based method that measures the energy expenditure of unencumbered subjects from the divergence in enrichments of 2 isotopic labels in body water: 1 of hydrogen and 1 of oxygen. The method was invented in the 1950s and applied to small animals only until the early 1980s, mostly because of the expense. Since 1982, when the first study in humans was published, its use has expanded enormously. Although there is some debate over the precise calculation protocols that should be used, the differences between alternative calculations result in relatively minor effects on total energy expenditure estimates (≈6%). Validation studies show that for groups of subjects the method works well, but that precision is still relatively poor (8–9%) and consequently the method is not yet sufficiently refined to provide estimates of individual energy expenditures. Am J Clin Nutr 1998;68(suppl):932S–8S.

KEY WORDS Doubly labeled water, gas exchange, respirometry, energy expenditure, carbon dioxide production, indirect calorimetry, dilution spaces

INTRODUCTION
The knowledge that respiration and ventilation are essential functions of life dates back to at least biblical times. In the Old Testament, Book of Psalms, for example, it is stated with respect to animals, “When thou takest away their breath they die” (Psalm 104). Although it was clear to ancient humans that animals had to breathe to live, the exact physiologic function of breathing was obscure. For many years it was widely believed that inspired air served to cool the heat generated by a fire burning in the heart (1).

Scientific study of animal respiration only commenced in the 1600s. In 1660 Robert Boyle observed that mice that had been sealed into bell jars expired at the same time that a burning flame became extinguished. Boyle thus established 2 important principles: the equivalence of fire and life as combustion processes and the requirement of air to support these processes.

Less well known, but probably of greater significance, was the work of John Mayow. In 1668 Mayow placed mice in sealed jars over water and charted the change in volume of the air inside the jar from the movement of the water. He observed that mice died when they had consumed about one-fourteenth of the air in the bell jar. Mayow accordingly established the idea that the air consists of different parts, only some of which are usable for the process of respiration. A significant discovery, for which he is not generally credited, is the invention of a chamber that allowed the quantification of the consumed portion; this was the first respirometer. The significance of Mayow’s observations was not widely recognized, mostly because of the prevailing alchemic framework within which they were interpreted.

A century after the innovative work of Boyle and Mayow, the French chemists Lavoisier and Seguin started systematic investigations of respiration as a process analogous to combustion. In the intervening century, important advances had been made toward the understanding of the chemistry of gases: in 1757 Joseph Black had discovered carbon dioxide (“fixed air”), and in 1774 Joseph Priestley had discovered oxygen. The procedures used by Lavoisier and Seguin mimicked closely those developed by Mayow, the key difference being the framework within which the observations were interpreted. The methods involved confining animals or humans in chambers to quantify their consumption of oxygen and production of carbon dioxide. The chambers they used were relatively unsophisticated. Animals were still confined in bell jars, and human subjects were confined in varnished silk bags secured around the nose and lips with pitch.

Lavoisier and Seguin made several important discoveries about oxygen consumption. First, they found that larger persons consume more oxygen than smaller persons. Second, they found that people sitting quietly at rest consumed less oxygen than those standing up or moving about. Finally, they discovered that after a meal, oxygen consumption was elevated even in a person sitting at rest. Perhaps most importantly, the work of Lavoisier and Seguin established the methodology of indirect calorimetry that has remained the benchmark method for the quantification of animal and human energy expenditure to this day.

Since the end of the 18th century, the chambers in which animals and humans are confined have become increasingly sophisticated. Moreover, sealed systems have been replaced with open flow systems linked to advanced gas analysis equipment, so
subjects no longer have to die to provide a measurement. Nevertheless, it is clear that no matter how intricately designed such chambers become they will never be able to reproduce the complexity of activities in which people are engaged as they go about their routine lives. Yet, our understanding of many of the nutritional problems facing humans depends on reliable and accurate measurements of their energy demands as they go about their everyday business. The inadequacy of traditional calorimetry apparatus has been recognized for a long time, and there have been many attempts to develop methods, such as heart-rate monitoring, that enable the energy demands associated with free-living activities to be monitored (2–4). The doubly labeled water (DLW) technique is a method that allows the measurement of energy demands of free-living subjects. The success of this method prompted Prentice (5) to remark that its development was as significant an event in the history of animal and human nutrition as the work of Lavoisier and Seguin had been.

HISTORY OF THE DOUBLY LABELED WATER TECHNIQUE
The discovery of rare heavy isotopes of oxygen and hydrogen in the late 1920s and the 1930s transformed many areas of biology. This is because these substances were chemically and physically almost identical to the common light forms of the elements. The rare isotopes were thus ideal as tracers for the behavior of oxygen and hydrogen, and therefore water, in the body. Until their discovery, scientists relied on water-soluble dyes to investigate water dynamics. For example, known amounts of the dye Evans blue were injected into subjects to see how diluted it became and thus evaluate their plasma blood volume (eg, 6). In the early years after the discovery of isotopes, the major problems with their use related to their expense, lack of availability, and, for stable forms, the lack of equipment capable of accurately measuring their abundance. These problems were solved principally as a byproduct of the need to understand more about the behavior of isotopes and to artificially enrich them during the development of atomic weapons at the end of the Second World War.

By 1949, Lifson et al (7), at the University of Minnesota, had performed several experiments in mice that involved injecting them with stable isotopes of oxygen in water or forcing them to breathe air enriched with ¹⁸O. These experiments clearly showed that the oxygen in body water was in complete isotopic equilibration with the oxygen in respiratory carbon dioxide. This equilibration occurred principally because of isotope-exchange reactions catalyzed by carbonic anhydrase during blood gas transport. The consequence of these exchange reactions is that a dose of oxygen isotope introduced into the body of an animal is removed by the flow rates of water, inspired oxygen, and expired carbon dioxide through that body. Although this consequence was not stated by Lifson et al (7), its significance was clearly not lost on the authors. If an isotopic label of oxygen is eliminated from the body by both carbon dioxide and water, an isotopic label of hydrogen would be eliminated only by the water. Thus, the difference in the elimination of the 2 labels simultaneously introduced into the animal (thus DLW) would provide a measure of the carbon dioxide production and hence indirectly the energy expenditure.

It took 6 y to develop this simple theory into a working method. The seminal publication on the DLW technique, which involved comparisons of estimates of carbon dioxide production by standard respirometry and the DLW method in 15 laboratory mice, was published in 1955 (8). The technique provided an estimate that differed from the standard method by ≈2–3% on average, depending on the assumptions made about the behavior of isotopes both in the body and during elimination from it. It was immediately clear that the method worked and that it had a tremendous advantage over standard respirometry measurements. As Lifson et al (8) put it in the introduction to their 1955 paper, “a measurement is made possible merely by taking 2 blood samples to reconstruct the isotope elimination curves.” This meant the animal could perform a whole variety of natural behaviors between the taking of the 2 samples unconstrained by confinement in a respirometry chamber. Yet it would be another 9 y before this potential advantage of the method was actually exploited to study metabolism of a wild animal (9). The method was used to measure the energy costs of flying in homing pigeons (Columba livia) (9) in 1964. The author was also based at Minnesota and associated with Lifson’s group. Indeed, Lifson’s group was the only group in the world to use the method between its original development and 1970, and during that time they produced only 9 papers using it, of which 6 were validations (all on small rodents), 2 were applications, and 1 was a complete theoretical analysis of the method and its underlying assumptions (10). This theoretical paper has formed the basis of all the subsequent theoretical discussions of the method. Few of the advances made since the mid-1960s are not addressed in this paper. By the mid-1990s, 30 y after publication, records from the Science Citation Index (Institute for Scientific Information, University of Auckland, New Zealand) indicate that this paper was cited at least 50 times/y.

In the 1970s, several other groups started to use the method. Studies by these groups exploited the ability of the method to measure the free-living energy demands of small animals. The first measures on a free-ranging mammal (pocket mouse, Perognathus formosus) were made in 1970 (11) and the first on a free-living reptile (lizard, Sauromalus obesus) in 1972 (12). All the animals on which the method was used between 1955 and 1975 weighed < 1 kg. There were no attempts to apply the method to humans to evaluate free-living energy expenditures for 2 reasons. The first reason was that it was prohibitively expensive. In 1973, for example, it was estimated that to use the method on a single 70-kg human would cost ≈US$50 000 (13). For comparison, the most expensive production automobile in 1973 was the Rolls Royce Phantom VI that cost £17 817 (≈US$30 000 at current exchange rates). The second reason was that clinical problems involving disorders of energy balance (eg, obesity) were less prevalent in Western populations at the time (14, 15), and links between such disorders and other clinical problems were not fully understood. Thus, in the early 1970s we had available a method that would have cost the equivalent of 2 Rolls Royce automobiles per subject to implement and no immediately apparent clinical problem of sufficient importance to which the results could be applied.

The cost of ¹⁸O declined throughout the 1970s, and developments in mass spectrometer technology made it feasible to label subjects at much lower levels and still retain precision and accuracy of analysis. By 1980, the costs of a study of a 70-kg subject had declined to ≈US$2000. Moreover, the impetus to perform such studies was increasing because the prevalence of morbid obesity increased during the previous decade, and links between
extra elimination route is not; proportionally, it is much larger in the small mammal (33). Theoretical analyses of the pool structure show that the extent of the extra elimination offsets the effect of the extra pool (8, 33, 34). Thus, in small mammals with a high extra elimination route, the single-pool model works best, but in larger subjects the 2-pool model works best. Subsequent validations in animals of a range of body sizes confirmed this effect and suggested that the single-pool model is probably appropriate in animals weighing ≤ 5 kg (35).

The effect of body size on the appropriateness of different pool models raises the interesting question of what might be the most appropriate method to calculate the metabolism of small infants (< 3–4 kg). In fact, validation studies in infants clearly show that the 2-pool formulation gives a close fit to the calorimetric data even in preterm infants weighing < 2 kg (20, 21), although calculations using the single-pool model have not been assessed. It seems likely that the 2-pool model works in small humans because of a combination of 2 factors. First, small babies do not necessarily behave physiologically like adult small animals of the same body mass with respect to the magnitude of the subsidiary hydrogen elimination. Second, the extent of their extra hydrogen space is much smaller than that found in adults; consequently, the difference between the single- and 2-pool models is reduced. Indeed, in very small infants the ratio of hydrogen to oxygen dilution space does not differ from unity, and thus the 2-pool and single-pool models give identical results.

Using individual or population-based dilution spaces in the 2-pool model.
An issue that has emerged over the past decade is whether to use individual values of $N_o$ and $N_d$ in the 2-pool model equation or to replace these with a population-based average (25). If observed variation in $N_o$ and $N_d$ is biological, then it would make sense to use individual values for the dilution spaces. Alternatively, if the variation is mostly analytic it would make sense to use a population-based ratio of the 2 pools. Some authorities continue to use individual ratios. However, it is now widely accepted that the dominant source of variation in the observed pool size ratio ($N_d/N_o$) is analytic. This being the case, a population-based average would be better used in the equation. Schoeller et al (25), on the basis of a limited sample of animal studies, suggested that the oxygen pool exceeds $N$ by a factor of 1.01 and the hydrogen pool exceeds $N$ by a factor of 1.04, and thus the hydrogen pool exceeds the oxygen pool by a factor of 1.03. Therefore, the 2-pool model equation (6) can be reformulated as

$$ r\mathrm{CO_2} = \frac{N}{2.076}\,(1.01\,k_o - 1.04\,k_d) - 0.0246 \cdot 1.05 \cdot N\,(1.01\,k_o - 1.04\,k_d) \qquad (7) $$

where $N = (N_o/1.01 + N_d/1.04)/2$.

The appropriateness of the population-based average of 1.03 is debated. This value was rounded from a true value of 1.034 (25), but it was based on the relatively small sample size available at the time. Subsequent reevaluations using larger data sets have resulted in different population averages, ie, 1.0427 (36), 1.027 (37), 1.034 (38), and 1.035 (39). The differences in these estimates depend on the inclusiveness of the reviews. Some authors have concluded that analyses of deuterium directly from urine are inaccurate and have eliminated these from their compilations of data (eg, 38). The consequent differences in the formulation of the equation can result in biologically significant differences in the final estimate of carbon dioxide production (36, 38, 39).

I reviewed previously all the data available as of July 1996 on both animals and humans for this ratio (31). This survey revealed that across 15 studies in mammals, the mean (± SD) dilution space ratio was 1.041 ± 0.025 and across 9 studies in birds the ratio was 1.0427 ± 0.033. Across 19 studies comprising 590 adult humans the weighted mean was 1.0387 (31). This value includes all data, irrespective of whether analyses were performed directly on urine or on water distilled from the urine. Debate over the most appropriate population estimate is certain to evolve as more data are published. However, a consensus is likely to fall between the limits of 1.03 (25) and 1.0427 (36).

A summary of all the validation comparisons of DLW to indirect calorimetry in human subjects between 1982 and 1996 is provided in Table 1. It is not possible to compare the different formulations of the equations directly on a matched sample of subjects because the data necessary to recalculate early validations were not published in the papers. Overall (Table 2), across 201 subjects the Schoeller et al (25) equation, with the lowest population ratio (1.03), overestimated the simultaneous indirect calorimetry by 2.1%. In contrast, in 54 subjects the Speakman et al (36) equation, with the highest ratio (1.0427), underestimated simultaneous indirect calorimetry by 2.2%. The equation of Racette et al (38), with an intermediate ratio (1.034), overestimated by 1.2% on average (n = 36). A limited matched sample can be compared. In the same 36 individuals reviewed by Racette et al (38), the Schoeller et al (25) equation overestimated indirect calorimetry results by 3.3% and the Speakman et al (36) equation underestimated these results by 3.1%. Differences between the equations are consequently relatively minor, but should perhaps be borne in mind when relatively small differences are compared between studies that might have used different methods, or when comparisons are made between estimates of energy expenditure using DLW and other methods for determining energy demands, such as food intake.

It is important to recognize that the average comparisons across groups indicate only the efficacy of the method for defining the energy demands of groups. Individual comparisons of DLW to indirect calorimetry have much greater deviations than the group means. Across studies the precision is ≈10% on average (Table 1). Thus, at the extremes, some individuals may have DLW estimates > 20% divergent from simultaneous indirect calorimetry. There is some evidence that part of this variation can be attributed to analytic problems in certain laboratories; a multiple laboratory comparison (40) showed wide discrepancies in the isotope enrichment determinations of the same samples analyzed by different groups. Nevertheless, precision in the best laboratories is still relatively poor and the method is not yet sufficiently refined to make confident estimates of individual energy requirements. Repeatability of measurements made by the method, at ≈6–10%, confirms this (53).

The reasons for this lack of precision are not yet entirely clear. Precision may improve as the technical instrumentation for mass spectrometric determination of isotope abundances improves. However, in many circumstances the extents of deviations of estimates made by DLW and simultaneous indirect calorimetry already exceed the theoretical precision of the DLW measurement, given the precisions of the component analyses (54). This

TABLE 1
Validation studies of the doubly labeled water method in human subjects in comparison to indirect calorimetry (IC) and food intake/mass balance (FI) or parenteral nutrition (PN)1

| n | nobs | Mass (kg) | Com | Eqn | Abs (%) | Prec (%) | Arith (%) | n+ | n− | Min (%) | Max (%) | Reference |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 4 | 78.1 | FI | L+M | 5.0 | 5.6 | +2.1 | 3 | 1 | −5.8 | 7.1 | 19 |
| 2 | 2 | — | IC | ?L+M | 3.5 | 3.5 | −2.5 | 1 | 1 | −6.0 | 1.0 | 40 |
| 1 | 1 | 70.5 | IC | L+M | — | — | −4.6 | 0 | 1 | — | — | 23 |
| 5 | 5 | 72.9 | IC | L+M | 8.5 | 6.8 | +5.9 | 4 | 1 | −6.5 | 14.1 | 22 |
| 4 | 4 | — | IC | C | — | 2.0 | +2.0 | — | — | — | — | 31 |
| 6 | 6 | 68.9 | IC | L+M | 8.0 | 9.0 | +5.0 | 5 | 1 | — | — | 25 |
| 14 | 14 | — | IC | S | — | 7.0 | +1.0 | 10 | 4 | — | — | 25* |
| 14 | 14 | — | IC | L+M | — | 8.0 | +4.0 | 11 | 3 | — | — | 25* |
| 5 | 5 | 53.2 | PN | S | 5.7 | 5.9 | +3.3 | 4 | 1 | — | — | 41 |
| 4 | 4 | 1.59 | IC | C | 4.2 | 4.8 | −1.4 | 1 | 3 | −4.8 | 5.8 | 20 |
| 9 | 10 | 3.2 | IC | S | — | 6.2 | −0.9 | 5 | 5 | — | — | 21 |
| 5 | 5 | — | IC | S | 4.3 | 3.9 | +1.4 | 3 | 2 | — | — | 42 |
| 4 | 8 | — | IC | S | 4.8 | 7.0 | −1.0 | 3 | 5 | — | — | 43 |
| 8 | 8 | 2.67 | IC | S | 10.7 | 12.9 | −8.7 | — | — | −29.3 | 4.8 | 44 |
| 33 | 33 | — | ICFI | L+M | 6.9 | 6.6 | +4.3 | 26 | 7 | −8.7 | 20.2 | 30* |
| 33 | 33 | — | ICFI | S | 5.1 | 6.3 | +0.6 | 21 | 12 | −12.1 | 16.7 | 30* |
| 1 | 1 | 66.9 | FI | S? | — | — | −0.8 | 0 | 1 | — | — | 45 |
| 4 | 4 | 84.6 | IC | S | 2.5 | 2.5 | +15.3 | 4 | 0 | — | — | 46 |
| 12 | 12 | 108.0 | IC | S | 6.4 | 6.9 | −2.5 | 5 | 7 | — | — | 47 |
| 27 | 27 | 79.8 | FI | S | — | 0.7 | +2.5 | 17 | 10 | — | — | 48 |
| 8 | 8 | 74.1 | FI | S | 9.7 | 14.4 | +2.2 | 5 | 3 | −35.0 | 9.0 | 49 |
| 9 | 9 | 71.8 | IC | S | — | 4.6 | +1.8 | 6 | 3 | −4.0 | 12.0 | 50 |
| 6 | 6 | 50.1 | IC | C | 15.0 | 16.8 | +7.0 | 4 | 2 | −23.0 | 32.0 | 51 |
| 18 | 18 | — | IC | L+M | — | — | +23.1 | — | — | — | — | 52 |
| 18 | 18 | — | IC | S | — | 10.7 | +10.3 | — | — | — | — | 36* |
| 18 | 18 | — | IC | SNG | — | 9.3 | −0.4 | — | — | — | — | 36* |
| 36 | 36 | — | IC | S | — | 8.9 | +3.3 | — | — | — | — | 37* |
| 36 | 36 | — | IC | SNG | — | 7.9 | −3.1 | — | — | — | — | 37* |
| 36 | 36 | — | IC | R | — | 8.5 | +1.2 | — | — | — | — | 37* |

1 Abs, absolute mean deviation (average percentage error with all signs made positive); Arith, arithmetic percentage error including signs; C, Coward 2-pool model using individual dilution spaces; Eqn, equation used to calculate the doubly labeled water results; L+M, Lifson and McClintock single-pool model; Mass, mean mass of subjects in kg; Max, maximum deviation; Min, minimum deviation; nobs, number of comparisons; n+, number of observations in which the DLW method exceeded the reference method; n−, number of observations in which the DLW method was lower than the reference method; Prec, precision calculated as the SD of individual estimates used to derive Arith; R, Racette et al, 1994 (2-pool model fixed ratio 1.034); S, Schoeller et al, 1986, equation A6 (2-pool model fixed pool ratio 1.03); SNG, Speakman et al, 1993 (2-pool fixed pool ratio 1.0432); *, review of previously published validations.
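
The summary columns defined in the table footnote (Abs, Arith, Prec, n+, n−, Min, Max) can be illustrated with a short Python sketch. It is only an illustration of those definitions, not the author's procedure: the function name `validation_summary` and the input values are hypothetical, and because the source does not say whether a sample or population SD was used for Prec, the population form is assumed here.

```python
from statistics import mean, pstdev


def validation_summary(dlw, reference):
    """Summarise paired DLW and reference-method estimates.

    dlw, reference: equal-length sequences of paired estimates in the same
    units (hypothetical inputs; only percentage differences are used).
    """
    if len(dlw) != len(reference) or not dlw:
        raise ValueError("need equal-length, non-empty sequences")
    # Signed percentage error of DLW relative to the reference method.
    errors = [100.0 * (d - r) / r for d, r in zip(dlw, reference)]
    return {
        "n_obs": len(errors),
        "abs": mean(abs(e) for e in errors),    # Abs: all signs made positive
        "arith": mean(errors),                  # Arith: signs retained
        "prec": pstdev(errors),                 # Prec: SD of signed errors (population SD assumed)
        "n_plus": sum(e > 0 for e in errors),   # DLW above the reference
        "n_minus": sum(e < 0 for e in errors),  # DLW below the reference
        "min": min(errors),
        "max": max(errors),
    }


# Two hypothetical subjects whose DLW estimates are 6% below and 1% above
# the reference value.
summary = validation_summary([94.0, 101.0], [100.0, 100.0])
print(summary)
```

For these two hypothetical subjects the signed errors are −6.0% and 1.0%, so Abs is 3.5, Arith is −2.5, and the assumed population SD gives a Prec of 3.5. The example also shows why a small Arith can coexist with a much larger Abs or Prec: group means can agree well even when individual estimates do not.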

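
Equation 7 and the accompanying definition of N can also be written out numerically. The following is a minimal sketch that uses only the constants printed in the text (2.076, 0.0246, 1.05, 1.01, and 1.04). The function names and the example inputs are hypothetical, and the units (pool sizes in mol, elimination rates per day, giving carbon dioxide production in mol/d) are an assumption, because this section does not state them.

```python
def pool_size(n_o, n_d):
    """Body water pool N from the oxygen (n_o) and hydrogen (n_d) dilution
    spaces, using the population-based factors 1.01 and 1.04."""
    return (n_o / 1.01 + n_d / 1.04) / 2.0


def rco2_eq7(n_o, n_d, k_o, k_d):
    """Carbon dioxide production from equation 7.

    k_o, k_d: elimination rate constants of the oxygen and hydrogen labels
    (assumed to be expressed per day).
    """
    n = pool_size(n_o, n_d)
    # Rate-constant difference after correcting each rate by its pool factor.
    net_flux = 1.01 * k_o - 1.04 * k_d
    # First term: CO2 production before correction; second term: the
    # fractionation correction as printed in equation 7.
    return (n / 2.076) * net_flux - 0.0246 * 1.05 * n * net_flux


# Hypothetical subject: dilution spaces of 2020 and 2080 mol and
# elimination rates of 0.12 and 0.10 per day.
print(pool_size(2020.0, 2080.0))
print(rco2_eq7(2020.0, 2080.0, 0.12, 0.10))
```

Because the estimate depends on the difference between two corrected rate constants of similar size, small changes in the assumed pool factors shift the result noticeably, which is the reason the choice of population ratio discussed in the text matters.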