Lethal Genes
GENETICS | INVESTIGATION
ABSTRACT The effects of inbreeding on human health depend critically on the number and severity of recessive, deleterious mutations
carried by individuals. In humans, existing estimates of these quantities are based on comparisons between consanguineous and
nonconsanguineous couples, an approach that confounds socioeconomic and genetic effects of inbreeding. To overcome this
limitation, we focused on a founder population that practices a communal lifestyle, for which there is almost complete Mendelian
disease ascertainment and a known pedigree. Focusing on recessive lethal diseases and simulating allele transmissions, we estimated
that each haploid set of human autosomes carries on average 0.29 (95% credible interval [0.10, 0.84]) recessive alleles that lead to
complete sterility or death by reproductive age when homozygous. Comparison to existing estimates in humans suggests that
a substantial fraction of the total burden imposed by recessive deleterious variants is due to single mutations that lead to sterility or
death between birth and reproductive age. In turn, comparison to estimates from other eukaryotes points to a surprising constancy of
the average number of recessive lethal mutations across organisms with markedly different genome sizes.
KEYWORDS autosomal recessive disease; consanguinity; inbreeding; recessive lethal mutation; human
functionally important sites that, if mutated, would give rise to recessive lethal alleles. We term this quantity the “target size” for recessive lethal disorders and infer it from our estimate of R.

The completely recessive case: Using a diffusion model and a low mutation rate approximation, Li and Nei (1972) derived the expectation for the total number of heterozygotes that carry a recessive deleterious mutation in a finite population [designated by $n_1(p)$],

$$n_1(p) = 4 N N_e p \sqrt{\frac{\pi}{2 N_e s}}, \quad \text{for } N_e s \ge 5$$

(equation 7b in Li and Nei 1972), where $p$ is the initial frequency of the mutation, $N$ and $N_e$ the actual and effective population sizes, respectively, and $s$ the selection coefficient against homozygotes (Li and Nei 1972).

The total number of heterozygotes affected by a single recessive lethal mutation, $n_1$, can be obtained by replacing $p$ and $s$ by $1/(2N)$ and 1 in the above equation:

$$n_1 = \sqrt{2\pi N_e}.$$

For a gene with a mutation rate to recessive deleterious alleles of $\mu$, the expected frequency of heterozygotes in a random generation is then

$$f_1 = 2\mu n_1$$

(equation 18 in Li and Nei 1972). For simplicity, we assume that there are $m$ autosomal genes in the genome that each can lead to complete sterility or lethality between birth and reproductive age and that gene $i$ has $l_i$ sites at which mutations will give rise to recessive lethal alleles. We further assume that each site has the same per generation mutation rate, $\mu_{bp}$, so the total mutation rate to recessive lethal alleles at gene $i$ is

$$\mu_i = l_i \mu_{bp}.$$

The expected frequency of heterozygotes carrying a recessive lethal allele at this gene $i$ is

$$f_{1i} = 2\mu_i n_1 = 2\mu_i \sqrt{2\pi N_e},$$

and the expected frequency of recessive lethal alleles at this gene is approximately

$$q_i \approx \frac{f_{1i}}{2} = \mu_i \sqrt{2\pi N_e}.$$

Summing over the $m$ genes gives the expected number of recessive lethal alleles per haploid set of autosomes,

$$R = \sum_{i=1}^{m} q_i = \mu_{bp} \sqrt{2\pi N_e} \sum_{i=1}^{m} l_i,$$

where $\sum_{i=1}^{m} l_i$ is the target size for all autosomal recessive lethal mutations.

Assuming that the founders of S-leut Hutterites were drawn from a random-mating population at equilibrium, each of them should have carried twice that number of recessive lethals. Based on our estimate of R = 0.29, a mutation rate of $1.2 \times 10^{-8}$/bp per generation (Campbell et al. 2012), and a diploid effective population size of 20,000, the target size is then estimated to be $6.8 \times 10^{4}$ bp.

In a population with larger $N_e$, the efficacy of purifying selection against recessive alleles increases (proportional to $N_e^{1/2}$), but the increase in the mutational input (proportional to $N_e$) is greater than the increase in the efficacy of selection (Simons et al. 2014). Therefore, at equilibrium, an individual in a population with larger $N_e$ will have a greater expected number of recessive lethal mutations. The recent population growth experienced by humans represents a transition from small $N_e$ to large $N_e$, which will therefore lead to an increase in the average number of recessive lethal mutations per individual. As a result, the estimated target size after recent growth will be smaller than that estimated from the long-term $N_e$.

The partially recessive case: If a deleterious mutation leads to complete lethality (or sterility) in homozygotes ($s = 1$) and a decrease of $hs$ in fitness of heterozygotes, selection against the deleterious allele would mainly come from the death of heterozygotes, because the death of homozygotes is a rare event (when the allele frequency is low). As in the completely recessive case, we assume that there are $m$ autosomal genes in the genome that can mutate to deleterious alleles with these properties and that gene $i$ has $l_i$ such sites. Under these assumptions, the expected total number of heterozygotes affected by a mutation in a finite population at gene $i$ can be approximated as

$$n_1 = \frac{1}{hs} = \frac{1}{h}, \quad \text{for } 4 N_e s h \ge 1$$

(equation 6b in Li and Nei 1972). So the average frequency of those partially recessive lethal mutations is

$$q_i = \mu_i n_1 = \frac{\mu_i}{h},$$

which is independent of the effective population size and the same as obtained assuming mutation–selection balance (Haldane 1935).
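The target-size arithmetic above can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the inputs are the values quoted in the text (R = 0.29, a mutation rate of 1.2e-8 per bp per generation, Ne = 20,000), and the partially recessive variant assumes that the same summation over genes applies, with a purely illustrative dominance coefficient h = 0.01 that does not come from the text.

```python
import math


def heterozygotes_per_mutation(ne):
    """n1 = sqrt(2 * pi * Ne): completely recessive lethal case (s = 1)."""
    return math.sqrt(2.0 * math.pi * ne)


def target_size_recessive(r, mu_bp, ne):
    """Solve R = mu_bp * sqrt(2 * pi * Ne) * L for the target size L (in bp)."""
    return r / (mu_bp * heterozygotes_per_mutation(ne))


def target_size_partially_recessive(r, mu_bp, h, ne):
    """Solve R = mu_bp * L / h for L, using q_i = mu_i / h.

    Assumption: R is again the sum of q_i over genes. The approximation
    is only stated for 4 * Ne * s * h >= 1 with s = 1, so that is checked.
    """
    if 4.0 * ne * h < 1.0:
        raise ValueError("approximation requires 4 * Ne * h >= 1")
    return r * h / mu_bp


R = 0.29        # recessive lethals per haploid set of autosomes (from the text)
MU_BP = 1.2e-8  # mutation rate per bp per generation (from the text)
NE = 20_000     # diploid effective population size (from the text)
H = 0.01        # hypothetical dominance coefficient, for illustration only

recessive_bp = target_size_recessive(R, MU_BP, NE)
partial_bp = target_size_partially_recessive(R, MU_BP, H, NE)

print(f"completely recessive target size: {recessive_bp:,.0f} bp")
print(f"partially recessive target size (h = {H}): {partial_bp:,.0f} bp")
```

Under these inputs the completely recessive case gives a target size close to the 6.8 × 10^4 bp quoted above. The partially recessive value depends entirely on the assumed h.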
autosomal, recessive lethal mutations carried by the 64 founders is 3/0.081 = 37 or an average of 0.29 recessive lethal alleles in each haploid human genome (Figure 1A). To assess the uncertainty in this estimate, we estimated the posterior distribution of the mean number of mutations per haploid human genome conditional on observing exactly three diseases since 1950 (Materials and Methods and Figure 1B). If a uniform prior distribution is used, the posterior distribution has a mode of 0.29 and a 95% CI of [0.10, 0.84]. We also considered a uniform prior on the logarithmic scale to allow for uncertainty in the order of magnitude, and a similar 95% CI is obtained (i.e., [0.060, 0.70] mutations per haploid genome). The point estimate and 95% CI were similar when we used simulation results from the larger pedigree (see Materials and Methods).

Simulations further indicate that only a small fraction of the surviving recessive lethal mutations have been seen in homozygotes, so there are more hidden, recessive lethal mutations that are segregating among extant individuals in the pedigree. In fact, carrier screening has identified heterozygotes for three more recessive lethal mutations in the S-leut Hutterites in South Dakota, which have manifested themselves in Hutterites outside the pedigree under study (Table 2) (Chong et al. 2012). Based on our simulation results, we expect quite a few more recessive lethal mutations in addition to these cases, most of which remain unknown.

In generalizing from the results for the Hutterites to other human populations, one concern might be that their demographic history prior to the founder event in the 18th–19th centuries was atypical in ways that influence the number of recessive lethals carried by the founders. While transient demographic changes can have a marked impact on patterns of genetic variation, they are not expected to have a substantial effect on the average number of recessive lethal alleles carried by an individual, because their equilibrium frequency is reestablished on a relatively short timescale (Balick et al. 2014). For instance, after a bottleneck, this quantity of interest returns to the equilibrium value within $4\sqrt{N_0}$ generations (where $N_0$ is the original population size before the bottleneck); for this reason, this quantity is expected to be very similar for modern Africans and Europeans despite the out-of-Africa bottleneck. A similar concern might be that a long period of endogamy could have purged recessive deleterious alleles from the population that led to the Hutterites (Keller and Waller 2002). There is no evidence that such a demographic scenario occurred, but if it did, it is again unlikely to have had much of an effect: over the 15 generations between the origin of the Hutterites in the 1520s and the founding event, even relatively high levels of human inbreeding (F = 0.03) should decrease the mean allele frequency of recessive lethals only by 30% (Overall et al. 2002). Moreover, such a decrease would be lessened or nullified by reproductive compensation (Overall et al. 2002), as might occur in the Hutterites (Ober et al. 1999). These considerations suggest that estimates based on the Hutterites should be broadly applicable and would, if anything, be slightly lower than the mean number of recessive lethals carried by larger, outbred populations.

In that regard, we note that our estimate of the average number of recessive lethal mutations per haploid genome is lower than the previous estimates of the total number of “lethal equivalents” per haploid genome (0.56–0.7 in Bittles and Neel 1994; Bittles and Black 2010b). A lethal equivalent is defined as a locus or a set of loci that, when in the homozygous state, would cause on average one death, e.g., one lethal mutation or two mutations each with 50% probability of causing death (Morton et al. 1956). In other words, the total number of lethal equivalents in a haploid genome can be thought of as the sum of the deleterious effects of all recessive mutations carried by an individual. Comparison to estimates of this quantity suggests that, as expected, recessive lethal mutations are only a subset of the recessive mutational burden. Interestingly, however, the difference between our point estimate and previous estimates is only about twofold; even if we consider the lower bound of our credible interval on the mean number of recessive lethals, it is still about one-sixth of the total number of lethal equivalents. Thus, insofar as previous estimates are reliable, it appears that a substantial portion of the total burden of recessive mutations carried by humans is attributable to single mutations that, when homozygous, lead to sterility or death between birth and reproductive age.
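The point estimate and the comparison with lethal equivalents reduce to a few lines of arithmetic, sketched here with hypothetical variable names. The divisor 0.081 is used exactly as it appears in the text; reading it as the probability that a founder mutation manifests as a disease is an assumption, because its definition falls outside this excerpt.

```python
N_FOUNDERS = 64            # founders of the pedigree (from the text)
OBSERVED_DISEASES = 3      # recessive lethal diseases observed (from the text)
MANIFEST_PROB = 0.081      # divisor from the text; interpretation is assumed

# Total recessive lethal mutations carried by all founders.
total_founder_lethals = OBSERVED_DISEASES / MANIFEST_PROB

# Each founder contributes two haploid sets of autosomes.
per_haploid = total_founder_lethals / (2 * N_FOUNDERS)

# Previously published range of lethal equivalents per haploid genome.
LETHAL_EQUIV_LOW, LETHAL_EQUIV_HIGH = 0.56, 0.70
CI_LOWER = 0.10            # lower bound of the 95% credible interval

# Ratio of lethal equivalents to the point estimate ("about twofold").
fold_low = LETHAL_EQUIV_LOW / per_haploid
fold_high = LETHAL_EQUIV_HIGH / per_haploid

# Share of lethal equivalents explained at the lower credible bound
# ("about one-sixth").
share_low = CI_LOWER / LETHAL_EQUIV_HIGH
share_high = CI_LOWER / LETHAL_EQUIV_LOW

print(f"founder lethals: {total_founder_lethals:.1f}")
print(f"per haploid genome: {per_haploid:.2f}")
print(f"lethal equivalents / point estimate: {fold_low:.1f} to {fold_high:.1f}")
print(f"lower bound as share of lethal equivalents: {share_low:.2f} to {share_high:.2f}")
```

The ratios bracket the "about twofold" and "about one-sixth" figures quoted above.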
manifest after birth should be increased by 0.29/16 = 1.8% in offspring of first-cousin couples (assuming no difference in environmental factors). This prediction agrees well with the estimated 3.5–4.4% increased risk for prereproductive mortality and 1.7–2.8% increased incidence of congenital anomalies in children of first cousins above the general population risk (Hamamy et al. 2011).

Beyond the Hutterites, this approach can be applied to other isolated founder populations with limited immigration, for which there is reliable genealogical information since the founding and close to complete disease phenotyping in the relatively recent past, such as the Amish and the inhabitants of Norfolk Island (Macgregor et al. 2010; Hinckley et al. 2013).

In interpreting our estimates, an important consideration is that they are limited to lethal diseases that manifest themselves after birth. This issue is common to most studies that estimate the mutational burden in humans, because of the limited availability and reliability of data on prenatal loss. Studies that considered data on the frequency of miscarriages (i.e., a gestation age of ≥28 weeks) reported little or no effect of consanguinity on prenatal losses, while detecting clear-cut effects on postnatal mortality (Schull et al. 1970; Bittles and Makov 1988; Bittles and Black 2010b). This negative finding cannot be taken as strong evidence for the absence of embryonic recessive lethals in humans, as most losses due to embryonic lethals may occur during earlier stages of pregnancy. Even if the data on early pregnancy loss were available, the high rate of spontaneous pregnancy failure due to other causes (Leridon 1977) may obscure the difference between consanguineous and nonconsanguineous groups due to embryonic recessive lethals. In contrast to how little is known in humans, extensive mutation screens in mice reveal a high proportion (40–50%) of autosomal knockout mutations that cause deaths in prenatal stages when homozygous (Mitchell et al. 2001; White et al. 2013). If the proportion of embryonic lethals is similar for spontaneous mutations in humans, each human individual carries approximately one to two recessive lethal mutations that act across ontogenesis.

Genomic target size for recessive lethal mutations

Our results provide insight into the total number of autosomal sites in the human genome that, if mutated, give rise to recessive (or nearly recessive) lethal alleles. Those sites are of particular interest, because they are of critical functional importance; on the other hand, mutations at those sites are haplosufficient, in that one functional copy of the gene is enough to maintain fitness. Assuming a random-mating, diploid population with constant effective population size of 20,000 (as a proxy for the population from which Hutterite founders derived), a mutation rate of $1.2 \times 10^{-8}$/bp per generation, and an estimate of 0.29 recessive lethal mutations per haploid set of autosomes, we predict that there should be 68,000 autosomal base pairs at which mutations lead to recessive lethal disorders on or after birth (see Materials and Methods; using theory from Li and Nei 1972). Based on this estimate of the target size, we do not expect de novo recessive lethals that arose since the founding to manifest themselves as diseases since the 1950s. Considering a model of population growth that leads the current effective population size to be >20,000, the expected frequency of recessive lethal mutations in the population should be higher (see Materials and Methods), and the estimated target size smaller. While these estimates should not be taken too literally, as many recessive disease mutations are not point mutations (e.g., Boone et al. 2013), they provide a sense of the number of sites at which recessive lethal disease mutations may be discovered.

Moreover, this estimate of target size provides complementary information to population genetic approaches that aim to estimate the distribution of fitness effects of mutations from polymorphism and divergence and mostly learn about weaker selection coefficients (Eyre-Walker and Keightley 2007). These methods find that 25–40% of all amino acid changes in humans are strongly deleterious (i.e., have s > 1% in a genic selection model) (Yampolsky et al. 2005; Eyre-Walker et al. 2006; Boyko et al. 2008). Combining these estimates with our estimated target size would then suggest that 0.5–0.8% of strongly deleterious mutations are recessive lethals that are fatal between birth and reproductive age.

An important caveat is that recessive disease-causing mutations may not be completely recessive, in that carriers of one copy may also have a slight decrease in their fitness that is too subtle to be detectable in clinical diagnosis. If so, the mutations will segregate in the population at much lower frequencies due to selection against heterozygotes, and the target size could be larger. For instance, if the