Environment dominates over host genetics in shaping human gut microbiota


Human gut microbiome composition is shaped by multiple factors but the relative contribution of host genetics remains elusive. Here we examine genotype and microbiome data from 1,046 healthy individuals with several distinct ancestral origins who share a relatively common environment, and demonstrate that the gut microbiome is not significantly associated with genetic ancestry, and that host genetics have a minor role in determining microbiome composition. We show that, by contrast, there are significant similarities in the compositions of the microbiomes of genetically unrelated individuals who share a household, and that over 20% of the inter-person microbiome variability is associated with factors related to diet, drugs and anthropometric measurements. We further demonstrate that microbiome data significantly improve the prediction accuracy for many human traits, such as glucose and obesity measures, compared to models that use only host genetic and environmental data. These results suggest that microbiome alterations aimed at improving clinical outcomes may be carried out across diverse genetic backgrounds.

Figure 1: Genetic ancestry is not significantly associated with microbiome composition.
Figure 2: Genetic kinship is weakly associated with microbiome composition.
Figure 3: Limited evidence for microbiome associations with specific SNPs.
Figure 4: The gut microbiome can be used to infer a significant fraction of the variance of several human phenotypes.

We thank the Segal and Elinav group members for discussions; J. Goodrich for sharing the processed twins microbiome data with us; and participants and staff of the LifeLines DEEP cohort for their collaboration. S.C. thanks the Abisch–Frenkel Foundation. This study makes use of data generated by the Wellcome Trust Case Control Consortium. A full list of the investigators who contributed to the generation of the data is available from www.wtccc.org.uk. Funding for the project was provided by the Wellcome Trust under awards 076113 and 085475. E.S. is supported by the Crown Human Genome Center; the Else Kroener Fresenius Foundation; D. L. Schwarz; J. N. Halpern; L. Steinberg; and grants funded by the European Research Council and the Israel Science Foundation. E.E. is supported by Y. and R. Ungar, the Gurwin Family Fund for Scientific Research, the Leona M. and Harry B. Helmsley Charitable Trust, the Israel Science Foundation and the Helmholtz Foundation. E.E. holds the Sir Marc and Lady Tania Feldmann Professorial Chair in Immunology, is a senior fellow of the Canadian Institute for Advanced Research, and is an international scholar at the Bill and Melinda Gates Foundation and Howard Hughes Medical Institute. D.R. received a Levi Eshkol PhD Scholarship for Personalized Medicine by the Israeli Ministry of Science. LLD was made possible by grants from the Top Institute Food and Nutrition (GH001) to C.W. C.W. is funded by a European Research Council (ERC) advanced grant (FP/2007-2013/ERC grant 2012-322698), a Netherlands Organization for Scientific Research (NWO) Spinoza prize (NWO SPI 92-266) and the Stiftelsen Kristian Gerhard Jebsen foundation (Norway). A.Z. holds a Rosalind Franklin Fellowship (University of Groningen), ERC starting grant (715772) and NWO Vidi grant (178.056). J.F. is funded by an NWO Vidi grant (NWO-VIDI 864.13.013). A.Z. and J.F. are also funded by CardioVasculair Onderzoek Nederland (CVON 2012-03).

D.R., O.W. and E.B. conceived the project, designed and conducted all analyses, interpreted the results, wrote the manuscript and are listed in random order. A.K., A.V.V., J.F., C.W. and A.Z. performed the analyses of the Dutch cohort and interpreted the results. T.K., D.Z. and A.W. designed protocols and supervised data collection. T.K., D.Z., P.I.C., A.G., I.N.K. and N.B. conducted microbiome analyses. S.S. and D.L. designed nutritional and drug databases. N.Z., M.P.-F., D.I. and Z.H. coordinated and supervised clinical aspects of data collection. N.K., G.M. and B.C.W. coordinated and designed data collection. T.A.-S., M.L.-P. and A.W. developed protocols and performed genotyping and microbiome sequencing. S.C. designed the genetic analyses. E.E. and E.S. conceived and directed the project and analyses, designed the analyses, interpreted the results and wrote the manuscript.

Corresponding authors

Correspondence to Eran Elinav or Eran Segal.

Competing interests

The authors declare no competing financial interests.

Reviewer Information

Nature thanks M. Georges and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended Data Figure 1 Limited evidence for microbiome associations with genetic ancestry or kinship across multiple functional and taxonomic levels.

a–p, Each row is similar to Figs 1b, d–e, 2b, but is based on the abundance of bacterial genes (a–d), genera (e–h), genera based on 16S rRNA gene sequencing data (i–l) or phyla (m–p). a, d, e, h, m, p, n = 715 genotyped individuals; i, l, n = 481 individuals with 16S rRNA gene sequencing data; b, f, n, n = 737 individuals for whom the ancestries of all grandparents are known; j, n = 509 individuals with 16S rRNA gene sequencing data for whom the ancestries of all grandparents are known; c, g, o, n = 946 individuals; and k, n = 650 individuals with 16S rRNA gene sequencing data.

Extended Data Figure 2 Limited evidence for associations between microbiome β-diversity and specific SNPs.

The quantile–quantile plot shows that only two SNPs are significantly associated with microbiome β-diversity at P < 5 × 10⁻⁸, computed using a distance-based F test with n = 715 unrelated genotyped individuals. λGC, genomic inflation factor.
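
As an illustration of the genomic inflation factor named in this caption, the sketch below uses the common convention of converting each P value to a 1-degree-of-freedom χ² quantile and dividing the median by the expected median under the null. This convention is an assumption here: the caption does not state how λGC was computed for the distance-based F test, and the function name `genomic_inflation` is hypothetical.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Estimate lambda_GC from a collection of P values in (0, 1].

    Assumed convention: map each P value to a 1-d.f. chi-square quantile,
    then divide the observed median by the null median (about 0.455).
    """
    nd = NormalDist()
    # A 1-d.f. chi-square quantile is the square of a two-sided normal quantile.
    # inv_cdf(p / 2) is used (rather than 1 - p / 2) to stay accurate for tiny p.
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in p_values]
    expected_median = nd.inv_cdf(0.75) ** 2
    return median(chi2) / expected_median


# Evenly spaced P values mimic a well-calibrated null distribution.
uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
lambda_gc = genomic_inflation(uniform_p)
```

Values close to 1 indicate that the bulk of the test statistics follow the null distribution; values well above 1 would suggest inflation, for example from unmodelled population structure.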

Extended Data Figure 3 Individuals who share a household at present or have shared one in the past have significantly similar microbiomes.

First-degree relatives and individuals with present household sharing have significantly similar species and bacterial gene abundances (P < 0.01; permutation testing). a–c, Box plots depict the distribution of Bray–Curtis dissimilarities across pairs of individuals at the phylum (a), species (b) and bacterial genes (c) level. Each panel shows the Bray–Curtis dissimilarities among all pairs of (i) first-degree relatives, who are likely to have experienced present or past household sharing (n = 55 pairs); (ii) second-to-fifth-degree relatives, who are unlikely to have experienced present or past household sharing (n = 24 pairs); (iii) unrelated individuals self-reported to currently share a household (n = 32 pairs); and (iv) all other individuals (n = 255,891 pairs). The lower and upper limits of the boxes represent the 25th and 75th percentiles, respectively, and the bottom and top whiskers represent the 5th and 95th percentiles, respectively. The P value ranges for all panels are: **P < 0.01 and ***P < 0.005.
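
The comparison described in this caption can be sketched as follows: compute the Bray–Curtis dissimilarity for a set of pairs (for example, household members) and ask how often randomly relabelled individuals yield a mean dissimilarity at least as low. This is a minimal sketch under stated assumptions, not the procedure used in the study: the label-shuffling scheme, the one-sided P value, and the names `bray_curtis` and `pair_permutation_pvalue` are all illustrative.

```python
import numpy as np


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    denom = (u + v).sum()
    if denom == 0:
        return 0.0  # two empty profiles are treated as identical
    return float(np.abs(u - v).sum() / denom)


def pair_permutation_pvalue(abund, pairs, n_perm=1000, seed=0):
    """One-sided permutation P value that `pairs` are more similar than chance.

    abund: (n_individuals, n_taxa) array of abundances.
    pairs: list of (i, j) row indices, e.g. individuals sharing a household.
    Assumed null model: individual labels are shuffled, pair structure is kept.
    """
    abund = np.asarray(abund, dtype=float)
    rng = np.random.default_rng(seed)
    n = abund.shape[0]
    observed = np.mean([bray_curtis(abund[i], abund[j]) for i, j in pairs])
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        null = np.mean(
            [bray_curtis(abund[perm[i]], abund[perm[j]]) for i, j in pairs]
        )
        if null <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return float(observed), (hits + 1) / (n_perm + 1)


# Toy data: individuals 0 and 1 have identical profiles, the rest differ.
toy = np.array([[10, 0, 0, 0], [10, 0, 0, 0], [0, 10, 0, 0], [0, 0, 10, 0]])
obs, p_val = pair_permutation_pvalue(toy, [(0, 1)], n_perm=200)
```

With real data, `abund` would hold relative abundances at the chosen level (phylum, species or bacterial genes), and the same call would be repeated for each group of pairs.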

Extended Data Figure 4 The gut microbiome is significantly associated with multiple environmental factors.

The fraction of variance of the microbiome β-diversity matrix that can be inferred from different categories of environmental factors is shown. n = 715 individuals (Supplementary Table 17); numbers in parentheses indicate the number of features in each category. The fraction of inferred variance can reflect both the information that the category conveys on the microbiome as well as the number of factors in the category, which depends on the questionnaire used in the study.

Extended Data Figure 5 b² estimates and phenotype prediction results when using various data sources.

Each row is similar to Fig. 4c–e, but is based on a different data source. a–c, Relative abundance of genera, obtained from 16S rRNA gene sequencing (using n = 464 individuals). d–f, Relative abundance of genera, obtained from metagenomic sequencing (using n = 715 individuals). g–i, Relative abundance of phyla (using n = 715 individuals). j–l, Relative abundance of species (using n = 715 individuals). m–o, Relative abundance of bacterial genes in the LLD cohort (using n = 836 individuals). Note that two phenotypes that were analysed in the Israeli cohort (lactose consumption and glycaemic status) were not available for the LLD cohort, and two phenotypes available for the LLD cohort and shown here (LDL cholesterol and triglycerides) were not available for the Israeli cohort. The P value ranges for all panels are: *FDR < 0.05, **FDR < 0.01 and ***FDR < 0.001.

Extended Data Table 1 Baseline characteristics of the cohort
Extended Data Table 2 No significant association between ancestral or genetic similarity and the gut microbiome

Life Sciences Reporting Summary (PDF 74 kb)

Supplementary Information

This file contains Power Simulations: A detailed description of our power simulations procedure and Statistical Aspects of the Microbiome-Association Index: A clarification regarding the assumptions behind the derivation of the microbiome-association index, and their statistical implications. (PDF 1083 kb)

This file contains Supplementary Tables 1-28. (XLSX 510 kb)

Rothschild, D., Weissbrod, O., Barkan, E. et al. Environment dominates over host genetics in shaping human gut microbiota. Nature 555, 210–215 (2018). https://doi.org/10.1038/nature25973

