Genetic contributions to NAFLD: leveraging shared genetics to uncover systems biology
Mohammed Eslam* and Jacob George*
Abstract | Nonalcoholic fatty liver disease (NAFLD) affects around a quarter of the global
population, paralleling worldwide increases in obesity and metabolic syndrome. NAFLD arises
in the context of systemic metabolic dysfunction that concomitantly amplifies the risk of
cardiovascular disease and diabetes. These interrelated conditions have long been recognized
to have a heritable component, and advances using unbiased association studies followed by
functional characterization have created a paradigm for unravelling the genetic architecture of
these conditions. A novel perspective is to characterize the shared genetic basis of NAFLD and
other related disorders. This information on shared genetic risks and their biological overlap
should in future enable the development of precision medicine approaches through better
patient stratification, and enable the identification of preventive and therapeutic strategies.
In this Review, we discuss current knowledge of the genetic basis of NAFLD and of possible
pleiotropy between NAFLD and other liver diseases as well as other related metabolic disorders.
We also discuss evidence of causality in NAFLD and other related diseases and the translational
significance of such evidence, and future challenges from the study of genetic pleiotropy.

Storr Liver Centre, Westmead Institute for Medical Research, Westmead Hospital and University of Sydney, Sydney, NSW, Australia.
*e-mail: mohammed.eslam@sydney.edu.au; jacob.george@sydney.edu.au
https://doi.org/10.1038/s41575-019-0212-0

There is increasing interest in nonalcoholic fatty liver disease (NAFLD), a condition in which fat accumulates in more than 5% of hepatocytes in the absence of secondary causes of lipid accumulation or clinically significant alcohol intake1 (Box 1). This focus is due to its growing burden (around a quarter of the world’s population is affected), its contribution to liver-related and extrahepatic morbidity2,3, and the lack of effective pharmacotherapy. The increased prevalence of NAFLD is fuelled by the dramatic escalation in obesity and metabolic syndrome, with concomitant reductions in physical activity at a societal level4. Epidemiological, basic and translational research efforts have led to substantial progress in understanding the pathophysiology of NAFLD, to the recognition of lifestyle interventions as a cornerstone of management, and to a plethora of clinical trials of drugs for its treatment5. Despite these advances, NAFLD is set to remain the principal cause of liver disease in most countries and is on target to become the main cause of liver transplantation by 2030 in the USA2,6. The primary causes of death in NAFLD, however, are cardiovascular disease (CVD) and cancer7.

The liver is a principal organ for the regulation of metabolic homeostasis, mediating the interaction of the external environment, including dietary intake and gut-derived microbial signals, with the internal milieu8. Hence, it is not surprising that NAFLD is a determinant of extrahepatic diseases including those of the cardiovascular system, diabetes and cancer9–12. In this Review, we discuss the concept of pleiotropy between NAFLD and other related liver and non-liver disorders and the translational implications and challenges that stem from this knowledge.

Metabolic syndrome
A cluster of risk factors that are associated with insulin resistance and future cardiovascular disease risk. According to the Adult Treatment Panel-III, metabolic syndrome is defined as the presence of abnormalities in at least three of the five components: elevated fasting glucose, high blood pressure, hypertriglyceridaemia, low HDL cholesterol level and elevated waist circumference.

Genetics of NAFLD
Heritability of NAFLD. As with other complex traits, the phenotypic manifestations and severity of NAFLD are the outcome of gene–environment interactions13. Data derived from epidemiological, familial aggregation and twin studies provide strong evidence for the heritability of NAFLD. Heritability estimates range from 20% to 70%, depending on ethnicity, study design, environmental factors and the methodology used for NAFLD characterization15–20, with an estimated shared genetic determination between steatosis and fibrosis of 75%14. Similar ranges of heritability have been observed for other related metabolic traits, such as BMI, blood lipid levels, type 2 diabetes mellitus (T2DM), blood pressure levels and CVD21–25. These findings, together with their close clinical inter-relationships, suggest the potential for an overarching overlap in their genetic architecture (Table 1).
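As a rough illustration of how twin data can yield heritability estimates of the kind quoted above, the sketch below applies Falconer's formula, a simplified textbook approach that compares trait correlations in monozygotic (MZ) and dizygotic (DZ) twin pairs. This is an assumption for illustration: the cited studies may have used other models, the function name is hypothetical and the correlation values are invented.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Falconer's estimate h^2 = 2 * (r_MZ - r_DZ), clipped to the range [0, 1].

    r_mz: trait correlation within monozygotic twin pairs.
    r_dz: trait correlation within dizygotic twin pairs.
    """
    h2 = 2.0 * (r_mz - r_dz)
    return max(0.0, min(1.0, h2))


# Hypothetical within-pair correlations (not taken from the cited studies).
h2 = falconer_heritability(r_mz=0.60, r_dz=0.35)
print(f"Estimated heritability: {h2:.0%}")
```

The clipping step only keeps the estimate within a plausible range when sampling noise pushes the raw value below 0 or above 1; it does not correct for shared environment or other violations of the twin-model assumptions.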

Key points
• Nonalcoholic fatty liver disease (NAFLD) is a liver disorder with high heritability, and no approved pharmacotherapy to date.
• Although our understanding of the genetic underpinnings of NAFLD has advanced, known risk variants explain only a small fraction of heritability, suggesting the existence of ‘missing heritability’.
• There is evidence for shared genetic modifiers and common pathophysiological pathways that link NAFLD, other liver diseases and related metabolic disorders.
• Research has now progressed beyond genome-wide association studies (GWAS) to broader, causal and functional discovery via multi-trait GWAS, phenome-wide association studies (PheWAS), Mendelian randomization and functional annotation.
• The next wave of genetic studies should have substantial translational implications for both drug discovery and personalization of medicine.

Heritability
A statistical analysis that estimates the proportion of trait variation that is attributable to genetic variation among individuals. Heritability varies according to the studied population.

Genome-wide association studies (GWAS)
An examination of a large number (hundreds of thousands) of common single-nucleotide polymorphisms across the genome of many cases and controls of a particular trait to determine whether any variant is associated with the trait.

Lipoproteins are complex particles with a core containing cholesterol esters and triglycerides surrounded by a lipid membrane; they contain proteins called apolipoproteins, which enable lipoprotein formation and function.

Box 1 | What is NAFLD?
Nonalcoholic fatty liver disease (NAFLD) is defined as fat accumulation in at least 5% of hepatocytes, without excess alcohol intake (daily intake <30 g in men and <20 g in women)1,158. The spectrum of disease extends from bland, simple steatosis to liver fat accompanied by inflammation and ballooning hepatocyte cell death, termed nonalcoholic steatohepatitis (NASH). NASH can progress to fibrosis, cirrhosis and hepatocellular carcinoma158. The prevalence of NAFLD is increasing dramatically and it currently affects a quarter of the world’s population. NAFLD is the most common cause of liver disease in Western countries and is projected to become the primary reason for liver transplantation2. Several modalities are available for NAFLD diagnosis. Liver biopsy is the reference standard. Imaging modalities include ultrasonography, MRI proton density fat fraction (MRI-PDFF)159 and transient elastography160 with controlled attenuation parameter. Blood tests include liver enzyme levels (for example, AST and ALT), the NAFLD fibrosis score and collagen marker-based scores (such as ADAPT), which incorporate various clinical and laboratory parameters in predictive mathematical models161,162.

Genetic contributions to NAFLD. The study of the genomic basis of complex diseases including NAFLD has evolved in phases. Following initial candidate gene studies, the release of the human reference genome sequence in 2003 was followed rapidly by the first genome-wide association study (GWAS) in 2005 (ref.26). Since then, GWAS has become the default methodology to determine genotype–phenotype correlations, in which tests for association are performed between hundreds of thousands to over a million single-nucleotide polymorphisms (SNPs) across the genome and a single trait27. In the past decade, an explosion in large-scale, hypothesis-free method-based discoveries, including GWAS, whole-genome and whole-exome sequencing, has led to the identification of an enormous number of genes that modify an individual’s predisposition to complex, rare and common human traits.

Our understanding of the genetic underpinnings of NAFLD susceptibility, progression and outcomes has exponentially increased in the past few years16. Currently, at least five variants in different genes have been robustly associated with the susceptibility to and progression of NAFLD, namely, PNPLA3, transmembrane 6 superfamily member 2 (TM6SF2), glucokinase regulator (GCKR), membrane bound O-acyltransferase domain-containing 7 (MBOAT7) and hydroxysteroid 17β-dehydrogenase 13 (HSD17B13)16.

The PNPLA3 I to M substitution at position 148 (rs738409 C>G encoding for PNPLA3 I148M) was identified by the first GWAS in NAFLD in 2008 (ref.28). It remains the most robust variant associated with the entire spectrum of NAFLD (steatosis, fibrosis and hepatocellular carcinoma) across diverse geographical regions and ethnicities, including, for example, different European, Hispanic and Asian populations29–31. PNPLA3 is a multifunctional enzyme implicated in lipid regulation and has both triacylglycerol lipase and acylglycerol O-acyltransferase activity32,33, as well as retinyl esterase activity in hepatic stellate cells (HSCs), based on in vitro and in vivo studies33,34.

The TM6SF2 rs58542926 C>T polymorphism codes for an E to K substitution at position 167 resulting in loss of function, and is associated with reduced hepatic TM6SF2 mRNA and protein expression35–39. TM6SF2 is predominantly expressed in the liver and small intestine, with low expression in other tissues37,40. The exact function of TM6SF2 is not well understood, but it regulates cholesterol synthesis and the secretion of lipoproteins40,41.

A variant in GCKR, a gene that controls de novo lipogenesis by regulating the influx of glucose into hepatocytes, has been associated with NAFLD20,42. The causal variant seems to be a common missense loss-of-function mutation (rs1260326) coding for the P446L protein variant, which regulates glucokinase in response to fructose-6-phosphate, boosting the lipogenic pathway by providing further substrates for fatty acid biosynthesis and, therefore, increasing the risk of NAFLD43.

The latest addition to the genes that contribute to NAFLD, based on an exome-wide association study in humans, is a protein-truncating variant in the HSD17B13 gene (rs72613567:TA) that seems to be robustly associated with decreased serum liver enzyme levels (alanine transaminase (ALT) and aspartate aminotransferase (AST)) and a reduced risk of nonalcoholic steatohepatitis (NASH) resulting from production of an unstable protein with reduced function44. HSD17B13 has been demonstrated to have lipid droplet-associated retinol dehydrogenase activity, suggesting that it is probably implicated in NAFLD via this enzymatic activity, rather than through its association with changes in hepatic fat accumulation45. Further functional studies are required to clarify the role of this gene.

Several other genetic variants implicated in the regulation of insulin signalling, adipokines and myokines, oxidative stress, lipid metabolism and inflammation have been demonstrated to be associated with NAFLD susceptibility and progression, as reviewed elsewhere13,16,46. The focus of this Review, however, is on pleiotropy between NAFLD and other interrelated diseases.

Shared genetics with other liver diseases
There is mounting evidence for shared genetic modifiers and common pathophysiological pathways for steatosis, inflammation and fibrosis across different liver diseases. Thus, many of the variants for NAFLD have also been implicated in other liver diseases and vice versa. Variants in PNPLA3 and TM6SF2 have been identified by GWAS as risk loci for alcoholic cirrhosis47, but are also associated with hepatic steatosis in patients with viral hepatitis (both hepatitis B and hepatitis C), although their role in fibrosis is less clear in this subgroup35,47–49.
Notably, most of these studies were conducted in European populations.

A variant in MBOAT7, which encodes a protein in the Lands cycle involved in the remodelling of arachidonic acid to phosphatidylinositol, was identified by GWAS in alcoholic cirrhosis47. The same variant has been associated with the entire histological spectrum of NAFLD in individuals of European descent50, the risk of NAFLD-related hepatocellular carcinoma in an Italian cohort51, as well as liver injury (liver inflammation and fibrosis) in patients with viral hepatitis (both hepatitis B and hepatitis C)52,53. MBOAT7 is highly expressed in inflammatory and immune cells50,52. Similarly, another variant, rs4374383 G>A in the MER proto-oncogene, tyrosine kinase (MERTK) gene, which encodes a primary phagocytic receptor in macrophages implicated in HSC activation, has been shown by GWAS to be protective against fibrosis in patients of European descent with chronic HCV infection54. This variant was later demonstrated to display similar effects in NAFLD and predicted 9-year incident NAFLD and T2DM55,56. Soluble MERTK has been observed to be elevated (circulating and urinary levels) in patients with diabetic nephropathy57. Genetic variation in the interferon (IFN)-λ3/IFN-λ4 region that regulates innate immunity was initially identified by GWAS for HCV clearance58–60. Polymorphisms in this region have since been demonstrated by multiple independent groups to be associated with hepatic inflammation and fibrosis in patients with NAFLD, as well as in those with viral hepatitis61–64.

Table 1 | Heritability levels of metabolic traits

Phenotype                 Heritability (%)       Number of participants                  Ref.
NAFLD                     50                     60 pairs of twins                       18
                          38                     33 cases and 11 control and parents     15
                          22–34                  3,973 individuals                       20
                          26–27                  6,629 individuals                       42
BMI                       77–84                  4,071 pairs of twins                    21
                          66–70                  673 pairs of twins                      163
                          88.1a                  101 pairs of twins                      164
                          48–61                  495 pairs of twins                      165
                          80                     4,884 twins                             166
Type 2 diabetes mellitus  26–61                  606 pairs of twins                      22
                          16–34                  13,888 pairs of twins                   167
                          58                     514 pairs of twins                      168
                          17–76                  44 pairs of twins                       169
                          30–69                  5,810 individuals from 942 families     170
Hypertension              DBP 48–60, SBP 34–67   1,244 pairs of twins and 333 of their   23
                          DBP 37–52, SBP 19–56   703 pairs of twins                      171
                          DBP 57.7, SBP 57.1     101 pairs of twins                      164
                          24–37                  1,006 pedigree participants             172
Blood lipids              58–66                  172 pairs of twins                      24
                          46–64                  495 pairs of twins                      165
                          44.5–84                101 pairs of twins                      166
Coronary artery disease   38–57                  20,966 pairs of twins                   25
                          53–58                  7,955 pairs of twins                    173

DBP, diastolic blood pressure; NAFLD, nonalcoholic fatty liver disease; SBP, systolic blood pressure. aHeritability of body weight as cardiometabolic risk factor.

De novo lipogenesis
The metabolic process of synthesizing fatty acids from acetyl-CoA subunits for storage as fat.

Lands cycle
A metabolic remodelling pathway in the endoplasmic reticulum. The cycle is one of the major routes for acyl remodelling to modify the fatty acid composition of phospholipids.

Genetic pleiotropy
Although GWAS has achieved success in identifying loci involved in complex traits, the majority of variants demonstrate a modest effect size and account for only a minor fraction of the overall heritability. This implies the existence of ‘missing heritability’65. NAFLD exemplifies this attribute, as all known variants together explain only a small proportion of its known heritability (~10–20%)66. The missing heritability could relate to common variants that do not reach genome-wide significance, to rare and structural variations that are not considered on commercial SNP arrays, to gene–environment and gene–gene interactions16, and to epigenetic programmes67. Hence, the complete genetic architecture of complex traits is probably the result of a very large number of variants68.

A study published in 2017 highlighted the contributions of gene–environment interactions to NAFLD, with a synergistic interaction between obesity and three NAFLD risk variants (PNPLA3 I148M, TM6SF2 E167K and GCKR P446L) across the entire spectrum of NAFLD69. Changes in epigenetic reprogramming in response to environmental and nutritional conditions might also account for a substantial fraction of the missing heritability of multiple complex diseases67, although their role in NAFLD remains to be defined.

Arising from this burgeoning area of complexity unexplained by GWAS variants is the identification of potential pleiotropic effects of known variants. In contrast to the single phenotypes that have been the focus of most previous GWAS, in the past few years there has been a shift towards broad multi-trait analyses. ‘Pleiotropy’ is a broad term with many aspects and refers to the phenomenon of a gene or genetic variant influencing multiple traits70 (Box 2). Two main forms of biological pleiotropy, genic and allelic, can indicate shared biology between traits. Genic pleiotropy refers to the altered function of a gene that influences multiple traits, whereas allelic pleiotropy refers to the phenomenon of one variant influencing multiple traits. Pleiotropy in which the effect of a variant on one trait is secondary to its effects on another is termed mediated pleiotropy71.

It has been suspected that pleiotropy is common in the human genome given that a finite number of gene products mediate the entire spectrum of genetic influences on health and disease72,73. However, a systematic analysis of pleiotropy has only been feasible74 since a wealth of information became available from comprehensive publicly available data sets such as the GWAS catalogue, the UK Biobank and electronic health records75. Analysis of these data sets has demonstrated that pleiotropy is a common, if not ubiquitous, phenomenon, with a study in 2018 estimating that up to half the genes in the GWAS catalogue are pleiotropic74. This proportion will probably increase as additional studies are added to these databases. As expected, pleiotropy is commonly found for variants associated with traits in the same ‘domain’ (a category of related traits (diseases);
for example, immune diseases), which might be acting via dependent or independent biological pathways. For instance, shared genetic variants have been reported between immune-related diseases (such as type 1 diabetes mellitus, coeliac disease, Crohn’s disease and rheumatoid arthritis)76, across psychiatric disorders (including, for example, autism spectrum disorder, learning difficulties, major depressive disorder and schizophrenia)77 and between chronic obstructive pulmonary disease and pulmonary fibrosis78. Pleiotropy can also be identified at the level of individual variant alleles, or genetic correlations between diseases can be estimated genome-wide to calculate the proportion of shared associated loci between traits79.

Box 2 | What is pleiotropy?
Pleiotropy, a common phenomenon, is considered to be present when one genetic locus influences more than one phenotype. Two subtly distinct forms of pleiotropy have been suggested: genic pleiotropy and allelic pleiotropy. Genic pleiotropy refers to altered function of a gene that influences multiple traits. Allelic pleiotropy refers to the situation in which one gene variant influences multiple traits. Pleiotropy in which the effect of a variant on one trait is secondary to the effects on another trait is termed mediated pleiotropy. Another distinction is horizontal versus vertical pleiotropy. Vertical pleiotropy refers to genetic variants that influence multiple biomarkers on the same pathway from exposure to disease. Horizontal pleiotropy refers to the situation in which a genetic variant influences distinct pathways that are causal in the traits70,71.

Early investigations of human pleiotropy of complex phenotypes were rooted in comparing cross-phenotype associations from GWAS and candidate genes across seemingly discrete traits. As a complement to GWAS, phenome-wide association studies (PheWAS) have become an additional tool to investigate pleiotropy, helped by the availability of biorepositories with deep clinical phenotyping80. PheWAS is used as a reverse GWAS for identifying comprehensive genetic associations between a variant or variants and hundreds or thousands of phenotypes80, providing a paradigm for novel genetic discoveries (Fig. 1) and giving hope to drug discovery by helping with drug repurposing and predicting adverse effects.

Phenome-wide association studies (PheWAS)
An unbiased systematic approach to test for associations between a specific genetic variant or series of variants, and a wide range of phenotypes in large cohorts.

NAFLD and metabolic traits. Studying shared genetics is highly relevant to NAFLD as a heritable complex trait with close inter-relationships to the dysfunctional milieu that characterizes the metabolic syndrome. For example, elevation in the liver enzymes ALT and γ-glutamyltransferase (GGT) is now suggested to be a robust marker of metabolic derangement81. NAFLD is the most common cause of their elevation82–84. An earlier study of 362 twins suggested that GGT has statistically significant shared genetic co-determination with traits of the metabolic syndrome, that is, insulin resistance, hypertension and levels of triglycerides and LDLs85. This finding, as expected, was confirmed in a cross-sectional study published in 2016 that included 45 monozygotic and 20 dizygotic twin pairs. That study demonstrated substantial shared gene effects (rG) between hepatic steatosis and related metabolic traits, namely, lipid traits (HDL cholesterol (rG = 0.451) and triglycerides (rG = 0.678)), blood pressure (rG = 0.444), BMI (rG = 0.534) and T2DM (glucose (rG = 0.716) and HbA1c (rG = 0.588))14. In another cross-sectional analysis of a prospective cohort including 156 twins and families, a microbial metabolite, 3-(4-hydroxyphenyl) lactate, was associated with liver fibrosis (defined as a measurement of >4.17 kPa on magnetic resonance elastography, corresponding to the 95th percentile) and with the abundance of several gut microbiota species (for example, Firmicutes, Bacteroides caccae, Clostridium spp. and Escherichia coli) related to fibrosis. The rG of 3-(4-hydroxyphenyl) lactate was 0.54–0.57 for hepatic fibrosis and steatosis, respectively86. Thus, these findings provide a strong link between serum metabolites, the gut microbiome and shared gene effects on hepatic steatosis and fibrosis86.

Gene effects
The estimation of the genetic determination for a particular trait using mathematical models that allows one to distinguish between environmental and genetic contributions.

HOMA-IR
Homeostatic Model Assessment of Insulin Resistance, a surrogate measure of insulin resistance.

NAFLD risk variants to related diseases. In spite of strong epidemiological evidence that NAFLD is significantly correlated with T2DM and CVD87, there seems, to date, to be only modest overlap in genome-wide significant associations. Furthermore, known NAFLD risk variants show divergent metabolic and disease effects (Fig. 2). A meta-analysis42 of a GWAS in 7,176 individuals with NAFLD has suggested that the PNPLA3 variant is not associated with serum lipid levels or glycaemic traits (blood glucose or homeostatic model for insulin resistance (HOMA-IR)). By contrast, NCAN–TM6SF2 is associated with reduced triglycerides and LDL cholesterol levels, with no effect on glycaemic traits. The NAFLD risk allele at GCKR is associated with higher levels of plasma LDL cholesterol and triglycerides and lower fasting glucose and HOMA-IR42. However, a fine mapping study published in 2018 in 81,412 patients with T2DM and 370,832 healthy individuals as controls of diverse ancestry confirmed the previous association of the GCKR rs1260326 variant, but also demonstrated associations for PNPLA3 rs738409 and TM6SF2 rs58542926 with T2DM88.

An exome-wide association study in 5,643 individuals with replication in 4,666 individuals also showed that the variant at TM6SF2 influences total cholesterol levels and is associated with coronary artery disease (CAD)36. The same opposing effect of the TM6SF2 E167K variant on NAFLD and CAD was observed in another study in 1,819 individuals39. In a further analysis that included 60,801 patients with CAD and 123,504 healthy individuals as controls, the protective effect of the TM6SF2 variant on CAD was confirmed; interestingly, a similar but modest protective effect was seen for PNPLA3 rs738409, which was more pronounced under the recessive model. The MBOAT7 rs641738 variant had no effect89. A similar pattern was observed in a smaller cohort of 270 patients undergoing coronary angiography, in which the PNPLA3 rs738409 G allele, which is associated with severity of liver disease, had a fairly favourable cardiovascular risk profile90.

A larger exome-wide association study published in 2017 including >300,000 participants, with replication in >280,000 individuals, extended these findings showing
that both TM6SF2 rs58542926 and PNPLA3 rs738409 are associated with lower lipid levels and a lower risk of CAD, but an increased risk of fatty liver and T2DM91. Consistently, a study published in 2018 in 4,081 adults in a population-based cohort with median follow-up of 11.3 years demonstrated that, although the PNPLA3 rs738409 G allele is associated with a fourfold increase in the hazard of liver disease-related mortality, it is associated with a reduced risk of death from CAD and overall mortality92.

In total, known NAFLD loci are associated with increased risk of fatty liver and T2DM, but with decreased serum lipid levels and a reduced risk of CAD.

PheWAS of NAFLD risk variants. PheWAS has added to knowledge garnered from the studies discussed above with confirmation of the pleiotropic effects of PNPLA3 rs738409 (ref.93). That study integrated data from four cohorts with detailed health information from 697,815 individuals and investigated 145 mapped disease end points. As expected, the PNPLA3 rs738409 G allele was associated with elevated liver enzyme levels, an increased risk of T2DM and a lower risk of high cholesterol and intake of cholesterol-lowering medications. Novel associations were noted, including that these individuals were more prone to develop NSAID-induced liver injury, but had a decreased risk of acne, gout and gallstones. These associations were independent of elevated liver enzyme levels93.

Another PheWAS-derived association is that the HSD17B13 rs72613567:TA variant, in addition to associations with reduced AST and ALT levels, is associated with higher platelet counts, probably reflecting the association with chronic liver disease, but not with any other phenotype44.

Causal relationships between traits
Mendelian randomization approaches. There is strong epidemiological evidence that adiposity is associated with NAFLD and cardiometabolic disease12. However, these associations were generally derived from observational studies in which it would not have been possible to distinguish whether NAFLD is an ‘upstream’ causal factor or a ‘downstream’ consequence, or which would have been confounded by factors associated with both the exposures and the outcomes. Genetic methods, in particular Mendelian randomization studies, have the potential to unravel causality because genetic variants are present from birth and are therefore unlikely to be confounded by environmental factors. In such analyses, information on genetic variants is used to determine whether the association between a risk factor (non-genetic) for a disease and disease-related outcomes goes beyond association to causation (Fig. 3).

Mendelian randomization studies
An analysis that incorporates genetic variants that are predicted to be independent of confounding factors into epidemiological studies as instrumental tools to infer causality of a risk factor or of a biomarker in a particular disease.
Mendelian randomization analysis has become a popular and efficient use of the plethora of genetic data available in the post-GWAS era. This approach is particularly useful for inferring credible causal associations when randomized clinical trials are not feasible. For example, a functional variant in the phospholipase A2 group VII (PLA2G7) gene that encodes lipoprotein-associated phospholipase A2 (Lp-PLA2) was not associated with any of the major vascular events (including vascular death, myocardial infarction and stroke) in 91,428 individuals from the China Kadoorie Biobank94. These findings are consistent overall with results of phase III trials of the Lp-PLA2 inhibitor darapladib, indicating that considering this genetic analysis earlier could have saved billions of dollars, as well as time95. As with other statistical analyses, Mendelian randomization relies on the accuracy of its assumptions, in particular that the instrumental tools used have no causal pathway to the outcome other than via the exposure (no direct association with outcome beyond the exposure). Pleiotropy is another confounder in such an analysis.
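To make the logic of a single-variant Mendelian randomization analysis concrete, the sketch below computes a Wald ratio: the variant's effect on the outcome divided by its effect on the risk factor. It is a minimal illustration under stated assumptions, not the method of any study cited here. The function name and the summary statistics are hypothetical, the variant is assumed to be a valid instrument (associated with the exposure, independent of confounders and affecting the outcome only through the exposure), and the standard error uses a first-order approximation that ignores uncertainty in the variant-exposure estimate.

```python
import math


def wald_ratio(beta_exposure: float, beta_outcome: float, se_outcome: float):
    """Single-variant Mendelian randomization estimate (Wald ratio).

    beta_exposure: per-allele effect of the variant on the risk factor.
    beta_outcome:  per-allele effect of the same variant on the outcome.
    se_outcome:    standard error of beta_outcome.
    """
    if beta_exposure == 0:
        raise ValueError("the variant must be associated with the exposure")
    estimate = beta_outcome / beta_exposure
    # First-order approximation: ignores uncertainty in beta_exposure.
    se = se_outcome / abs(beta_exposure)
    z = estimate / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return estimate, se, p_value


# Hypothetical summary statistics, for illustration only.
estimate, se, p_value = wald_ratio(beta_exposure=0.35, beta_outcome=0.07, se_outcome=0.02)
print(f"causal estimate = {estimate:.3f} (SE {se:.3f}), P = {p_value:.2g}")
```

If the variant also influences the outcome through a pathway that bypasses the exposure (horizontal pleiotropy), the ratio no longer estimates the causal effect, which is why pleiotropy matters for this kind of analysis.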
Adiposity and metabolic traits. A Mendelian randomization study using the fat mass and obesity-associated (FTO) gene variant rs9939609 as an instrumental variable for BMI has provided evidence for a causal relationship between adiposity and elevated ALT and GGT levels96. Similarly, adiposity has a causal relationship with metabolic syndrome, T2DM, dyslipidaemia, hypertension and CVD96.
[Figure 1: two panels, ‘Variants associated with a phenotype’ (GWAS) and ‘Phenotypes associated with a variant’ (PheWAS); y axes, –log P; PheWAS x axis, phenotypes.]
Fig. 1 | Schematic representation of a genome-wide association study and a phenome-wide association study. A genome-wide association study (GWAS) starts with a phenotype of interest followed by unbiased analyses of hundreds of thousands to millions of common genetic variants across the entire genome for association with the phenotype. A phenome-wide association study (PheWAS) starts with one or more genetic variants of interest and systematically analyses multiple phenotypes for association with the variant. NAFLD, nonalcoholic fatty liver disease; SNP, single-nucleotide polymorphism.

‘Favourable adiposity’ genes. Although obesity is a major risk factor for many metabolic diseases, it does not always entail adverse metabolic consequences. Reflecting this phenomenon, obesity is currently classified as metabolically unhealthy or healthy obesity (MUO and MHO,
respectively); the latter represents ~45% of obese individuals97,98. By contrast, up to 30%
of individuals of normal weight demonstrate cardiometabolic risk factors and are termed
metabolically obese normal weight (MONW)99,100. Although no GWAS has been undertaken
to specifically explore genetic variants associated with MUO, MHO and MONW, multiple
studies have identified variants that regulate body fat distribution, a major determinant
of these differences101,102.
A combined GWAS in 188,577 individuals identified 53 loci that confer a higher risk of
insulin resistance phenotypes (higher fasting insulin and triglyceride levels and lower HDL
cholesterol) and cardiometabolic disease, but lower levels of peripheral adiposity103. This
finding suggests that limited storage capacity of peripheral adipose tissue will lead to
ectopic fat accumulation in tissues such as the liver and skeletal muscle. Ectopic fat is
considered pivotal for the development of whole-body and hepatic insulin resistance,
lipotoxicity and, ultimately, CVD103. By contrast, 14 so-called favourable adiposity genes
that are associated with higher adiposity (subcutaneous fat), but a lower risk of liver fat,
T2DM, hypertension and heart disease have been identified (including PPARG and
LYPLAL1; Fig. 4)100,104,105.

Divergent metabolic effects. The role of triglycerides in the pathogenesis of metabolic
disorders might also be mechanism-dependent. Triglyceride levels depend on two main
processes, hepatic production and peripheral lipolysis106. An exome-wide association
study in >300,000 individuals demonstrated that variants in genes governing hepatic
production of triglyceride-rich lipoproteins, such as PNPLA3 and TM6SF2, are associated
with lower lipid levels and a lower risk of CAD, but with an increased risk of fatty liver and
T2DM. By contrast, the peripheral lipolysis-related genes (LPL and ANGPTL4) are
associated with reduced lipid levels and risks of both T2DM and CAD, but have no effect
on liver fat91. However, Angptl4 deletion protects against fatty liver development in
response to a high-fat diet in mouse models107.

[Figure 2: network diagram linking NAFLD candidate genes to associated traits. Labels
recovered from the graphic: MBOAT7, TM6SF2; NAFLD, triglycerides, stroke, DILI, gallstones,
gout, liver enzymes, platelet count.]
Fig. 2 | The pleiotropic effects of NAFLD risk loci. A summary of pleiotropic effects
for all robustly associated candidate genes for nonalcoholic fatty liver disease (NAFLD).
On the right side are shown all associated candidate genes for NAFLD. Thick lines
connect these candidate genes with other corresponding traits. This figure is a schematic
representation and should not be interpreted as a formal directed analysis graph. DILI,
drug-induced liver injury; IHD, ischaemic heart disease.

Causally related traits
NAFLD and T2DM. On the basis of Mendelian randomization, a causal association
between GGT and insulin resistance and T2DM has been suggested in some
reports108,109, but not others110. In another study, Mendelian randomization analysis
using a genetic risk score (PNPLA3, TM6SF2, GCKR and MBOAT7) for steatosis as a proxy
demonstrated a causal association between hepatic fat and fibrosis development,
independent of hepatic inflammation. This accumulation of hepatic fat and progression
of fibrosis had secondary worsening effects on insulin resistance and diabetes111.
Likewise, a family history of diabetes was significantly associated with the presence of
NASH and fibrosis (OR 1.51, 95% CI 1.01–2.25 (P = 0.04), and OR 1.49, 95% CI 1.01–2.20
(P = 0.04), respectively)112, whereas NAFLD was associated with an increased risk of CVD
and any cancer diagnosis, and mortality among people with T2DM113.

Lipids and T2DM. LDL cholesterol and statin use have directionally opposite effects on
T2DM and CVD. Statin therapy, a cornerstone in the management of dyslipidaemia and in
the prevention of CVD, has been reported to increase the risk of new-onset T2DM114.
Similarly, multiple studies have shown causal protective effects of LDL cholesterol against
T2DM115–117. Likewise, a study of 15 lipid loci, including GCKR, MLXIPL, APOB and CYP7A1,
showed a statistically significant pleiotropic association with glucose traits (fasting blood
glucose levels, HbA1c and HOMA-IR), with opposite allelic directions for effects on glucose
levels and dyslipidaemia118.
By contrast, no causal relationship was found between elevated circulating triglyceride
levels and risk of T2DM or elevated fasting glucose or insulin levels in nondiabetic
individuals119. This finding implies that elevated circulating triglyceride levels are
secondary to T2DM rather than causal119.

NAFLD and cardiovascular disease. In spite of strong epidemiological evidence, the
causal association between NAFLD and CVD is still unclear. A study in 19,925 participants
from the Guangzhou Biobank Cohort Study (GBCS) used SNPs in HSD17B13/MAPK10
(rs6834314) and PNPLA3 (rs738409) as instrumental variables in an analysis that suggested
that ALT reduces the risk of ischaemic heart disease, probably through reducing
triglyceride levels120. Another study on diabetes (DIAGRAM) and CAD or myocardial
infarction (CARDIoGRAMplusC4D 1000 Genomes) confirmed that high ALT levels increase
the risk of T2DM


[Figure 3, panel a: instrumental variable (SNP or polygenic risk score for NAFLD), exposure
(NAFLD), outcomes (cardiovascular disease, type 2 diabetes mellitus) and confounders.
Assumption 1, no association between genetic variants and confounders. Assumption 2,
robust association between genetic variants and exposure. Assumption 3, no independent
association: genetic variants influence the outcome via the exposure, not via other
mediators. Panel b: Mendelian randomization (effect allele versus other allele) alongside a
randomized controlled trial (intervention versus placebo); in both designs confounders are
equal amongst groups and outcomes are compared between groups.]

Fig. 3 | Schematic representation of Mendelian randomization and its use in NAFLD research. a | The three key
assumptions underpinning a Mendelian randomization analysis for a causal association between nonalcoholic fatty liver
disease (NAFLD) and diabetes and coronary artery disease. The assumptions are that: the instrumental variable (a genetic
variant or a combination polygenic score) must associate reliably with the exposure (that is, NAFLD), illustrated by the solid
arrow; the instrumental variable must not associate with confounders, such as obesity, hypertension or dyslipidaemia or
even unknown confounders; and the genetic variant(s) is not associated with outcome except via the exposure of interest,
illustrated by dotted arrows. b | Comparison of the design of a Mendelian randomization approach and a randomized
controlled trial. In a randomized controlled trial, individuals are randomly assigned to either the intervention arm or
the placebo arm; thus, theoretically confounders are equally distributed among groups. Analogous to this design, in
Mendelian randomization, segregation occurs at conception randomly assorting alleles (one representing the effect
and the other the control). SNP, single-nucleotide polymorphism.
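
The logic of panel a can be sketched numerically. The following is a minimal illustration, not the method of any study cited in this Review: it assumes per-variant summary statistics (the effect of each variant on the exposure, and its effect on the outcome with a standard error), computes a Wald ratio for each variant and pools the ratios with a fixed-effect inverse-variance-weighted (IVW) estimator. The function names and all numbers are hypothetical, and the estimate is interpretable as causal only if the three assumptions above hold.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-variant estimate: variant-outcome effect divided by variant-exposure effect.

    The standard error uses the first-order approximation, which ignores
    uncertainty in the variant-exposure estimate.
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw_estimate(variants):
    """Fixed-effect inverse-variance-weighted pooling of Wald ratios.

    `variants` is a list of (beta_exposure, beta_outcome, se_outcome) tuples.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for beta_x, beta_y, se_y in variants:
        estimate, se = wald_ratio(beta_x, beta_y, se_y)
        weight = 1.0 / se ** 2
        weighted_sum += weight * estimate
        total_weight += weight
    pooled = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical summary statistics on the log-odds scale, for illustration only.
example_variants = [
    (0.30, 0.060, 0.020),
    (0.15, 0.033, 0.015),
    (0.10, 0.018, 0.012),
]
pooled, pooled_se = ivw_estimate(example_variants)
low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"IVW estimate {pooled:.3f} (95% CI {low:.3f} to {high:.3f})")
```

A pooled estimate of this kind is sensitive to horizontal pleiotropy: a variant that reaches the outcome through a pathway other than the exposure violates assumption 3 and biases its Wald ratio.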

(OR 2.68, 95% CI 1.48–4.86) with a trend towards a decreased risk of ischaemic heart
disease121. Similarly, another cohort study in the Danish general population (n = 94,708;
ischaemic heart disease in 10,897) demonstrated that PNPLA3 I148M, as a proxy for liver
fat content, is not associated with the risk of ischaemic heart disease122.
Although the causal association between NAFLD and T2DM seems clear, a causal
association with CVD is less obvious. There are several possible explanations for these
findings. First, it could be that excess hepatic fat is unlikely to cause CVD and the reported
epidemiological association is probably confounded by obesity or T2DM. The association
between TM6SF2 and PNPLA3 risk alleles with increased liver fat and a low risk of
cardiovascular end points91 might also argue against liver fat as a cause of CVD.
Nevertheless, these findings might be confounded by the fact that these risk alleles are
associated with reduced plasma levels of both cholesterol and triglycerides that are classic
risk factors for CVD91.
A second possible explanation for the lack of causality is the inappropriateness of the
instruments used. In contrast to other complex metabolic traits, genetic susceptibility to
NAFLD is currently associated with only a handful of variants with relatively larger
phenotypic effects compared with the hundreds of variants associated with other
conditions or states such as BMI, CVD and lipid levels16. This aspect is probably due to the
decreased power of studies in NAFLD, stemming from substantially smaller cohorts than
those available for investigations of other phenotypes. Moreover, NAFLD risk variants
have shown pleiotropic effects and pleiotropy is considered a limitation of Mendelian
randomization analysis123. Finally, there are multiple methods for the diagnosis of NAFLD,
ranging from liver biopsy to imaging modalities with variable accuracy, and this might
have inhibited identification of novel risk variants. Notably, only patients with NASH, the
severe and inflammatory phenotype of NAFLD, rather than those with simple steatosis,
had increased CVD mortality (increase by 86%; standardized mortality ratio 1.86, 95% CI
1.19–2.76; P = 0.007) compared with individuals from the general population matched for
sex and age124,125. Thus, it would be expected that with increases in cohort size and the
feasibility of larger-scale genomic studies with dramatically decreased analysis costs, both
the number of statistically significant variants and the percentage variance they account
for in NAFLD heritability will grow. This progress will enable more precise and powerful
genetic tools for liver fat content interrogation (and potentially for the detection of liver
fibrosis), to delineate better the correlation between NAFLD and CVDs and other
metabolic traits.
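
The dependence of variant discovery on cohort size can be made concrete with a rough power calculation. The sketch below is an illustration only and is not taken from any study cited here: it assumes a quantitative trait, an additive single-variant test, the conventional genome-wide significance threshold of 5 × 10^-8 and a normal approximation to the test statistic. The function name `gwas_power`, the cohort sizes and the variance-explained value are hypothetical.

```python
from statistics import NormalDist


def gwas_power(n, variance_explained, alpha=5e-8):
    """Approximate power to detect one variant in a cohort of size n.

    Assumes the test statistic is approximately normal with mean
    sqrt(n * q / (1 - q)), where q is the fraction of trait variance
    explained by the variant, and a two-sided test at level alpha.
    """
    std_normal = NormalDist()
    shift = (n * variance_explained / (1.0 - variance_explained)) ** 0.5
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical variant explaining 0.1% of trait variance.
for cohort_size in (1_000, 10_000, 50_000, 100_000):
    print(cohort_size, round(gwas_power(cohort_size, 0.001), 4))
```

Under these assumptions, power for a small-effect variant is negligible in cohorts of a few thousand and approaches 1 only at sizes typical of biobank-scale studies, which is consistent with the argument that only large-effect NAFLD variants have been found so far.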

[Figure 4: ‘Favourable adiposity’ genes: IRS1, ARL15, ANKRD55 and/or MAP3K1, GRB14,
FAM13A, RSPO3, PEPD, PDGFC, PPARG, LYPLAL1, TET2. Two columns compare MHO and MUO
by peripheral fat storage. MHO: ↓ liver fat, ↓ blood pressure, ↓ triglycerides, ↓ insulin
resistance, ↓ fasting glucose level, ↓ meta-inflammation; ↓ NAFLD, ↓ type 2 diabetes
mellitus, ↓ hypertension, ↓ cardiovascular disease. MUO: the same items, each ↑.]
Fig. 4 | Genetic variants related to the ‘metabolically healthy obese’ phenotype.
In spite of similar total fat mass, differences in body fat distribution probably shape
differences between the two obesity phenotypes. Individuals with metabolically healthy
obesity (MHO) have a normal blood pressure and serum lipid profile, preserved insulin
sensitivity and a lower degree of meta-inflammation and fatty liver than individuals with
metabolically unhealthy obesity (MUO). These changes probably underlie protective
mechanisms against the complications of obesity, such as cardiovascular diseases,
nonalcoholic fatty liver disease (NAFLD) and type 2 diabetes mellitus. Genetic
background is probably implicated in this difference.

Overall, these findings indicate more complex causal relationships among adiposity,
NAFLD, lipid levels, T2DM and CVD than anticipated. In many cases, the associations
might be directionally different, with divergent effects. These processes seem to be highly
regulated and fine-tuned to maintain metabolic health. Interestingly, genetic approaches
can help to better untangle the web of associations among the different related metabolic
traits with consequent appropriate dissection of the underlying shared and differential
pathways. The study of the role of NAFLD in cardiovascular risk has shown that
biologically relevant insights can be obtained by investigating individual genetic loci that
are pleiotropic or have different allele directions that affect the outcome. Clearly, more
work is needed.

Insights from other diseases to NAFLD
Multi-trait GWAS and PheWAS. Studies of other metabolic traits have been informative
for NAFLD pathogenesis. A multi-trait GWAS published in 2018 investigated 58
quantitative traits in 162,255 Japanese individuals. Liver-related traits investigated were
mainly liver enzymes (including serum levels of GGT, AST and ALT). Of the unique loci, 41%
showed a pleiotropic pattern among the quantitative traits. Interestingly, GCKR (2p23.3),
ABO (9q34.2) and RGS12 (4p16.3) were associated with 9–18 traits across multiple
categories, all including liver-related traits. The specific role of these variants in NAFLD
requires further investigation. PNPLA3 was associated with liver and haematological
traits, but the haematological traits might be secondary to the liver disease126. In a
separate PheWAS utilizing data from the National Health and Nutrition Examination
Surveys (NHANES) for 80 SNPs and 1,008 phenotypes, 13 SNPs showed evidence of
pleiotropy, including for variants in GCKR127.
Multiple earlier GWAS identified associations between intronic variants in the FTO gene
and obesity (in both children and adults), as well as with T2DM128,129. Other studies using
multiple cell lines and mouse models have demonstrated an association of these variants
with obesity via regulation of Iroquois homeobox 3 (IRX3) and IRX5, rather than through
modulation of FTO itself130,131. In accordance with these findings, a meta-analysis of the
populations in a PheWAS exploring electronic health data showed that FTO variants are
associated not only with obesity and T2DM but also with NAFLD, sleep apnoea,
fibrocystic breast disease and Gram-positive bacterial infections132. In another PheWAS,
the proprotein convertase subtilisin/kexin type 9 (PCSK9) missense variant rs11591147
was investigated in 337,536 individuals of British ancestry in the UK Biobank. The T allele
of rs11591147 showed a protective effect against hyperlipidaemia, coronary heart disease
and ischaemic stroke, but was associated with an increased risk of T2DM (OR 1.24 ± 0.10;
P = 1.98 × 10^−7)133; effects on NAFLD were not reported. However, previous data have
shown that elevated intrahepatic or circulating PCSK9 levels are significantly associated
with increased hepatic steatosis and fibrosis stage134, whereas network analyses have
shown that PCSK9 is implicated in the pathogenesis of NAFLD135. Data from human and
animal studies suggest that PCSK9 inhibitors ameliorate NAFLD and decrease hepatic
steatosis and inflammation136.

Other metabolic pathways relevant to NAFLD. Relevant factors in maintaining metabolic
homeostasis to prevent NAFLD development, in addition to the neuroendocrine axis,
include dietary intake, components of the enterohepatic circulation (including bile acids
and their metabolites) and the gut microbiota137. Studies aiming to characterize genetic
effects on these traits with insights into probable implications for NAFLD are now
emerging.
Bile acid regulation, a novel therapeutic target for NAFLD5, might have shared genetics
with other metabolic traits. For instance, fibroblast growth factor 21 (FGF21), a hormone
secreted principally by the liver that regulates bile acid homeostasis, is in phase II trials;
treatment with FGF21 analogues is associated with reduced liver fat in patients with NASH
(n = 75 in the formal analysis)138. A study of 451,000 individuals from the UK Biobank
demonstrated that the FGF21 rs838133 A allele is associated with pleiotropic effects,
including a higher percentage sugar intake, blood pressure and


waist-to-hip ratio, despite an association with a lower total body fat percentage, with
neutral effects on BMI and T2DM139. Regarding its role in NAFLD, preliminary findings
(reported in abstract form) in a small number of patients with NAFLD (n = 20) demonstrate
that the rs838133 A allele is associated with the severity of hepatic fibrosis140. Similarly, in
a study of 340 Chinese individuals without diabetes, the rs838133 minor allele was
associated with increased ALT levels141. An exome-chip association analysis in 5,169
Chinese individuals also showed that a missense variant of GCKR, rs1260326
(p.Pro446Leu), is associated with circulating FGF21 levels142. Likewise, variants in the gene
encoding cholesterol 7 α-hydroxylase (CYP7A1), a rate-limiting enzyme in bile acid
synthesis, were associated with increased levels of LDL cholesterol, the risk of myocardial
infarction and gallstone disease, but not liver fat143. To summarize, although not
well-defined, current data indicate that genetic variants in genes regulating bile acid
homeostasis have diverse metabolic effects, including effects on NAFLD.
The human microbiome has been implicated in complex human traits including
NAFLD137,144. In the past few years, a role for human genetic variation in influencing
inter-individual differences in gut microbiomes has been revealed145–147. Whether genetic
variants that shape microbiome composition have a role in NAFLD and other metabolic
traits is still unclear. However, a metagenome-wide association study demonstrated
common denominators and features of the gut microbiome in patients with
atherosclerotic CVD, obesity, T2DM and liver cirrhosis, but not in patients with
autoimmune diseases such as rheumatoid arthritis. Both cirrhosis and atherosclerotic CVD
demonstrated less fermentative and more inflammatory microbiota signatures148.
In addition, a multi-trait GWAS of macronutrient intake identified 12 novel loci for
increased dietary intake of carbohydrate, fat and protein, which showed overlap with the
risk loci for T2DM and CAD149. This association awaits investigation in NAFLD and other
cardiometabolic diseases. Finally, altered retinol metabolism in the lipid droplets of HSCs
and/or hepatocytes has been implicated in the pathogenesis of NAFLD and NAFLD-related
fibrosis150. This finding can explain, at least partially, the influence of the PNPLA3 and
HSD17B13 variants on liver fibrosis33,34,45,151. Similarly, another study demonstrated
alterations in the hepatic expression of genes involved in retinol metabolism in NASH150.
Collectively, these data suggest a role for altered retinol homeostasis in the pathogenesis
of NAFLD, which requires further research.
Although current data are suggestive, further work is required to better characterize the
role of risk variants in other metabolic pathways that are relevant to the pathogenesis of
NAFLD.

Translational implications and challenges
Nonlinear and complex interactions among multiple factors that occur simultaneously at
several molecular levels in diverse cell types, tissues and organs probably shape the
ultimate outcome of complex diseases, including NAFLD. In this context, studies of
pleiotropy provide a paradigm for novel genetic discovery to both validate GWAS findings
and identify novel associations. In turn, this inquiry should help generate new hypotheses
and help elucidate, in greater depth, shared and differential pathogenic mechanisms in
complex diseases and untangle causality between related traits. Such investigation will
also enable vertical forms of pleiotropy (the genetic variant has cross-phenotype effects
via one biological pathway) to be distinguished from horizontal forms (the genetic variant
has multiple separate phenotypic effects via divergent biological pathways)152. Although
still in its infancy, integration of these genetic data with inputs across several -omics,
including the epigenome, transcriptome, metabolome, proteome and microbiome,
so-called multi-omics approaches, holds great promise as we move into the era of
precision medicine.
Complex diseases such as NAFLD are probably a heterogeneous phenotype of multiple
other sub-phenotypes. Genetic information could therefore aid disease reclassification to
reflect functional consequences and thereby help improve clinical management. For
example, a study using high-dimensional electronic medical records and genotype data
from 11,210 individuals identified three distinct subgroups of T2DM, with distinct clinical
phenotypes153. Another study identified five clusters of T2DM loci and traits, one of which
disrupts liver lipid metabolism (low triglyceride production)154. Similarly, genetic adiposity
variants have been classified based on their co-association with BMI and waist-to-hip
ratio, and have been used to differentiate four subtypes of anthropometry and modes of
fat deposition155. The subtypes were found to have distinct metabolic implications and to
distinguish metabolically unfavourable and metabolically favourable adiposity with high
precision155. Such data support the use of genetics to deconstruct NAFLD heterogeneity
and its complex relationship with other metabolic disorders. As the number of known
NAFLD-associated variants increases, the construction of polygenic scores will also enable
more accurate measures of NAFLD genetic susceptibility that can help identify relevant
subgroups, and possibly predict disease progression and regression. In addition, such
information should help in the prevention and treatment of disease, as well as in
predicting adverse reactions to drugs.
Another consequence of our current understanding of genetic influences on disease is
the knowledge that focusing on the effect of variants in a single disease might be
inadequate. Accumulating evidence exists that many variants display strong associations
in opposite directions (divergent effects) with multiple traits. This type of information is
especially salient in the context of identifying therapeutic targets and considering their
potential for predicting function-based adverse events. For example, global inhibition of
TM6SF2 to decrease serum lipid levels might prevent CVD, but would be expected to result
in hepatic steatosis and liver disease. Similarly, although non-targeted inhibition of PNPLA3
could limit hepatic steatosis and its consequences, concomitant inverse associations,
including high plasma cholesterol levels and acne, and perhaps CVD, might be expected, and

require long-term cohort studies of at-risk populations. Thus, balancing the benefit-to-risk
ratio will be critical as we move forward with the new discoveries.
On the basis of current genetic knowledge, it can be argued that understanding
pleiotropic effects is pivotal when considering novel therapeutic targets. Another
observation from the data is that most genes, when targeted, will have pleiotropic effects
(that is, adverse effects). Future research should, therefore, aim to unravel underlying
shared and differential molecular pathways in different tissues and diseases, to enable
therapeutic interventions to be better tailored to individual risk, and molecular and
clinical profiles. The latter is possible given the current plethora of studies of targeted
human gene editing and tissue-specific drug targeting156,157. Extending the investigation
of pleiotropy to human traits also holds the potential to uncover additional associations
(Fig. 5).

[Figure 5: flow diagram. GWAS (control versus cases; SNPs), followed by cellular studies,
preclinical models, human and mouse networks and human physiology, leading to a causal
gene and disease pathway. A PheWAS for the variant in the causal gene reveals pleiotropy
(plot of –log P against phenotypes; individual phenotype labels not recoverable). Outputs:
accurate patient stratification (clusters 1–3 of associated phenotypes), target validation
guided by the human genetic data (efficacy and toxicity against target perturbation, low to
high) and targeted personalized therapy.]
Fig. 5 | An illustration of how human genetics can guide the efforts to achieve
personalized medicine. A genome-wide association study (GWAS) into the heritability of a
phenotype of interest (for example, a metabolic trait) has identified a number of variants
that are associated with disease risk. Following this initial step, the key is to prioritize the
variants and identify underlying causal mechanisms. The process comprises an array of
approaches including utilizing in vitro cellular and preclinical animal models and the
construction of human and mouse gene interaction networks. Such a process can reveal
the physiological and pathological basis for the effects mediated by a risk allele.
Integration of this information helps proper stratification of complex heterogeneous
phenotypes into sub-clusters. Once a causal gene and pathway have been identified, the
encoded protein can be investigated as a potential therapeutic target. Revealing
pleiotropy through phenome-wide association (PheWAS) data additionally helps
accelerate the process of drug development and/or drug repurposing and the prediction
of probable adverse or beneficial effects of the new targets. NAFLD, nonalcoholic fatty
liver disease; SNP, single-nucleotide polymorphism.

Although there remain numerous benefits to investigating human genetic pleiotropy,
the field also faces multiple challenges, including in phenomics, standardizing
phenotyping, and integrating the wealth of genomic and phenomic data available for any
particular individual. Statistical innovation and the development of novel computational
tools for handling the massive genotyped data sets, deep learning and data mining are
other challenges. Increasing cross-disciplinary collaborations to enable multi-trait studies
with enhanced power and larger sample sizes of the cohorts under investigation will
additionally provide the opportunity to better leverage large-scale analysis of multi-traits.
Looking to the future, perhaps advances in our understanding of the genetics of complex
diseases will offer a rich therapeutic inference toolbox.

The trajectory in the prevalence of NAFLD suggests that it is the liver disease of this
century. NAFLD has a high heritable component and its genetic underpinnings are
beginning to be characterized, although missing heritability still exists. A body of
literature suggests that NAFLD shares genetic associations, with some being causal, with
other liver diseases and other metabolic traits and disorders. In the next few years the
shared genetics will be better characterized, leading to pivotal translational outcomes for
both drug development and drug repurposing, as well as for personalization of medicine.

Phenomics
The systematic study of phenomes, a set of various phenotypes.

Published online xx xx xxxx


