Key Points
- The architecture of inherited genetic susceptibility to cancer is defined by a spectrum of predisposition alleles that have differing frequencies and impact.
- Genome-wide association studies (GWAS) provide an agnostic approach to the identification of genetic variation influencing cancer risk. For most cancers, GWAS have been performed, and hundreds of risk alleles have been identified, most of which are common and individually confer a modest increase in risk.
- Most cancer risk loci identified through GWAS locate to non-coding regions of the genome and influence gene expression through diverse mechanisms.
- As well as improving our understanding of cancer, information from GWAS has direct clinical relevance in identifying nongenetic aetiological risk factors, optimising population screening, identifying therapeutic targets, drug repositioning and prognostication.
- Although challenging, deciphering the biological basis of identified associations is necessary to fully realise the potential of GWAS.
Abstract
Genome-wide association studies (GWAS) provide an agnostic approach for investigating the genetic basis of complex diseases. In oncology, GWAS of nearly all common malignancies have been performed, and over 450 genetic variants associated with increased risks have been identified. As well as revealing novel pathways important in carcinogenesis, these studies have shown that common genetic variation contributes substantially to the heritable risk of many common cancers. The clinical application of GWAS is starting to provide opportunities for drug discovery and repositioning as well as for cancer prevention. However, deciphering the functional and biological basis of associations is challenging and is in part a barrier to fully unlocking the potential of GWAS.
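As a rough illustration of what testing a single variant in a case-control GWAS can look like, the sketch below computes a per-allele odds ratio and a 1-degree-of-freedom Pearson chi-square test from allele counts. It is a minimal sketch under stated assumptions, not the method of any study summarised here: the function name `allelic_association` and all counts are hypothetical, and real analyses typically use regression models with adjustment for population stratification.

```python
import math


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (odds_ratio, chi2, p_value) for a 2x2 table of allele counts.

    Hypothetical helper for illustration only. Rows are cases/controls,
    columns are risk (alt) and reference alleles.
    """
    # Per-allele odds ratio: odds of carrying the risk allele in cases
    # divided by the odds in controls.
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)

    # Pearson chi-square for a 2x2 table, without continuity correction.
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = (
        (case_alt + case_ref)
        * (ctrl_alt + ctrl_ref)
        * (case_alt + ctrl_alt)
        * (case_ref + ctrl_ref)
    )
    chi2 = numerator / denominator

    # Upper-tail probability of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return odds_ratio, chi2, p_value


# Made-up counts: 1,000 cases and 1,000 controls (2,000 alleles each).
odds_ratio, chi2, p_value = allelic_association(600, 1400, 500, 1500)
print(f"OR = {odds_ratio:.3f}, chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

With these invented counts the odds ratio is modest (below 1.3) and the p-value falls well short of the widely used genome-wide significance convention of 5 × 10^-8, which is one way to see why detecting common alleles of modest effect requires very large sample sizes.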
Acknowledgements
The authors are grateful to Cancer Research UK for support. A.S. is in receipt of a clinical training fellowship from Cancer Research UK.
Author information
Contributions
A.S., B.K. and R.S.H. researched data for the article, made substantial contributions to discussions of the content, wrote the article and reviewed and/or edited the manuscript before submission.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Figure 1
Cancer susceptibility loci identified through genome-wide association studies. (DOC 804 kb)
Supplementary Table 1
Cancer GWAS loci identified as of March 2017 (XLSX 184 kb)
Glossary
- Heritability
-
An estimate of the proportion of variation in a trait in a given population that is due to genetic variation. In particular, narrow-sense heritability is the proportion of variance in a trait due to additive genetic factors, whereas broad-sense heritability is the proportion of variance in a trait due to all genetic factors (for example, including dominance and gene–gene interactions).
- Relative risk
-
(RR). The ratio of disease occurrence in one group versus another (for example, cancer risk in patient relatives compared with in the general population). The RR estimate associated with common risk alleles identified through genome-wide association studies (GWAS) is usually a per-allele RR (as in the codominant log-additive genetic model).
- Mendelian predisposition
-
A phenomenon that occurs when a germline mutation in a single gene (for example, a cancer susceptibility gene) is sufficient to cause cancer in a majority of patients (for example, female carriers of BRCA1 mutations have an ~80% lifetime risk of developing breast cancer). Predisposition can be dominant or recessive, caused by monoallelic or biallelic mutations, respectively.
- Cancer susceptibility genes
-
(CSGs). Genes in which inherited mutations (commonly high-penetrance) predispose individuals to cancer.
- Penetrance
-
The proportion of individuals carrying a particular allele (for example, in a cancer susceptibility gene) who go on to develop cancer. High-penetrance mutations confer a high risk of causing cancer, whereas low-penetrance polymorphisms confer a low risk.
- Risk allele frequency
-
The frequency of a risk allele (B) in a given population at a biallelic site with a non-risk allele (A), derived from genotype counts through the following formula: (2 × BB + AB)/(2 × (AA + AB + BB)).
- Odds ratios
-
(ORs). The ratio of the odds that an outcome will occur given a particular exposure to the odds of that outcome in the absence of the exposure (for example, comparing variant site allele frequency in patients with cancer and in controls).
- Linkage disequilibrium
-
(LD). The nonrandom association of alleles at different sites in a given population. Alleles in high LD occur together more often than would be expected if they were inherited independently. LD can be affected by factors such as natural selection and genetic drift as well as by rates of mutation and recombination.
- Effect size
-
A quantitative measure of the strength of an association between two variables (for example, single nucleotide polymorphism (SNP) genotype and cancer risk).
- Pleiotropy
-
A phenomenon that occurs when a risk locus is associated with multiple phenotypic traits. In some cases, the same variant is presumed to influence multiple traits, while in other cases, different traits map to distinct locations within the risk locus.
- Population attributable risk
-
The number of cases of disease among exposed individuals that can be attributed to that exposure (for example, carriers of a particular risk single nucleotide polymorphism (SNP)).
- Genome-wide complex trait analysis
-
(GCTA). A computational method by which the narrow-sense heritability of a trait can be estimated through case–control genotypes from genome-wide association studies (GWAS) and estimates of trait incidence.
- Fine-mapping
-
A process of refining association signals from genome-wide association studies (GWAS) and prioritising likely causative variants (for example, through in silico annotations of putative functional effects).
- Hi-C analysis
-
A form of chromosome conformation capture in which crosslinked DNA fragments are sequenced in order to infer the three-dimensional structure of the genome and to identify potential regulatory interactions.
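Several of the glossary quantities above (risk allele frequency, odds ratio and linkage disequilibrium) reduce to short calculations. The following Python is a minimal sketch under stated assumptions, not a method taken from the article: the function names are hypothetical, the genotype counts are invented purely for illustration, the odds ratio shown is the simple allelic form (risk allele B versus non-risk allele A in cases versus controls), and LD is summarised with the commonly used r² statistic computed from haplotype and allele frequencies.

```python
def allele_counts(n_aa, n_ab, n_bb):
    """Return (count of risk allele B, count of non-risk allele A)."""
    return 2 * n_bb + n_ab, 2 * n_aa + n_ab


def risk_allele_frequency(n_aa, n_ab, n_bb):
    """Glossary formula: (2 x BB + AB) / (2 x (AA + AB + BB))."""
    total = n_aa + n_ab + n_bb
    if total == 0:
        raise ValueError("no genotypes supplied")
    return (2 * n_bb + n_ab) / (2 * total)


def allelic_odds_ratio(cases, controls):
    """Odds of carrying allele B in cases divided by the odds in controls.

    `cases` and `controls` are (AA, AB, BB) genotype-count tuples.
    """
    case_b, case_a = allele_counts(*cases)
    ctrl_b, ctrl_a = allele_counts(*controls)
    return (case_b * ctrl_a) / (case_a * ctrl_b)


def ld_r_squared(hap_freq, p_first, p_second):
    """r^2 between two biallelic sites.

    hap_freq is the frequency of the haplotype carrying both alleles of
    interest; p_first and p_second are the two allele frequencies.
    D = hap_freq - p_first * p_second measures departure from independence.
    """
    d = hap_freq - p_first * p_second
    return d * d / (p_first * (1 - p_first) * p_second * (1 - p_second))


# Illustrative (invented) genotype counts, ordered as (AA, AB, BB).
example_cases = (360, 480, 160)
example_controls = (490, 420, 90)

if __name__ == "__main__":
    print("RAF in cases:", risk_allele_frequency(*example_cases))
    print("RAF in controls:", risk_allele_frequency(*example_controls))
    print("Allelic OR:", allelic_odds_ratio(example_cases, example_controls))
    print("r^2, perfectly correlated sites:", ld_r_squared(0.5, 0.5, 0.5))
    print("r^2, independent sites:", ld_r_squared(0.25, 0.5, 0.5))
```

The allelic odds ratio treats each chromosome as an independent observation, which is a simplification; association analyses in practice typically fit a per-allele (log-additive) model with covariate adjustment.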
About this article
Cite this article
Sud, A., Kinnersley, B. & Houlston, R. Genome-wide association studies of cancer: current insights and future perspectives. Nat Rev Cancer 17, 692–704 (2017). https://doi.org/10.1038/nrc.2017.82
DOI: https://doi.org/10.1038/nrc.2017.82