Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Identification of common genetic variants influencing spontaneous dizygotic twinning and female fertility Hamdi Mbarek,1,21,* Stacy Steinberg,2,22 Dale R. Nyholt,3,4,22 Scott D. Gordon,4 Michael B. Miller,5 Allan F. McRae,6,4 Jouke Jan Hottenga,1,21 Felix R. Day,20 Gonneke Willemsen,1,21 Eco J. de Geus,1,21 Gareth E. Davies,7 Hilary C. Martin,8 Brenda W. Penninx,9 Rick Jansen,9 Kerrie McAloney,4 Jacqueline M. Vink,1 Jaakko Kaprio,10 Robert Plomin,11 Tim D. Spector,12 Patrik K. Magnusson,13 Bruno Reversade,14 R. Alan Harris,15 Kjersti Aagaard,15 Ragnar P. Kristjansson,2 Isleifur Olafsson,16 Gudmundur Ingi Eyjolfsson,17 Olof Sigurdardottir,18 William G. Iacono,5 Cornelis B. Lambalk,19 Grant W. Montgomery,4 Matt McGue,5 Ken K. Ong,20 John R.B Perry,20 Nicholas G. Martin,4 Hreinn Stefánsson,2 Kari Stefánsson,2 and Dorret I. Boomsma,1,21,* 1 Department of Biological Psychology, Vrije U Amsterdamniversiteit, 1081BT Amsterdam, The Netherlands. 2 deCODE Genetics, 101 Reykjavik, Iceland. 3 Institute of Health and Biomedical Innovation, Queensland University of Technology, 4059 Brisbane, Australia. 4 QIMR Berghofer Medical Research Institute, 4029 Brisbane, Australia. 5 Department of Psychology, University of Minnesota, Minneapolis, MN 55455, USA. 6 Queensland Brain Institute, The University of Queensland, 4072 Brisbane, Australia. 7 Avera Institute for Human Genetics, Sioux Falls, SD 57108, USA. 8 Wellcome Trust Centre for Human Genetics, OX3 TBN Oxford, United Kingdom. 9 Department of Psychiatry and EMGO Institute for Health and Care Research, Vrije U Universiteity, Medical Center/GGZ inGeest, 1081HL Amsterdam, The Netherlands. 10 Institute for Molecular Medicine Finland FIMM, University of Helsinki; Department of Public Health, University of Helsinki; National Institute for Health and Welfare, FI-00014 Helsinki, Finland. 1 11 Medical Research Council Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King's College London, SE5 8AF London, United Kingdom. 12 Department of Twin Research & Genetic Epidemiology, King's College London, SE1 7EH London, United Kingdom. 13 Department of Medical Epidemiology and Biostatistics, Karolinska Institute, SE-171 77 Stockholm, Sweden. 14 Institute of Medical Biology, A*STAR, 138648 Singapore, Singapore. 15 Division of Maternal-Fetal Medicine, Departments of Obstetrics and Gynecology, Molecular and Human Genetics, and Molecular and Cell Biology, Baylor College of Medicine and Texas Children’s Hospital, Houston, TX 77030, USA. 16 Department of Clinical Biochemistry, Landspitali- The National University Hospital of Iceland, 101 Reykjavik, Iceland. 17 Icelandic Medical Center (Laeknasetrid), Laboratory in Mjodd (RAM), Reykjavik, Iceland. 18 Department of Clinical Biochemistry, Akureyri Hospital, 109 Akureyri, Iceland. 19 Department of Obstetrics, Gynecology and Reproductive Medicine, VU University Medical Center, 1007MB Amsterdam, The Netherlands. 20 MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Institute of Metabolic Science, Cambridge Biomedical Campus, CB2 0QQ Cambridge, United Kingdom. 21 EMGO+ Institute for Health and Care Research, 1081BT Amsterdam, The Netherlands. 22 These authors contributed equally to this study *Correspondence to h.mbarek@vu.nl (H.M, @drHamdiMbarek), di.boomsma@vu.nl (D.I.B) 2 Abstract Spontaneous dizygotic (DZ) twinning occurs in 1-4% of women, with familial clustering and unknown pathophysiological pathways and genetic origin. DZ twinning may index increased fertility and has distinct health implications for mother and child. We performed the first a GWA study in 1,980 mothers of spontaneous DZ twins and 12,953 controls. Findings were replicated in a large Icelandic cohort and tested for association across a broad range of fertility traits in women. Two SNPs were identified (rs11031006 near FSHB, p = 1.54 × 10-9, and rs17293443 in SMAD3, p = 1.57 × 10-8) and replicated (p = 3 × 10−3 and p = 1.44 × 10−4, respectively). Based on ~90,000 births in Iceland, the relative risk of a mother delivering twins increased by 18% for each copy of the allele rs11031006-G, and 9% for the rs17293443-C. A higher polygenic risk score (PRS) for DZ twinning, calculated based on the results of the DZ twinning GWAS, was significantly associated with DZ twinning in Iceland (p = 0.001). A higher PRS was also associated with having children (p = 0.01), greater lifetime parity (p = 0.03) and earlier age at first child (p = 0.02). The Allele rs11031006-G was associated with higher serum FSH levels, earlier age at menarche, earlier age at first child, higher lifetime parity, lower PCOS risk, and earlier age at menopause. Conversely, the rs17293443-C was associated with later age at last child. We identified the first robust genetic risk variants for DZ twinning: one near FSHB, and a second within SMAD3,—the product of which plays an important role in gonadal responsiveness to FSH. These loci contribute to crucial aspects of reproductive capacity and health. 3 1 Introduction 2 DZ twinning (MIM: 276400) is defined as the concomitant conception and 3 development of two independent zygotes during one pregnancy. Mothers of spontaneous 4 (conceived without assisted reproductive technology) DZ twins have a predisposition to 5 multiple ovulation events due to interference with single dominant follicle selection, a 6 biological mechanism fundamental for the human species.1 DZ twinning is common, with 7 large regional differences from six per 1000 births in Asia to 40 per 1000 births in Africa,2 8 whereas monozygotic twinning occurs around the world at a constant rate of around three to 9 four per 1000 births.3; 4 DZ twinning rates also vary substantially over time. For example, in 10 the US, the observed incidence of twin births increased by a factor of 1.9 between 1971 and 11 2009.5 While a considerable part of the increase is attributable to fertility treatments, with an 12 estimate of 36% of all twins born in the USA in 2011 resulting from assisted reproduction, the 13 majority of twins are still conceived spontaneously. In addition to fertility treatments, 14 increases in maternal age contribute to increases in twinning incidence.3; 6 15 Twinning is associated with increased risks to mother and offspring, including higher 16 risks of stillbirth, neonatal death and premature birth. Compared to singleton children, twins 17 use more hospital resources, especially during the first year of life,7 with hospital costs in the 18 first five years of life being as much as 3·3-fold higher.8 19 Mothers of DZ twins differ from other women in that they are taller, have an increased 20 BMI, are more often overweight and more often smoke before the twin pregnancy.9 Family 21 history, increased parity and gravidity all increase the risk of spontaneous DZ twinning.1; 3 22 Remarkably, twinning rates do not reflect average nutritional status of a population, as 23 established from longitudinal studies in countries that experienced periods of starvation, such 24 as the Dutch hunger winter. Above a specific, yet undetermined, threshold, nutrition seems to 25 be of minimal importance for reproduction in general and also for twinning.10 These 4 26 observations all point to spontaneous twinning being a heritable trait and suggest the potential 27 for polygenic inheritance. In a landmark study of twinning based on data from genealogic 28 records from Utah,11 White and Wyshak established that the genotype of the mother, but not 29 that of the father, affects the frequency of DZ twinning. 30 The underlying physiological mechanism for DZ twinning is the release and 31 fertilization of two or more oocytes. Ovarian folliculogenesis and determination of ovulation 32 quota are controlled both by circulating concentrations of follicle-stimulating hormone (FSH) 33 and by intra-ovarian factors including the two oocyte growth factors, GDF9 (MIM: 601918) 34 and BMP15 (MIM: 300247), as well as their cognate receptors.12 In the common marmoset 35 monkey (subfamily Callitrichinae), DZ twins comprise the predominant litter size, and 36 singletons are rarely, if ever, observed. On DNA sequencing, specific nonsynonymous 37 substitutions were identified in GDF9, BMP15, BMP4 (MIM: 112262), and WFIKKN1 (MIM: 38 608021) as having a role in Callitrichine twinning.13 These genes are among a larger set of 63 39 candidate genes with a potential involvement in regulation of ovulation number and/or control 40 of growth and body size.14 41 Efforts to characterize the genes that contribute to DZ twinning in humans have not 42 been successful. Candidate gene and genome-wide linkage studies failed to uncover common 43 variants associated with DZ twinning,15-19 although one study reported rare variants in GDF9 44 to be associated with DZ twinning.18 45 DZ twinning has been suggested as a measure of human fertility both at the individual 46 and at the population level.20 Spontaneous DZ twinning may be considered as a marker of 47 high fertility, as it reflects the frequency of double ovulation, the probability of coitus within 48 the appropriate time frame with fertilization of both ova, and maintenance of a multiple 49 pregnancy. 5 The aim of this study was to perform the first a genome-wide association study 50 51 (GWAS) in mothers of spontaneous DZ twins to identify relevant genomic regions and test 52 their significance across a broad range of female fertility and reproductive traits including age 53 at menarche, age at natural menopause, age at first and last child, and lifetime parity. Three 54 twin registries, from the Netherlands, Australia and Minnesota (USA) had detailed 55 information on spontaneous twinning in mothers of DZ twins, as well as genotype data. 56 Replication of top hits for twinning was possible in the Icelandic population and for other 57 measures of reproductive ageing in several large-scale population meta-analyses21-24. 58 59 Material and Methods 60 Details of methods and associated references can be found in the supplemental data. 61 Descriptions of Participating Studies 62 Netherlands Twin Register (NTR) 63 The NTR sample consisted of 806 cases and 4,535 controls from the Netherlands Twin 64 Register (2,776 participants) and the Netherlands Study of Depression and anxiety (NESDA; 65 2,565 participants). NTR participants were ascertained by the presence of liveborn twins or 66 triplets in the family and consist of multiples, their parents, siblings and spouses. Twins were 67 born in all strata of society and NTR represents a general sample from the Dutch population25- 68 28 . NESDA is a longitudinal study focusing on the course and consequences of depression and 69 anxiety disorders. Subjects for NESDA were recruited from the general population, mental 70 health organizations and general practices. The sample includes subjects selected for 71 depression and anxiety, as well as healthy controls29. Zygosity of twins was confirmed by 72 DNA genotyping. Data on mode of pregnancy were available from several data collection 73 waves including surveys sent out to mothers of twins, a survey to parents upon registration of 74 young twins, and telephone interview as part of a project on DZ twinning9. The comparison of 6 75 the survey data with the hospital records showed that mothers can accurately report on the 76 mode of conception of their twins30. Participants were excluded if they reported the use of 77 assisted reproductive technology at one or more occasions. In case no reports on mode of 78 pregnancy were available, data were excluded unless the twins were born prior to 1985. 79 QIMR Berghofer Medical Research Institute (QIMR) 80 The sample used in this analysis consisted of 606 cases and 6,656 controls. The individuals 81 were drawn from families containing (any type of) twins recruited for prior studies, either 82 from around Australia from the Australian Twin Registry (ATR) (generally twins born before 83 1971) or from south-eastern Queeensland (generally twins born after 1980). Study recruitment 84 was predominantly population based (any family where the twins were willing to participate) 85 with no screening performed on reproductive phenotypes apart from selecting families with 86 twins. Studies for the older cohort typically were focused on personality traits but not selected 87 for them. Zygosity of twins was reported at time of recruitment; during phenotyping studies; 88 and tested by genotyping (on SNP arrays or Sequenom assays). New phenotyping excluded 89 mothers (‘cases’) from Queensland cohort, who used Assisted Reproductive Technology 90 (ART–typically IVF or hormone treatment) to become pregnant. Screening questions asked of 91 the mother during clinical sessions for phenotyping, were the primary basis for excluding 92 ART cases. A smaller subset of mothers was contacted specifically to establish this 93 information where it was not otherwise available. For the older (ATR) cohort, at the time of 94 the twins’ birth, IVF was not yet in clinical use and other ART was rare. 95 Minnesota Center for Twin and Family Research (MCTFR) 96 All subjects in this sample were independently ascertained through vital records of the State 97 of Minnesota in an effort to construct a population-based twin registry31; 32. The sample for 98 the current study consisted of 568 mothers of DZ twins and 1,862 controls who were the 99 parents of MZ twins from 1,062 families, including 800 complete parental pairs, 203 mothers 7 100 and 59 fathers. Most of the twins were born in the 70s or early 80s, when even though fertility 101 treatment was available in the US, it was expensive and few had access to it. Genotyping was 102 population-based and independent of phenotypes other than twinning. About 92% of the 103 registry, and 100% of both case and control samples, are of primarily European ancestry. 104 Iceland (deCODE) 105 Mothers of twins or other multiples (“twins”) were selected from among those taking part in 106 deCODE’s genetic studies based on a nationwide genealogical database. To increase the 107 proportion of these mother of twins who were mothers of dizygotic twins, twins who had been 108 genotyped and shown to be monozygotic were not used to identify mothers. Controls were 109 individuals participating in deCODE’s genetic research from which both mothers of twins and 110 the mothers’ first-degree relatives had been removed. For the prediction of twinning using 111 polygenic risk scores, mothers having opposite sex or verified dizygotic twins were compared 112 with mothers who did not have twins. 113 114 Study Design, Genome-Wide Association Study and Replication 115 We established the Twinning GWAS Consortium (TGC) to characterize the genetic basis of 116 DZ twinning in humans and performed the first a genome-wide association study (GWAS) 117 for DZ twinning utilizing data from 1,980 mothers of DZ twins (MODZT) and 12,953 118 controls from The Netherlands, US and Australian European ancestry cohorts. Sample sizes 119 and study characteristics are described in table S1. All cases underwent screening to exclude 120 mothers who received assisted reproductive techniques. Controls were screened to exclude 121 pedigrees containing DZ twins. In the replication stage, significant findings from the meta- 122 analysis were tested in large Icelandic cohort of 3,597 mothers of twins and 297,348 controls. 123 All participants provided written informed consent, including consent for genotyping and 8 124 analysis, and were recruited according to the protocols approved by the institution review 125 board of each institution. 126 127 Fertility Traits Measures 128 Fertility measures (having children, number of children, age at first and last child and average 129 birth interval) were defined in females born before 1970 based on the Icelandic genealogy. 130 Data for age at menarche, age at menopause and polycystic ovary syndrome were derived 131 from previously published GWAS consortia21-24. Sample sizes and study characteristics are 132 described in table 2. 133 134 Genotyping, Quality Control and Imputation 135 Each participating cohort performed participant-level genotyping of single nucleotide 136 polymorphisms (SNPs), that included standard quality-control measures for genotyping and 137 imputation (Table S21 and Supplemental Methods). All imputations were performed using the 138 1000 Genomes Project March 2012 release as the reference panel. 139 Statistical analysis 140 GWA analysis was performed in each cohort using logistic regression under an additive 141 genetic model with adjustment for principal components of genetic ancestry and specific 142 covariates for each study. GWA results for each SNP (odds ratio [OR] and 95% confidence 143 interval [CI]) were combined across cohorts using fixed-effect meta-analysis with inverse 144 variance weighting. The most significant SNP at each genome-wide significant (p<5×10-8) 145 locus was tested in the replication stage and replicating SNPs were examined for association 146 with FSH levels and other fertility traits. 147 Association Tests 9 148 Each genome-wide association analysis from the three cohorts was conducted using logistic 149 regression under an additive genetic model with adjustment for principal components of 150 genetic ancestry. Because the GWAS data include family members we added the –family 151 option in the analysis, which takes the familial structure of the data into account using a 152 sandwich estimator33. Imputed SNPs were analyzed using PLINK software34 and genotype 153 imputation uncertainty was accounted for by using allelic dosage in PLINK. 154 Meta-analysis was performed using the fixed-effects inverse variance method based on the 155 regression β estimates and standard errors from each study implemented in METAL35. The 156 presence of heterogeneity between cohorts for the effect sizes of risk alleles was investigated 157 using the Cochran’s Q-test as implemented in METAL. To determine whether the genome- 158 wide significant signal at each locus with low LD in the same chromosomal region (defined 159 as r2 < 0.05 in a 750-kb region) could be accounted for by a single SNP, we carried out 160 conditional analysis. Each cohort performed a genome-wide analysis for MODZT using 161 logistic regression adjusting for the top signal at each of the three associated regions to 162 determine whether potential second signals remained significant even after adjusting for these 163 variants. Results from each individual study were meta-analyzed to determine whether these 164 potential second signals were truly independent (that is, if p < 5 × 10-8). In Iceland, being a 165 mother of twins was tested for association with the top alleles using logistic regression and 166 including age, age-squared and county of birth as covariates as described previously36. 167 Associations between FSH level and genotype was assessed using linear regression as 168 described previously36. To study the association of fertility measures and SNP genotype, we 169 used logistic regression (having children), Poisson log-linear regression (number of children), 170 or linear regression (age at first child, age at last child, and average birth interval). In these 171 analyses, birth cohort (as a factor for each five year interval), county of birth and six principal 172 components were included as covariates. Relatedness was controlled for by using genomic 10 173 control in all Icelandic association analyses37. The combined p value of the meta-analysis and 174 the replication study was calculated using Fisher’s combined probability test 38. 175 176 FSH Serum Level Measure 177 Serum levels of follicle-stimulating hormone (FSH) were measured in 2,411 men (1,275 178 genotyped persons; 1,136 close relatives of genotyped individuals) and 15,586 women (9,738 179 genotyped persons; 5,848 close relatives of genotyped individuals) referred to three clinics in 180 Iceland. FSH testing was undertaken primarily to investigate possible gonad impairment. 181 Hormone levels were measured by electrochemiluminescence immunoassay, using reagents 182 and analytical instruments from Roche Diagnostics GmbH, according to the manufacturer’s 183 instructions. 184 185 186 In silico Functional Annotation We used a number of publicly available bioinformatics tools and datasets to identify putative 187 functional effects of the top associated SNPs at each locus, including: Combined Annotation 188 Dependent Depletion (CADD)39, HaploReg2 v440 and Variant effect predictor (VEP)41 . 189 190 Gene Based Test 191 Results of the MODZT meta-analysis were used to perform a gene-based test of association 192 for the 63 candidate genes from Harris et al. using the Knowledge-based mining system for 193 Genome-wide Genetic studies (KGG) software Version 3.542; 43. This approach uses an 194 extended Simes test that integrates prior functional information and the meta-analysis 195 association results when combining the SNP p values within a gene in order to obtain an 196 overall association p value for each entire gene. As we tested for genetic association for 63 197 genes, the significance level was set at 7.93 × 10-4 (Bonferroni correction; 0.05/63). 11 198 199 Polygenic Risk Scores (PRS) 200 Polygenic risk scores were calculated based on the results of the MODZT meta-analysis. Only 201 markers having info > 0.9 in all groups and MAF > 0.01 were included. To obtain effect sizes 202 taking LD into account, the LDpred method developed by Vilhjálmsson and colleagues was 203 used44. As suggested by Vilhjálmsson et al, we calculated multiple sets of LD-modified effect 204 sizes based on a grid of values for the fraction of causal markers (α = 0.0001, 0.0003, 0.001, 205 0.003, 0.01, 0.03, 0.1, 0.3, 1). The resulting scores were then tested in a validation data set of 206 Icelandic mothers of dizygotic twins. The score producing the most significant result in the 207 validation data set were subsequently used to test for association with five fertility-related 208 traits (“has children”, “number of children”, “age at first child”, “age at last child” and 209 “average birth interval”). 210 211 Results 212 Genome-wide association results, replication and gene based analyses 213 The overall GWAS meta-analysis genomic control statistic (λ) was 1.01, indicating no 214 appreciable inflation due to population structure. The quantile-quantile (Q-Q) plot of genome- 215 wide p values showed a strong deviation from the null hypothesis of no association (Figure 216 S1). The results are represented in the Manhattan plot (Figure 1). Three chromosomal regions 217 contained genome-wide significant SNPs (p < 5 × 10-8). Twenty-two SNPs on chromosome 218 11p13 showed genome-wide significant associations. Of these, the strongest signal 219 (rs11031006, p = 1.54 × 10-9 and OR 1.41, 95% CI 1.29-1.53) lies in the region 5’ of the 220 transcription start site of FSHB (encoding the FSH, beta polypeptide, [MIM: 136530]). A 221 second locus was represented by five SNPs on 1q42.13 covering an intergenic region flanked 222 by PGBD5 (MIM: 616791) and COG2 ([MIM: 606974]), (p = 1.23 × 10-8, OR 1.43, 95% CI 12 223 1.31-1.55 for strongest signal, rs12064669). A third locus on 15q22.33, mapped to the first 224 intron of SMAD3 (SMAD family member 3, [MIM: 603109]) (rs17293443, p = 1.57 × 10-8 225 and OR 1.27, 95% CI 1.19-1.35). No significant heterogeneity in SNP effects was observed 226 across cohorts for the top SNPs (p > 0.1, Cochran’s Q test) ; Tables 1 3 and S23). The 227 regional association plots for these loci are shown in figure S2. After conditioning on the top 228 SNPs at each locus, no secondary signals were observed (all p > 0.05; Tables S43, S54 and 229 S65). We sought validation of these three top signals in an independent replication study from 230 Iceland (deCODE) totaling 3,597 mothers of twins and 297,348 controls. The FSHB 231 (rs11031006, p = 3 × 10−3, OR 1.14, 95% CI 1.06-1.22) and SMAD3 (rs17293443, p = 1.44 × 232 10−4, OR 1.15, 95% CI 1.07-1.23) loci replicated, but rs12064669 (p=0.88) was not confirmed 233 (Table 13 and Figure S32). We also investigated whether any of the proposed 63 candidate 234 genes14 was associated with human DZ twinning. In a gene-based test, five genes 235 demonstrated a nominally significant association (p < 0.05; BMPR1A [MIM: 601299], 236 BMPR1B [MIM: 603248], IGF1 [MIM: 147440], FSHB and FSHR [MIM: 136435]). After 237 correction for multiple testing, only FSHB remained significant (Table S76). 238 239 Functional in silico annotation of associated variants 240 We explored plausible functional effects of our associated variants using Combined 241 Annotation Dependent Depletion (CADD).39 The FSHB SNP rs11031006 had a high Phred 242 scaled C-score (22.4), indicating that it is among the top 1% of SNPs in the human genome 243 most likely to have a functional effect. The Phred score for the SMAD3 SNP rs17293443 was 244 only 2.71 indicating that it is among the bottom 50% of SNPs in the human genome likely to 245 have a functional effect (Table S107). Examination of individual constituents of the CADD 246 scores showed particularly high conservation-based scores for rs11031006 (Figure S63). To 247 further investigate possible functional effects we examined data from the ENCODE project. 45 Field Code Changed 13 Field Code Changed 248 The FSHB SNP rs11031006 alters the sequence of 11 protein-binding motifs including that of 249 the Estrogen receptor alpha, indicating a possible effect on hormonal feedback inhibition 250 (Table S118). Variant Effect Predictor (VEP)41 identified rs17293443 as a regulatory region 251 variant within a promoter flanking region (ENSR00000410126) (Figure S74). SNP 252 rs17293443 was contained in a DNase I hypersensitive site suggesting open chromatin, 253 although it did not alter any of the transcription factor binding sites present in the promoter 254 flanking region. Together, these data indicate that rs11031006 and rs17293443 may have 255 direct functional roles. 256 SNPs associated with FSH serum level 257 We analyzed serum FSH measurements from 17,997 genotyped Icelanders and their close 258 relatives. SNP rs11031006 was significantly associated with higher serum FSH levels (p = 2.3 259 × 10-10), with each G allele conferring an increase in FSH level of about 0.11 SD units (Figure 260 S54). Notably, the allele (rs11031006-G) conferring the strongest association with FSH levels 261 in the FSH GWAS is the same allele that conferred the greatest chance of having DZ twins in 262 our MODZT GWAS. No association was seen between the SMAD3 signal and serum FSH 263 levels (p = 0.30). 264 265 Relative risk of twin birth and polygenic risk score prediction of DZ twinning and 266 fertility measures 267 We estimated that the relative risk of a twin birth, based on approximately 90,000 births in 268 Iceland between 1950 and 1991, was increased by 18% for each maternal rs11031006-G 269 allele and by 9% for each rs17293443-C allele (Table S8). A higher polygenic risk score 270 (PRS) for DZ twinning, calculated based on the results of the DZ twinning GWAS, was 271 significantly associated with DZ twinning in our independent Icelandic cohort (p = 0.001) 272 (Table S7). A higher PRS was also associated with a higher likelihood of having children (p = 14 273 0.01), higher lifetime number of children (p = 0.03), and an earlier age at first child (p = 0.02) 274 (Table S98). A re-calculated PRS, excluding the 1 Mb regions surrounding the two replicated 275 variants, remained associated with DZ twinning (p = 0.02) and with the likelihood of having 276 children (p = 0.03). These results reflect the polygenic contribution to the susceptibility to DZ 277 twinning and its association with greater reproductive ability. 278 279 SNPs associated with female reproduction traits 280 Table 24 reports on the two loci robustly implicated in DZ twinning and other reproductive 281 traits in women. Consistent with its effects on higher circulating FSH levels, the rs11031006- 282 G allele is also associated with earlier age at menarche,22; 46 earlier age at first child and 283 higher total lifetime number of children, lower risk of polycystic ovary syndrome (PCOS),23 284 and earlier age at natural menopause.24; 47 Also, the DZ twinning SNP rs11031006 is 285 correlated with a variant (rs10835638, FSHB-211G>T) located in the promoter of FSHB (r2 = 286 0.62) that is associated with timing of breast development in girls.48 In contrast, the 287 rs17293443-C allele in SMAD3 was associated only with a later age at last child (Figure S65). 288 Functional in silico annotation of associated variants 289 We explored plausible functional effects of our associated variants using Combined 290 Annotation Dependent Depletion (CADD).38 The FSHB SNP rs11031006 had a high Phred 291 scaled C-score (22.4), indicating that it is among the top 1% of SNPs in the human genome 292 most likely to have a functional effect. The Phred score for the SMAD3 SNP rs17293443 was 293 only 2.71 indicating that it is among the bottom 50% of SNPs in the human genome likely to 294 have a functional effect (Table S10). Examination of individual constituents of the CADD 295 scores showed particularly high conservation-based scores for rs11031006 (Figure S6). To 296 further investigate possible functional effects we examined data from the ENCODE project.44 297 The FSHB SNP rs11031006 alters the sequence of 11 protein-binding motifs including that of Field Code Changed 15 Field Code Changed 298 the Estrogen receptor alpha, indicating a possible effect on hormonal feedback inhibition 299 (Table S11). Variant Effect Predictor (VEP)40 identified rs17293443 as a regulatory region 300 variant within a promoter flanking region (ENSR00000410126) (Figure S7). SNP rs17293443 301 was contained in a DNase I hypersensitive site suggesting open chromatin, although it did not 302 alter any of the transcription factor binding sites present in the promoter flanking region. 303 Together, these data indicate that rs11031006 and rs17293443 may have direct functional 304 roles. 305 Discussion 306 Here we report the first compelling evidence that sequence variation at the FSHB and 307 SMAD3 loci increases the odds of DZ twinning. A number of studies in mothers of DZ twins, 308 but not all,49 have found higher FSH levels responsible for multiple follicle growth.50 The 309 associations of FSHB rs11031006-G with earlier ages at breast development, menarche, 310 menopause, first child, and higher lifetime parity indicates that this locus plays an important 311 role in multiple reproductive aspects. Female carriers of rs11031006-G likely may have a 312 more advanced depletion of the ovarian follicular pool and hence would have an increased 313 risk of premature ovarian failure (POF, [MIM: 612964]). Indeed advanced ovarian aging is a 314 recognized feature of familial DZ twinning, with reported lower levels of anti-Mullerian 315 Hormone (AMH) a marker of lower ovarian primordial follicular reserve.51 The rs17293443- 316 C allele in SMAD3 also increases chances of DZ twinning, but this effect appears independent 317 of circulating FSH levels. 318 Of the 63 suggested candidate genes for twinning, only FSHB was associated in gene- 319 based tests after correcting for multiple testing. Emerging data in human and non-human 320 primates suggest that mechanisms underlying multiple ovulation may differ among species, 321 explaining why in some species twinning is accompanied by unique evolutionary adaptations 322 enabling offspring’s survival, such as the marmoset diminutive fetal size in a simplex uterus, 16 323 and subsequent alloparenting.14 Conversely, in humans, multiple gestations remain an 324 independent risk factor for preterm birth, pregnancy loss, and fetal growth restriction. Thus 325 loci common to species and strains with higher rates of DZ twinning will not necessarily be 326 shared, as the rate, complications, and future reproductive fitness of those twin gestations 327 differ. 328 The third genome-wide hit from the discovery was not replicated in Iceland. SNP 329 rs120644669 is located 149kb from the angiotensinogen (AGT, [MIM: 106150]), which 330 influences ovulatory capacity in mice,52 and 89kb from component of oligomeric golgi 331 complex 2 (COG2) affecting protein glycosilation, which regulates the biological activity of 332 the pituitary gonadotrophins.53 333 Genetic variants near FSHB (rs11031005 and rs11031002, which are highly correlated 334 with rs11031006) are reportedly associated not only with higher serum FSH, but also lower 335 LH levels.54 This agrees with the known response of the gonadotrophic cell to a high 336 frequency hypothalamic pulsatile GnRH signal that reciprocally controls secretion of both 337 hormones.55 Furthermore, under normal conditions suppression of FSH in the early follicular 338 phase and higher LH levels in late phase typically favor mono-ovulation in the human.56 In a 339 study aiming to identify genetic predictors for IVF success or IVF-controlled ovarian 340 stimulation (COS), rs611246 located in FSHB (r2 = 0.3 with rs11031006), was reported 341 significantly associated with measured early follicular phase FSH values and also with the 342 probability of clinical pregnancy, suggesting that these genetic variants are potential predictor 343 candidates that could be considered in clinical ovarian reserve and function assessment in 344 assisted reproduction.57 345 A recent linkage study in cattle reported only one strong signal (p < 1 × 10-28) for 346 ovulation rate in a region spanning SMAD3, SMAD6 (MIM: 602931) and IQCH (MIM: 347 612523).58 SMAD3 encodes one of a family of proteins that function as signal transducers and 17 348 transcriptional modulators that mediate multiple signaling pathways. Observations in mice 349 have established an essential role for Smad3 in mediating TGFbeta and activin signals in the 350 ovarian granulosa cell and also in the pituitary59 to maintain a favorable environment for 351 oocyte maturation.60 Smad3 is strongly expressed in the human ovary, where it promotes 352 granulosa cell proliferation and steroidogenesis possibly by upregulating gonadotrophin 353 receptor signaling pathways.61 Thus, sequence variation in SMAD3 may increase chance of 354 DZ twinning by increasing responsiveness to FSH. Understanding the role of SMAD3 will 355 offer novel opportunities to optimize responsiveness and minimize risk among assisted 356 reproduction technology (ART) recipients for example through adjustment of hormonal 357 stimulation and thus contributes to prevention of life-threatening ovarian hyper stimulation 358 syndrome in hyper responding female carriers of the rs17293443-C allele and conversely 359 prevention of a poor response in patients individuals with allelic variants that lead to a poor 360 response. 361 The potential applications of this work in reproductive medicine are multifold. Firstly, 362 it reveals a well-defined set of loci for DZ twinning in the general population. Secondly, 363 twinning is associated with common perinatal morbidities such as preterm birth, discordant 364 twin growth, latter prenatal asymmetric intrauterine growth restriction, and placental 365 abruption. Multiple gestations are also related to a higher prevalence of maternal morbidities 366 such as preeclampsia, postpartum hemorrhage, and ensuing complications. By understanding 367 the genetic basis of DZ twinning, we concomitantly identify loci conferring susceptibility (or 368 conversely resistance) to these prevalent perinatal comorbidities. The involvement of SMAD3 369 is of relevance in the light of a possible phenotype in male DZ twins namely the repeatedly 370 reported more frequent occurrence of testicular seminoma, a gonadal tumor in which the 371 mitogenic cyclin D2 is overexpressed.62; 63 In animal experimental studies knockout of 372 SMAD3 significantly attenuates cyclin D2 and tumor proliferation.64 This indicates to the 18 373 likely clinical potential of SMAD3 stretching beyond that of fertility physiology and 374 treatment. 375 This study The sequence variants found to influence spontaneous DZ twinning and 376 their relationship with other fertility-related measures, provides important insights into 377 ovarian functioning and the control of natural multiple follicle growth and reproductive aging. 378 This has important implications for fertility, including improved outcome prediction and 379 novel avenues of fertility treatment. Other strengths include the rigorous phenotype inclusion 380 of mothers of DZ twins with a documented history of spontaneous DZ twinning. It is worth 381 mentioning that analyses without excluding mothers who conceived their twins with hormone 382 induction of multiple ovulation or other ART, did not yield to any genome-wide significant 383 results (results not shown). We thus recommend twins registries collect data on mode of 384 pregnancy of twins. There are also some limitations. The current effort focused on unraveling 385 the genetic basis of DZ twinning in European-ancestry populations only. However, the 386 highest incidence of DZ twins was reported in the Nigerian population and some other 387 countries from Africa. Next steps in unraveling the genetic cause of DZ twinning need to 388 include mothers of DZ twins originating from these regions, and if possible also from regions 389 where DZ twinning is a rare trait, such as Japan2. 390 In summary, we identified FSHB and the novel SMAD3 locus as maternal 391 susceptibility loci for DZ twinning. These loci are also significantly and specifically 392 associated with several other aspects of reproductive capacity and health. 393 394 395 396 19 397 398 399 Supplemental Data Supplemental data include seven six figures and 11eight tables Acknowledgments 400 A list of support provided to individual studies appears in supplemental data. Support for the 401 Netherlands Twin Register was obtained from the Netherlands Organization for Scientific 402 Research (NWO) and The Netherlands Organisation for Health Research and Development 403 (ZonMW) grants, 904-61-193,480-04-004, 400-05-717, Addiction-31160008, 911-09-032, 404 Spinozapremie 56-464-14192, Biobanking and Biomolecular Resources Research 405 Infrastructure (BBMRI –NL, 184.021.007); the European Research Council (ERC-230374 406 and ERC-284167); Rutgers University Cell and DNA Repository (NIMH U24 MH068457- 407 06), the Avera Institute, Sioux Falls, South Dakota (USA) and the National Institutes of 408 Health (NIH R01 HD042157-01A1). Part of the genotyping was funded by the Genetic 409 Association Information Network (GAIN) of the Foundation for the National Institutes of 410 Health and Grand Opportunity grants 1RC2 MH089951). We acknowledge support from VU 411 Amsterdam and the Institute for Health and Care Research (EMGO+). The Berghofer Medical 412 Research Institute (QIMR) study was supported by grants from the National Health and 413 Medical Research Council (NHMRC) of Australia (241944, 339462, 389927, 389875, 414 389891, 389892, 389938, 443036, 442915, 442981, 496610, 496739, 552485, 552498, 415 1050208, 1075175). Dale R. Nyholt was supported by the Australian Research Council 416 (ARC) Future Fellowship (FT0991022), NHMRC Research Fellowship (APP0613674) 417 Schemes and by the Visiting Professors Programme (VPP) of the Royal Netherlands 20 418 Academy of Arts and Sciences (KNAW). Allan F. McRae was supported by an NRMRC 419 Career Development Fellowship (APP1083656). Grant W. Montgomery was supported by 420 NIH grant (HD042157, a collaborative study of the genetics of DZ twinning) and NHMRC 421 Fellowship (GNT1078399). The Minnesota Center for Twin and Family Research (MCTFR) 422 was supported in part by USPHS Grants from the National Institute on Alcohol Abuse and 423 Alcoholism (AA09367 and AA11886), and the National Institute on Drug Abuse (DA05147, 424 DA13240, and DA024417). 425 We would like to thank also 23andMe and 23andMe's consented research participants for 426 contributing data on age at menarche for the FSHB gene locus and the Twinning Gwas 427 Consortium (TGC) co-authros from: Finland (Anu Loukola, Juho Wedenoja, Emmi Tikkanen, 428 Beenish Qaiser), Sweden (Nancy Pedersen, Andrea Ganna), United kingdom King's College 429 London (Department of Twin Research & Genetic Epidemiology: Pirro Hysi, Massimo 430 Mangino), Institute of Psychiatry, Psychology & Neuroscience, Medical Research Council 431 Social, Genetic and Developmental Psychiatry Centre (Eva Krapohl, Andrew McMillan). S.S, R.P.K, H.S and K.S are employees of deCODE Genetics/Amgen. The other authors declare no competing financial interests. Web Resources 1000GenomesProject, ftp://ftp-trace.ncbi.nih.gov/1000genomes/ ftp/release/20110521/ CADD, http://cadd.gs.washington.edu/ HaploReg, http://www.broadinstitute.org/mammals/haploreg/haploreg.php Variant Effect Predictor (VEP), http://www.ensembl.org/info/docs/tools/vep/index.html Knowledge-based mining system for Genome-wide Genetic studies (KGG), http://grass.cgs.hku.hk/limx/kgg/ ENCODE, https://genome.ucsc.edu/ENCODE/ OMIM, http://www.omim.org/ 21 LDpred, https://bitbucket.org/bjarni_vilhjalmsson/ldpred MACH, http://www.sph.umich.edu/csg/abecasis/MACH/ Minimac, http://genome.sph.umich.edu/wiki/Minimac Beagle, https://faculty.washington.edu/browning/beagle/b3.html PLINK, http://pngu.mgh.harvard.edu/~purcell/plink/ LocusZoom, http://csg.sph.umich.edu/locuszoom/ References 432 1. Bulmer, M.G. (1970). The biology of twinning in man.(Oxford,: Clarendon). 433 2. Hall, J.G. (2003). Twinning. Lancet 362, 735-743. 434 435 3. Hoekstra, C., Zhao, Z.Z., Lambalk, C.B., Willemsen, G., Martin, N.G., Boomsma, D.I., and Montgomery, G.W. (2008). Dizygotic twinning. Hum Reprod Update 14, 37-47. 436 437 4. Smits, J., and Monden, C. (2011). Twinning across the Developing World. PLoS One 6, e25239. 438 439 440 5. Kulkarni, A.D., Jamieson, D.J., Jones, H.W., Jr., Kissin, D.M., Gallo, M.F., Macaluso, M., and Adashi, E.Y. (2013). Fertility treatments and multiple births in the United States. N Engl J Med 369, 2218-2225. 441 442 6. Fauser, B.C., Devroey, P., and Macklon, N.S. (2005). Multiple birth resulting from ovarian stimulation for subfertility treatment. Lancet 365, 1807-1816. 443 444 445 446 7. Chambers, G.M., Hoang, V.P., Lee, E., Hansen, M., Sullivan, E.A., Bower, C., and Chapman, M. (2014). Hospital costs of multiple-birth and singleton-birth children during the first 5 years of life and the role of assisted reproductive technology. JAMA Pediatr 168, 1045-1053. 447 448 449 450 8. van Heesch, M.M., Evers, J.L., van der Hoeven, M.A., Dumoulin, J.C., van Beijsterveldt, C.E., Bonsel, G.J., Dykgraaf, R.H., van Goudoever, J.B., Koopman-Esseboom, C., Nelen, W.L., et al. (2015). Hospital costs during the first 5 years of life for multiples compared with singletons born after IVF or ICSI. Hum Reprod 30, 1481-1490. 451 452 453 9. Hoekstra, C., Willemsen, G., van Beijsterveldt, C.E., Lambalk, C.B., Montgomery, G.W., and Boomsma, D.I. (2010). Body composition, smoking, and spontaneous dizygotic twinning. Fertil Steril 93, 885-893. 454 455 456 10. Eriksson, A.W., Bressers, W.M., Kostense, P.J., Pitkanen, K.J., Mielke, J.H., Jorde, L.B., Tas, R.F., and Fellman, J.O. (1988). Twinning rate in Scandinavia, Germany and The Netherlands during years of privation. Acta Genet Med Gemellol (Roma) 37, 277-297. 457 458 11. White, C., and Wyshak, G. (1964). Inheritance in Human Dizygotic Twinning. N Engl J Med 271, 1003-1005. 459 460 12. Moore, R.K., Erickson, G.F., and Shimasaki, S. (2004). Are BMP-15 and GDF-9 primary determinants of ovulation quota in mammals? Trends Endocrinol Metab 15, 356-361. 22 461 462 13. (2014). The common marmoset genome provides insight into primate biology and evolution. Nat Genet 46, 850-857. 463 464 465 14. Harris, R.A., Tardif, S.D., Vinar, T., Wildman, D.E., Rutherford, J.N., Rogers, J., Worley, K.C., and Aagaard, K.M. (2014). Evolutionary genetics and implications of small size and twinning in callitrichine primates. Proc Natl Acad Sci U S A 111, 1467-1472. 466 467 15. Lambalk, C.B. (2001). Is there a role for follicle-stimulating-hormone receptor in familial dizygotic twinning? Lancet 357, 735-736. 468 469 470 16. Duffy, D., Montgomery, G., Treloar, S., Birley, A., Kirk, K., Boomsma, D., Beem, L., de Geus, E., Slagboom, E., Knighton, J., et al. (2001). IBD sharing around the PPARG locus is not increased in dizygotic twins or their mothers. Nat Genet 28, 315. 471 472 473 474 17. Painter, J.N., Willemsen, G., Nyholt, D., Hoekstra, C., Duffy, D.L., Henders, A.K., Wallace, L., Healey, S., Cannon-Albright, L.A., Skolnick, M., et al. (2010). A genome wide linkage scan for dizygotic twinning in 525 families of mothers of dizygotic twins. Hum Reprod 25, 1569-1580. 475 476 477 478 18. Palmer, J.S., Zhao, Z.Z., Hoekstra, C., Hayward, N.K., Webb, P.M., Whiteman, D.C., Martin, N.G., Boomsma, D.I., Duffy, D.L., and Montgomery, G.W. (2006). Novel variants in growth differentiation factor 9 in mothers of dizygotic twins. J Clin Endocrinol Metab 91, 4713-4716. 479 480 481 482 19. Luong, H.T., Chaplin, J., McRae, A.F., Medland, S.E., Willemsen, G., Nyholt, D.R., Henders, A.K., Hoekstra, C., Duffy, D.L., Martin, N.G., et al. (2011). Variation in BMPR1B, TGFRB1 and BMPR2 and control of dizygotic twinning. Twin Res Hum Genet 14, 408-416. 483 484 20. Tong, S., Caddy, D., and Short, R.V. (1997). Use of dizygotic to monozygotic twinning ratio as a measure of fertility. Lancet 349, 843-845. 485 486 487 21. Day, F.R., Bulik-Sullivan, B., Hinds, D.A., Finucane, H.K., Murabito, J.M., Tung, J.Y., Ong, K.K., and Perry, J.R. (2015). Shared genetic aetiology of puberty timing between sexes and with health-related outcomes. Nat Commun 6, 8842. 488 489 490 22. Perry, J.R., Day, F., Elks, C.E., Sulem, P., Thompson, D.J., Ferreira, T., He, C., Chasman, D.I., Esko, T., Thorleifsson, G., et al. (2014). Parent-of-origin-specific allelic associations among 106 genomic loci for age at menarche. Nature 514, 92-97. 491 492 493 494 23. Day, F.R., Hinds, D.A., Tung, J.Y., Stolk, L., Styrkarsdottir, U., Saxena, R., Bjonnes, A., Broer, L., Dunger, D.B., Halldorsson, B.V., et al. (2015). Causal mechanisms and balancing selection inferred from genetic associations with polycystic ovary syndrome. Nat Commun 6, 8464. 495 496 497 498 24. Day, F.R., Ruth, K.S., Thompson, D.J., Lunetta, K.L., Pervjakova, N., Chasman, D.I., Stolk, L., Finucane, H.K., Sulem, P., Bulik-Sullivan, B., et al. (2015). Large-scale genomic analyses link reproductive aging to hypothalamic signaling, breast cancer susceptibility and BRCA1-mediated DNA repair. Nat Genet 47, 1294-1303. 499 500 501 502 25. Boomsma, D.I., de Geus, E.J., Vink, J.M., Stubbe, J.H., Distel, M.A., Hottenga, J.J., Posthuma, D., van Beijsterveldt, T.C., Hudziak, J.J., Bartels, M., et al. (2006). Netherlands Twin Register: from twins to twin families. Twin research and human genetics : the official journal of the International Society for Twin Studies 9, 849-857. 503 504 26. Willemsen, G., de Geus, E.J., Bartels, M., van Beijsterveldt, C.E., Brooks, A.I., Estourgievan Burk, G.F., Fugman, D.A., Hoekstra, C., Hottenga, J.J., Kluft, K., et al. (2010). 23 505 506 The Netherlands Twin Register biobank: a resource for genetic epidemiological studies. Twin Res Hum Genet 13, 231-245. 507 508 509 510 27. van Beijsterveldt, C.E., Groen-Blokhuis, M., Hottenga, J.J., Franic, S., Hudziak, J.J., Lamb, D., Huppertz, C., de Zeeuw, E., Nivard, M., Schutte, N., et al. (2013). The Young Netherlands Twin Register (YNTR): longitudinal twin and family studies in over 70,000 children. Twin Res Hum Genet 16, 252-267. 511 512 513 514 28. Willemsen, G., Vink, J.M., Abdellaoui, A., den Braber, A., van Beek, J.H., Draisma, H.H., van Dongen, J., van 't Ent, D., Geels, L.M., van Lien, R., et al. (2013). The Adult Netherlands Twin Register: twenty-five years of survey and biological data collection. Twin Res Hum Genet 16, 271-281. 515 516 517 518 29. Penninx, B.W., Beekman, A.T., Smit, J.H., Zitman, F.G., Nolen, W.A., Spinhoven, P., Cuijpers, P., De Jong, P.J., Van Marwijk, H.W., Assendelft, W.J., et al. (2008). The Netherlands Study of Depression and Anxiety (NESDA): rationale, objectives and methods. International journal of methods in psychiatric research 17, 121-140. 519 520 521 522 30. van Beijsterveldt, C.E., Hoekstra, C., Schats, R., Montgomery, G.W., Willemsen, G., and Boomsma, D.I. (2008). Mode of conception of twin pregnancies: willingness to reply to survey items and comparison of survey data to hospital records. Twin Res Hum Genet 11, 349-351. 523 524 525 526 31. Miller, M.B., Basu, S., Cunningham, J., Eskin, E., Malone, S.M., Oetting, W.S., Schork, N., Sul, J.H., Iacono, W.G., and McGue, M. (2012). The Minnesota Center for Twin and Family Research genome-wide association study. Twin Res Hum Genet 15, 767774. 527 528 529 32. Iacono, W.G., Carlson, S.R., Taylor, J., Elkins, I.J., and McGue, M. (1999). Behavioral disinhibition and the development of substance-use disorders: findings from the Minnesota Twin Family Study. Dev Psychopathol 11, 869-900. 530 531 33. Rogers, W.H. (1993). Regression standard errors in clustered samples. Stata Technical Bulletin 13, 19-23. 532 533 534 535 34. Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M.A., Bender, D., Maller, J., Sklar, P., de Bakker, P.I., Daly, M.J., et al. (2007). PLINK: a tool set for wholegenome association and population-based linkage analyses. Am J Hum Genet 81, 559575. 536 537 35. Willer, C.J., Li, Y., and Abecasis, G.R. (2010). METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190-2191. 538 539 540 541 36. Gudbjartsson, D.F., Helgason, H., Gudjonsson, S.A., Zink, F., Oddson, A., Gylfason, A., Besenbacher, S., Magnusson, G., Halldorsson, B.V., Hjartarson, E., et al. (2015). Large-scale whole-genome sequencing of the Icelandic population. Nat Genet 47, 435444. 542 543 37. Devlin, B., and Roeder, K. (1999). Genomic control for association studies. Biometrics 55, 997-1004. 544 545 38. Fisher, R.A. (1925). Statistical methods for research workers.(Edinburgh, London,: Oliver and Boyd). 546 547 548 39. Kircher, M., Witten, D.M., Jain, P., O'Roak, B.J., Cooper, G.M., and Shendure, J. (2014). A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet 46, 310-315. 24 549 550 551 40. Ward, L.D., and Kellis, M. (2012). HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res 40, D930-934. 552 553 554 41. McLaren, W., Pritchard, B., Rios, D., Chen, Y., Flicek, P., and Cunningham, F. (2010). Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 26, 2069-2070. 555 556 557 42. Li, M.X., Kwan, J.S., and Sham, P.C. (2012). HYST: a hybrid set-based test for genomewide association studies, with application to protein-protein interaction-based association analysis. Am J Hum Genet 91, 478-488. 558 559 560 43. Li, M.X., Gui, H.S., Kwan, J.S., and Sham, P.C. (2011). GATES: a rapid and powerful gene-based association test using extended Simes procedure. Am J Hum Genet 88, 283-293. 561 562 563 564 44. Vilhjalmsson, B.J., Yang, J., Finucane, H.K., Gusev, A., Lindstrom, S., Ripke, S., Genovese, G., Loh, P.R., Bhatia, G., Do, R., et al. (2015). Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores. Am J Hum Genet 97, 576-592. 565 566 45. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57-74. 567 568 569 46. Lunetta, K.L., Day, F.R., Sulem, P., Ruth, K.S., Tung, J.Y., Hinds, D.A., Esko, T., Elks, C.E., Altmaier, E., He, C., et al. (2015). Rare coding variants and X-linked loci associated with age at menarche. Nat Commun 6, 7756. 570 571 572 573 47. Stolk, L., Perry, J.R., Chasman, D.I., He, C., Mangino, M., Sulem, P., Barbalic, M., Broer, L., Byrne, E.M., Ernst, F., et al. (2012). Meta-analyses identify 13 loci associated with age at menopause and highlight DNA repair and immune pathways. Nat Genet 44, 260-268. 574 575 576 577 48. Hagen, C.P., Sorensen, K., Aksglaede, L., Mouritsen, A., Mieritz, M.G., Tinggaard, J., Wohlfart-Veje, C., Petersen, J.H., Main, K.M., Rajpert-De Meyts, E., et al. (2014). Pubertal onset in girls is strongly influenced by genetic variation affecting FSH action. Sci Rep 4, 6412. 578 579 580 49. Gilfillan, C.P., Robertson, D.M., Burger, H.G., Leoni, M.A., Hurley, V.A., and Martin, N.G. (1996). The control of ovulation in mothers of dizygotic twins. J Clin Endocrinol Metab 81, 1557-1562. 581 582 583 584 50. Lambalk, C.B., Boomsma, D.I., De Boer, L., De Koning, C.H., Schoute, E., PoppSnijders, C., and Schoemaker, J. (1998). Increased levels and pulsatility of folliclestimulating hormone in mothers of hereditary dizygotic twins. J Clin Endocrinol Metab 83, 481-486. 585 586 587 51. Van der Stroom, E.M., Konig, T.E., Vink, J.M., Boomsma, D.I., and Lambalk, C.B. (2013). Ovarian reserve and anti-Mullerian hormone (AMH) in mothers of dizygotic twins. Twin Res Hum Genet 16, 634-638. 588 589 52. Hefler, L.A., and Gregg, A.R. (2001). Influence of the angiotensinogen gene on the ovulatory capacity of mice. Fertil Steril 75, 1206-1211. 590 591 592 53. Wide, L., and Eriksson, K. (2013). Dynamic changes in glycosylation and glycan composition of serum FSH and LH during natural ovarian stimulation. Ups J Med Sci 118, 153-164. 25 593 594 595 596 54. Ruth, K.S., Campbell, P.J., Chew, S., Lim, E.M., Hadlow, N., Stuckey, B.G., Brown, S.J., Feenstra, B., Joseph, J., Surdulescu, G.L., et al. (2015). Genome-wide association study with 1000 genomes imputation identifies signals for nine sex hormone-related phenotypes. Eur J Hum Genet. 597 598 599 55. Magill, J.C., Ciccone, N.A., and Kaiser, U.B. (2013). A mathematical model of pulsecoded hormone signal responses in pituitary gonadotroph cells. Math Biosci 246, 3846. 600 601 56. Hillier, S.G. (2009). Paracrine support of ovarian stimulation. Mol Hum Reprod 15, 843850. 602 603 604 57. Laisk-Podar, T., Kaart, T., Peters, M., and Salumets, A. (2015). Genetic variants associated with female reproductive ageing - potential markers for assessing ovarian function and ovarian stimulation outcome. Reprod Biomed Online. 605 606 58. Kirkpatrick, B.W., and Morris, C.A. (2015). A Major Gene for Bovine Ovulation Rate. PLoS One 10, e0129025. 607 608 609 59. Bernard, D.J. (2004). Both SMAD2 and SMAD3 mediate activin-stimulated expression of the follicle-stimulating hormone beta subunit in mouse gonadotrope cells. Mol Endocrinol 18, 606-623. 610 611 60. Gong, X., and McGee, E.A. (2009). Smad3 is required for normal follicular folliclestimulating hormone responsiveness in the mouse. Biol Reprod 81, 730-738. 612 613 614 61. Liu, Y., Chen, X., Xue, X., Shen, C., Shi, C., Dong, J., Zhang, H., Liang, R., Li, S., and Xu, J. (2014). Effects of Smad3 on the proliferation and steroidogenesis in human ovarian luteinized granulosa cells. IUBMB Life 66, 424-437. 615 616 62. Lambalk, C.B., and Boomsma, D.I. (1998). Twinning, cancer, and genetics. Lancet 351, 909-910. 617 618 619 63. Swerdlow, A.J., De Stavola, B.L., Swanwick, M.A., and Maconochie, N.E. (1997). Risks of breast and testicular cancers in young adult twins in England and Wales: evidence on prenatal and genetic aetiology. Lancet 350, 1723-1728. 620 621 622 64. Looyenga, B.D., and Hammer, G.D. (2007). Genetic removal of Smad3 from inhibin-null mice attenuates tumor progression by uncoupling extracellular mitogenic signals from the cell cycle machinery. Mol Endocrinol 21, 2440-2457. 623 Figures titles and legends Figure 1. Manhattan Plot for the genome-wide association meta-analysis of mothers of dizygotic twins. SNPs are plotted on the x-axis according to their position on each chromosome against the significance of the association on the y-axis (shown as –log10 p 26 values). The dotted red line denotes p=5×10–8 statistical significance. The arrows point to the chromosomal region that reached genome-wide significance level. Figure 2. Forest plots depicting risk allele odds ratio (OR) estimates at (A) rs11031006 (near FSHB), (B) rs17293443 (SMAD3) and (C) rs12064669 from each study, the meta-analysis, deCODE (replication) and the overall combined results. Black squares indicate the OR and horizontal lines represent the 95% CIs. The combined results are indicated by the black diamond. Tables titles and legends Table 1. Number of cases and controls included in the meta-analysis and replication; mean maternal age at delivery Table 2. Sample characteristics for fertility measures traits Table 13. Genome-wide significant loci in the meta-analysis of mothers of spontaneous DZ twins (n = 1,980) and replication (n = 3,597) versus screened controls Table 24. Association results for the implicated DZ twinning SNPs in fertility-related measures 27 Table 1. Number of cases and controls included in the meta-analysis and replication; mean maternal age at delivery No. of subjects Cases Controls Mean maternal Cohort age at delivery (SD) Netherlands 806 4,535 29.7 (4.1) Australia 606 6,556 29.8 (4.8) Minnesota 568 1,862 28.3 (4.5) 1,980 12,953 29.3 (4.5) 3,597 (1,356 genotyped; 2,241 297,348 (76,342 genotyped; 30.3 (5.8) in silico) 221,006 in silico) Total discovery Replication Iceland Table 2. Sample characteristics for measure of fertility Traits Sample size Mean (SD) Age at menarche 259,247 -1 Age at menopause 69,360 -1 FSH levels 17,997 - Has children 41,946 0.942 Number of children 41,946 2.9 (1.6) Age at first child 41,946 23.0 (4.7) Age at last child 41,946 32.1 (5.5) Average birth interval 41,946 4.7 (2.5) 5,184 cases / 82,759 controls -1 Polycystic ovary syndrome 1 2 this is a large meta-analysis of summary statistics and sample mean is not available fraction with children; 28 Table 13. Genome-wide significant loci in the meta-analysis of mothers of spontaneous DZ twins (n=1,980) and replication (n=3,597) versus screened controls Meta-analysis SNP Locus Positiona Gene Annotation Risk Replication RAF OR (95%CI) p RAF 0.85 1.41 1.54×10-9 0.85 1.57×10-8 1.23×10-8 OR (95% CI) Combined p p 1.14 (1.06-1.22) 3×10-3 1.25×10-10 0.21 1.15 (1.07-1.23) 1.44×10-4 6.29×10-11 0.07 1.01 (0.89-1.13) 0.88 2.09×10-7 allele rs11031006 11p14.1 30226528 FSHB 5’ upstream G (1.29-1.53) rs17293443 15q22.33 67437863 SMAD3 Intron C 0.24 1.27 (1.19-1.35) rs12064669 1q42.13 230688643 Intergenic C 0.10 1.43 (1.31-1.55) a SNP position according to NCBI Human Genome Build 37; RAF, risk allele frequency; OR, odd ratio; 95% CI, 95% confidence interval 29 Table 24. Association results for the implicated DZ twinning SNPs in fertility-related measures rs11031006-G allele (near FSHB) DZ twinning and fertility-related measure Effect OR (95% CI) / p value rs17293443-C allele (in SMAD3) Effect Beta (95% CI) OR (95% CI) / p value Beta (95% CI) DZ twinninga Increase 1.41 (1.29, 1.53) 1.54×10-9 Increase 1.27 (1.19, 1.35) 1.57×10-8 FSH levels (SD units)b Increase 0.11 (0.078, 2.3×10-10 ̶ 0.016 (-0.014, 0.30 0.15) Age at menarche (years)c,d Earlier -0.04 (-0.012, 0.045) 8.5×10-10 Earlier̶ 0.011) Age at menopause (years)e Earlier -0.2165 (-0.065, ̶ 8.5×10-14 0.009 (-0.048, 0.049) 1.07 (0.98, 1.17) 0.12 ̶ 0.96 (0.89, 1.04) 0.34 Increase 0.014 (0.00091, 0.03 ̶ 0.0048 (-0.0062, 0.39 0.27)h Earlier -0.20 (-0.31, - 0.016)h ̶ 5.3×10-4 0.032 (-0.065, 0.086) Age at last child (years)f Earlier -0.097 (-0.21, ̶ 0.015 (-0.037, 0.51 0.13) 0.08 Later 0.015) Average birth interval (years)f ‒0.71 ̶ Has children (yes/no)f Age at first child (years)f 0.84 0.010) 0.052) Number of childrenf -0.001 (-0.127, 0.14 (0.043, 4.7×10-3 0.24) 0.57 ̶ 0.0051 (-0.040, 0.82 30 0.067) PCOSg Decrease 0.90 (0.84, 0.95) 0.050) 12.39×10- ̶ 1.00 (0.96,1.05) 0.78 49 MODZT GWAS; bdeCODE sample of 17,997 individuals with FSH levels; c,dLunetta dDay et all . and Perry et al. (ref 21 and 2221) and Perry et al. (ref 22); eDay et al. (ref 2424) and Stolk et al. (ref 25); fdeCODE sample of 41,946 women; gPolycystic ovary syndrome, Hayes Day et al. (ref 223); h factor of increase from log-linear model.. a 31