
Effect of natural genetic variation on enhancer selection and function

Abstract

The mechanisms by which genetic variation affects transcription regulation and phenotypes at the nucleotide level are incompletely understood. Here we use natural genetic variation as an in vivo mutagenesis screen to assess the genome-wide effects of sequence variation on lineage-determining and signal-specific transcription factor binding, epigenomics and transcriptional outcomes in primary macrophages from different mouse strains. We find substantial genetic evidence to support the concept that lineage-determining transcription factors define epigenetic and transcriptomic states by selecting enhancer-like regions in the genome in a collaborative fashion and facilitating binding of signal-dependent factors. This hierarchical model of transcription factor function suggests that limited sets of genomic data for lineage-determining transcription factors and informative histone modifications can be used for the prioritization of disease-associated regulatory variants.


Figure 1: Genetic variation affects LDTF binding.
Figure 2: Genetic variation supports the LDTF collaborative binding model.
Figure 3: Validation of predicted binding and modification patterns.
Figure 4: p65 binding is largely determined by LDTF binding.
Figure 5: Validation of strain-specific enhancer activity and causal variants.

Data deposits

Data are available in the Gene Expression Omnibus (GEO) under accession GSE46494.

Acknowledgements

We thank A. J. Lusis for providing access to eQTL data (http://systems.genetics.ucla.edu/) and for productive conversations. We thank D. Pollard for discussions and suggestions, and L. Bautista for assistance with figure preparation. These studies were supported by National Institutes of Health (NIH) grants DK091183, CA17390 and DK063491 (C.K.G.). M.U.K. was supported by the Foundation Leducq Career Development award and grants from Academy of Finland, Finnish Foundation for Cardiovascular Research and Finnish Cultural Foundation, North Savo Regional fund. C.E.R. was supported by the American Heart Association Western States Affiliates (12POST11760017) and the NIH (5T32DK007494).

Author information

Contributions

S.H., C.K.G. and C.E.R. designed the study; S.H., C.E.R., K.A.A., M.U.K. and L.D.O. performed experiments; C.E.R. performed all genetic-variation-related analysis; C.B. wrote custom code for HOMER2 and analysed data; K.A.A. and S.H. analysed data; C.E.R., S.H. and C.K.G. wrote the manuscript.

Corresponding author

Correspondence to C. K. Glass.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1 ChIP-Seq data characteristics.

a, Summary of ChIP-Seq features identified. The number of ChIP-seq regions/peaks identified in untreated primary thioglycolate-elicited macrophages is tabulated for H3K4me2, H3K27ac, PU.1 and C/EBPα. Peaks for p65 were quantified in macrophages treated with 100 ng ml−1 KLA for 1 h. Unless otherwise noted, modification and binding were considered strain-specific at ≥fourfold difference between strains in sequenced tags, and the FDR was <1 × 10−14 based on Poisson cumulative distribution testing and Benjamini and Hochberg correction. b–e, Reproducibility and strain-specific binding. Two separate pools of thioglycolate-elicited macrophages from mice from each strain (C57BL/6J and BALB/cJ) were treated with KLA for 1 h. ChIP-seq for p65 was performed separately on each pool (see Methods). The number of normalized sequencing tags at the union of peaks identified in the indicated experiments is shown. Peaks highlighted in red were deemed experiment-specific using criteria applied throughout this study (fourfold, and FDR < 1 × 10−14 from the cumulative Poisson distribution and Benjamini and Hochberg FDR estimation). The number of experiment-specific peaks is indicated (red) relative to the total number of peaks (black). f, Comparison of the p65 log2 peak tag ratio between strains and experimental sets for all peaks (black), highlighting experiment-specific peaks (red) identified in either d or e. g, Heat map showing pairwise correlation between all p65 experiments. Pearson correlation coefficients are given for each comparison.
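The strain-specificity criterion described above (at least a fourfold tag difference plus a Poisson-based, Benjamini and Hochberg-corrected FDR below 1 × 10−14) can be sketched roughly as follows. This is a minimal illustration, not the authors' HOMER code: the function names, the one-tag pseudocount and the choice to treat the lower-count strain as the expected Poisson rate are all assumptions made for the example.

```python
import math

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam); intended for k >= lam > 0.

    Terms are summed upward from k so that very small tail
    probabilities are not lost to 1 - CDF round-off.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total, i = 0.0, k
    while True:
        total += term
        i += 1
        term *= lam / i
        if i > lam and term <= total * 1e-17:
            return min(total, 1.0)

def benjamini_hochberg(pvalues):
    """Return Benjamini and Hochberg adjusted p-values in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def strain_specific_peaks(peaks, min_fold=4.0, max_fdr=1e-14, pseudocount=1.0):
    """peaks: list of (peak_id, tags_strain_a, tags_strain_b), normalized tags.

    Assumption: the strain with fewer tags supplies the expected rate,
    floored at `pseudocount` to avoid a zero rate.
    """
    folds, pvalues = [], []
    for _, tags_a, tags_b in peaks:
        high, low = max(tags_a, tags_b), min(tags_a, tags_b)
        expected = max(low, pseudocount)
        folds.append(high / expected)
        pvalues.append(poisson_upper_tail(int(round(high)), expected))
    qvalues = benjamini_hochberg(pvalues)
    return [peak[0] for peak, fold, q in zip(peaks, folds, qvalues)
            if fold >= min_fold and q < max_fdr]

# Hypothetical tag counts, for illustration only.
example_peaks = [("peak1", 200, 10), ("peak2", 50, 45), ("peak3", 12, 3)]
specific = strain_specific_peaks(example_peaks)
```

In this toy input only the first peak passes both filters; the third reaches fourfold but has far too few tags to reach the stringent FDR cut-off, which shows why the two criteria are applied together.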

Extended Data Figure 2 Strain-specific LDTF binding correlates with variant density and location in LDTF motifs but not with genomic context.

a, Genomic features do not distinguish between strain-similar and strain-specific LDTF binding. Peaks were restricted to promoter-distal peaks (>3 kb to gene start sites). Genomic features (distance to nearest gene, distance to nearest repeat, CpG content and conservation score) were compared among three pairs of strain-similarly bound and strain-specifically bound PU.1 and/or C/EBPα loci (listed as groups 1–6). Box midlines are medians, boundaries are first and third quartiles. Whiskers extend to the extreme data points. CpG content and conservation were quantified in 1-kb regions centred on the LDTF peak. P values from two-sided t-test are given if below 0.05. b, Strain-specific C/EBPα binding occurs in regions with increased variant density. ChIP-Seq tag counts in 200-bp peak regions were stratified into five bins according to log2 ratios of peak tag counts in BALB/cJ versus C57BL/6J mice (x axis, log2 ratio), and the variant density distributions are shown per bin. c, d, Variant density distribution in strain-specific peaks. Mean variant densities within 10-bp bins relative to ChIP-Seq peak centres in strain-similar (red) or strain-specific (blue) peaks. e, Strain-specific PU.1 binding correlates with mutations in PU.1 motifs. PU.1 motif mutations were quantified in PU.1-bound regions and plotted against the logarithmic ratio of PU.1 peak tag counts in each strain (binding ratio) (x axis). The frequency of motifs that were mutated in BALB/cJ are plotted in red and those mutated in C57BL/6J in blue. f, The analogous relationship as shown in e for PU.1 is plotted for C/EBP motif mutations versus C/EBPα strain binding ratio.

Extended Data Figure 3 Strain-specific PU.1 and C/EBPα binding correlates with strain-specific LDTF motifs.

a, Top and degenerate motifs enriched in H3K4me2 and PU.1 or C/EBPα ChIP-Seq peaks. b, NF-κB consensus and degenerate motifs enriched in p65 ChIP-Seq peaks. These motifs were used to query individual genome sequences and identify strain-specific motifs in subsequent analysis. Degenerate ‘weak’ motif occurrence numbers for a given factor include ChIP-Seq peaks containing ‘strong’ motifs. Position weight matrices and log-odds score thresholds for each motif are given in Supplementary Table 1. c, d, Mutations in LDTF motifs affect PU.1 (c) and C/EBPα (d) binding. Left panels show scatterplots for the ChIP-Seq-defined binding of PU.1 (c) and C/EBPα (d) between C57BL/6J (x axes) and BALB/cJ (y axes). Strain-specific motifs were queried within 100 bp of each peak position. Red symbols designate binding events at loci where a polymorphism mutated a C/EBP, PU.1 or AP-1 motif in the C57BL/6J genome, whereas the motif was intact in the BALB/cJ genome. Blue points highlight mutations in these motifs in the BALB/cJ genome only. Violin plots in the right panels show the effect of each motif mutation (along x axes: PU.1, C/EBP, AP-1 and NF-κB) on the ratio of PU.1 (c) and C/EBPα (d) binding between mouse strains (y axes: positive values denote BALB/cJ-specific, negative values denote C57BL/6J-specific). Tag ratio distributions for peaks overlapping C57BL/6J motif mutations are on the left (light colours), those for peaks overlapping BALB/cJ motif mutations are on the right (dark colours). The fold-difference between mean binding ratios is indicated under the pair of distributions for each motif. The grey distribution indicates PU.1- or C/EBPα-bound loci not overlapping strain-specific motifs.

Extended Data Figure 4 Effects of cognate motif distance from peak centre, variant position within a motif and the presence of alternative motifs on strain-differential binding of PU.1 and C/EBPα.

a–d, PU.1 and C/EBP motif mutations near the experimentally derived peak centre are associated with impaired binding. a, c, The ratios of the frequencies of variant-containing motifs at the given distances from strain-differentially versus strain-similarly bound peak centres (>twofold versus <twofold tag count ratio) for 570 PU.1 (a) and 278 C/EBP (c) variant-containing motifs are shown, respectively. b, d, The distribution of absolute strain peak tag count ratios of peaks whose centre is at the given distances from mutated PU.1 (b) or C/EBP (d) motifs. Box midlines are medians, and boundaries are first and third quartiles. Whiskers extend to the extreme data points. P values are from two-sided t-test. e, f, Effects of alternative PU.1 and C/EBP motifs and core mutations on binding. The number of non-mutated ‘alternative’ PU.1 or C/EBP motifs in the strain with a PU.1 or C/EBP motif mutation was counted, and the absolute respective PU.1 or C/EBPα log2 strain binding ratio is shown. g, Defining the C/EBP motif core by comparing differential versus similar C/EBPα binding. Sequence variants within C/EBP motifs located in loci devoid of alternative C/EBP motifs (n = 178) were counted according to whether they were in differential (blue) or similar (red) C/EBPα-bound peaks. h, The distribution of PU.1 binding strain log2 ratios (x axis) is shown for PU.1 mutations located in the PU.1 core and non-core nucleotides (defined in Fig. 1g). i, The C/EBPα binding strain log2 ratio is shown for C/EBP core and non-core mutations as defined in g. j, k, Motif mutations predominantly occur at differentially bound loci. The odds ratios (x axis; equation shown in box) describing the relative effect of the indicated characteristics of mutated motifs on differential binding relative to similar binding are shown for PU.1 (j) and C/EBPα (k). Whiskers show 95% confidence intervals. nt, nucleotides. l, m, The percentage of respective motif mutations consistent with altered PU.1 (l) and C/EBPα (m) binding is shown for the indicated categories of motif mutations.
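The odds ratios and 95% confidence intervals in panels j and k can be illustrated with a standard 2 × 2 calculation. The exact equation is shown in a figure box that is not reproduced in this legend, so the sketch below assumes the conventional odds ratio with a Wald interval on the log scale; the function name and the counts are hypothetical.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2 x 2 table.

    a: motifs with the characteristic, in differentially bound peaks
    b: motifs with the characteristic, in similarly bound peaks
    c: motifs without the characteristic, in differentially bound peaks
    d: motifs without the characteristic, in similarly bound peaks
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from the paper.
ratio, ci_low, ci_high = odds_ratio_with_ci(30, 10, 20, 40)
```

An interval that excludes 1 suggests that the characteristic (for example, a mutation in the motif core) is associated with differential rather than similar binding.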

Extended Data Figure 5 Analysis pipeline for predicting functional PU.1 mutations in NOD.

Data are shown in Fig. 1h.

Extended Data Figure 6 LDTF motif mutations are enriched at strain-specific C/EBPα-bound loci relative to strain-similar loci.

a, The log2 odds ratio for observing a C57BL/6J-specific versus BALB/cJ-specific mutation in the indicated three bins of C/EBPα binding ratios: similar (middle bin), or strain-specifically C/EBPα bound (left and right bins). Details are in the Methods. b, Collaborative binding is largely not mediated by direct protein–protein interactions. A total of 14,199 loci bound by PU.1 and C/EBPα were centred on the PU.1 weak motif (0 on x axes) and cumulative instances of C/EBP and AP-1 motifs were plotted at each position relative to the central PU.1 motif. Interferon response factor (IRF) half-sites are plotted as control for a factor that requires direct protein–protein interactions with PU.1 for DNA binding. The motifs in each comparison showing overlapping sequence and base pair distances are indicated to the right. Peak distances from the central PU.1 motif are indicated in the histograms. RC denotes reverse complement. c, Allele-specific C/EBPα binding in F1 heterozygotes is similar to binding in homozygous parental strains. C/EBPα ChIP-seq reads from CB6F1/J hybrid F1 macrophages were mapped with no mismatches to both parental genome sequences to identify allele-specific reads. C/EBPα log2 peak tag ratios between the parental strains (BALB/cJ versus C57BL/6J) are shown on the x axis, and the log2 ratio of allele-specific reads in the F1 hybrids are shown on the y axis (BALB/cJ allele versus C57BL/6J allele). C57BL/6J-specific C/EBPα regions are blue, BALB/cJ-specific C/EBPα regions are red, and strain-similar C/EBPα regions are black. Strain-specific or similar regions were defined from parental data.

Extended Data Figure 7 Strain-specific epigenetic marks correlate with LDTF binding, and LDTF mutations segregate with altered H3K4me2 deposition.

a–f, Strain-specificity of LDTF binding and epigenetic marks. The relative amount of H3K4me2 (a–c) and H3K27ac (d–f) between C57BL/6J and BALB/cJ (x axes) is highly correlated with the amount of bound PU.1, C/EBPα or product (PU.1 × C/EBPα). The log2 ratios of the peak tag counts for PU.1, C/EBPα and PU.1 × C/EBPα in each strain are shown relative to the log2 of the peak tag count ratios for H3K4me2 or H3K27ac. Loci containing strain-specific LDTF motifs in a differentially PU.1- or C/EBPα-bound peak are highlighted. Correlation coefficients (Pearson) are indicated for each comparison. g, LDTF mutations segregate with altered H3K4me2 deposition. The log2 of the ratio of the product of the normalized peak tag counts for PU.1 and C/EBPα in 200 bp in each strain (x axis) is compared to the log2 H3K4me2 peak tag ratio in 1 kb (y axis) for loci containing at least a PU.1 or C/EBPα peak. Strain-specific LDTF motif mutations are indicated by the designated symbols and coloured by the mutated strain (C57BL/6J red, BALB/cJ blue). The distribution of H3K4me2 strain ratios stratified by corresponding LDTF strain mutations is shown to the right, with P value from a two-sided t-test. h, Relationships between H3K27ac patterns in different cell types. ES, embryonic stem. Hierarchical clustering of H3K27ac-positive regions as determined by ChIP-Seq and analysis with HOMER. The number of ChIP-seq tags in each of the 86,264 H3K27ac-marked regions used for comparison with eQTL data in Fig. 2e that were detected in at least one cell type was clustered using Euclidean distance.

Extended Data Figure 8 LDTFs prime the p65 cistrome.

a, The 69,517 regions that gained p65 in C57BL/6J after KLA treatment were analysed for binding of PU.1 and C/EBPα with and without KLA treatment as shown in the pie charts. Loci not bound by PU.1 or C/EBPα after KLA treatment were analysed by de novo motif finding. The most enriched motif was AP-1, and the second-most enriched motif was NF-κB. b, Violin plots of the p65 strain ratios of mean-normalized p65 binding for p65-bound peaks stratified by motif mutations present in either BALB/cJ or C57BL/6J. Mutated motifs included PU.1 (strong and weak), C/EBP (strong and weak), C/EBP:AP-1 heterodimers, AP-1 and NF-κB. The effect on p65 binding per group is shown by comparing the mean-normalized p65 tag binding ratio along the y axis (log2(BALB/cJ–C57BL/6J); positive values denote BALB/cJ-specific, negative values denote C57BL/6J-specific). White circles indicate the distribution means, and the average fold change associated with C57BL/6J-mutating and BALB/cJ-mutating SNPs in the respective motifs is given beneath. One-sided t-test P values between each pair of distributions ranged from 1 × 10−29 to 1 × 10−14. c, Variant density in strain-specific and strain-similar p65 peaks. Mean variant density within 10-bp bins relative to p65 ChIP-Seq peak centres in strain-similar (red) or strain-specific peaks (blue). d–f, The variant density distribution in strain-specific p65 peaks is broader than those for PU.1 or C/EBPα. Fold enrichment of variant densities in strain-specific relative to strain-similar peaks (y axes) for PU.1 (d), C/EBPα (e) and p65 (f) is shown relative to the peak centres (x axes). Ratios plotted in d and e are from data in Extended Data Fig. 2c and d, respectively.

Extended Data Figure 9 Validation of strain-specific enhancer activity.

a, Enhancer activity in transient reporter assays correlates with strain-specific LDTF and p65 binding. Luciferase assay results for 24 loci (20 strain-specific enhancers with strain-specific motifs, 1 positive control with strain-similar enhancer activity (row 7, column 3), 2 negative controls lacking enhancer activity in both strains (row 8, columns 1 and 2), and 1 strain-specific enhancer lacking a strain-specific motif (row 8, column 3)) in transiently transfected RAW264.7 cells 48 h after transfection. Each 1-kb locus is represented by the horizontal midline within a box (see Fig. 5). ChIP-seq tag pile-ups are shown for PU.1 (green), C/EBPα (blue), p65 (red), H3K27ac (purple) and H3K4me2 (orange) for C57BL/6J (above midline) and BALB/cJ (below midline) with identical scales. Binding/modification data are shown after treatment with 100 ng ml−1 KLA. Vertical black lines indicate SNP locations. Horizontal bars indicate average luciferase (enhancer) activity of the empty vector (blue, no enhancer) and of a locus cloned from either strain (C57BL/6J above, BALB/cJ below) under basal conditions (grey) or after overnight stimulation with 100 ng ml−1 KLA (pink). Luciferase values from transiently transfected cells were normalized to the activity measured for a co-transfected UB6 promoter-β-galactosidase reporter construct. Empty vector values were scaled to 0.5 for the first four loci, and to 1 for the remaining loci. Constructs in which the predicted motif-disrupting variant alleles were swapped are denoted by ‘M’, with mutations causing a significant effect in at least two out of three replicates being denoted by an additional asterisk (P < 0.05, one-sided t-test). Error bars show s.d. from three biological replicates, average values are indicated next to each bar. Experiments were replicated at least three times. Significant strain-specific enhancer activity is indicated by a dagger (grey without treatment, red after KLA treatment, one-tailed t-test, P < 0.05). b, Chromatinization is necessary for the strain specificity of a subset of enhancers. RAW264.7 cells were stably transfected with the two constructs containing the loci that showed strain-specific binding but lacked strain-specific enhancer activity in transient reporter assays (row 4, column 1 and row 1, column 3, marked by an asterisk). Luciferase activity measured in lysates of stably transfected cells was normalized to total protein content. RLU, relative light units.

Extended Data Figure 10 Motif analysis identifies causal SNPs in enhancers.

a, Regions of ∼1 kb size centred on PU.1 or C/EBPα ChIP-Seq peaks of similar tag count in C57BL/6J and BALB/cJ (t-test, P < 0.05) are marked with an asterisk. Strain and motif mutated by a variant are indicated below, denoted by the ‘m’ prefix. In the table, plus signs indicate whether a tested enhancer contains an alternative motif for the same factor, a variant at a motif position that is not located at a motif core as defined in Fig. 1g and Extended Data Fig. 4g, or a variant in a motif located less than 20 bp away from the peak centre. Characteristics of the loci and primer sequences are in Supplementary Table 3. b, Identifying causal variants by motif analysis. Left panels show the ChIP-Seq pile-ups and SNP locations as in Extended Data Fig. 9. Right panels plot the relative enhancer reporter luciferase activities of the loci shown on the left, either in the wild-type configuration or when swapping the SNP indicated by a black triangle by site-directed mutagenesis. Motifs mutated by the indicated SNPs are shown above, with the mutation underlined and in red. c, To confirm that the centrally located PU.1 motif is essential for the C57BL/6J-specific activity, a 1-kb fragment of the locus from C57BL/6J or BALB/cJ was cloned into the luciferase reporter as described in Fig. 5 and the effects of swapping alleles at the predicted causal PU.1 SNP and flanking control 5′ and 3′ SNPs on enhancer activity are shown. Swapping alleles at the PU.1 SNP reversed strain-specific enhancer activity, whereas swapping alleles at either flanking SNP had no significant effect.

Supplementary information

Supplementary Table 1 - HOMER-formatted motif files for the motifs used for strain-specific motif finding listed in Extended Data Figure 3a,b

The header rows, which begin with ">", list the consensus motif, the motif name, and the log-odds threshold above which a given sequence is considered to be positive for the motif. Below each header is the position weight matrix that lists the frequency of each nucleotide (A, C, G, T in the columns from left to right, respectively) at each position (rows) of the motif from top to bottom. (XLSX 41 kb)
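The motif file layout described above (a ">" header carrying consensus, name and log-odds threshold, followed by one A, C, G, T frequency row per position) can be read and applied with a short script. The sketch below is illustrative only: the toy four-position matrix, its threshold and the two allele sequences are hypothetical and are not taken from Supplementary Table 1, and the scoring assumes a uniform background of 0.25 per nucleotide with frequencies floored at 0.001.

```python
import math

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def parse_homer_motif(text):
    """Parse one motif: tab-separated '>' header, then A C G T frequency rows."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    fields = lines[0].lstrip(">").split("\t")
    matrix = [[float(x) for x in ln.split()] for ln in lines[1:]]
    return {"consensus": fields[0], "name": fields[1],
            "threshold": float(fields[2]), "matrix": matrix}

def log_odds(site, matrix, background=0.25, floor=0.001):
    """Sum of ln(frequency / background) over the positions of the site."""
    score = 0.0
    for base, row in zip(site, matrix):
        freq = max(row["ACGT".index(base)], floor)
        score += math.log(freq / background)
    return score

def find_motif(sequence, motif):
    """Return (start, strand, score) for windows scoring at or above threshold.

    Assumes an upper-case sequence containing only A, C, G and T.
    """
    width = len(motif["matrix"])
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        reverse = window.translate(COMPLEMENT)[::-1]
        for strand, site in (("+", window), ("-", reverse)):
            score = log_odds(site, motif["matrix"])
            if score >= motif["threshold"]:
                hits.append((start, strand, round(score, 3)))
    return hits

# Toy motif for illustration only (hypothetical values).
toy_text = (">GGAA\ttoy_ets_core\t5.0\n"
            "0.001\t0.001\t0.997\t0.001\n"
            "0.001\t0.001\t0.997\t0.001\n"
            "0.997\t0.001\t0.001\t0.001\n"
            "0.997\t0.001\t0.001\t0.001\n")
toy_motif = parse_homer_motif(toy_text)
hits = find_motif("TTGGAATTCC", toy_motif)

# Hypothetical alleles of one locus differing by a single nucleotide.
allele_one, allele_two = "TTGGAATTAC", "TTGGCATTAC"
motif_is_strain_specific = (bool(find_motif(allele_one, toy_motif))
                            != bool(find_motif(allele_two, toy_motif)))
```

Scanning the same locus in each strain's genome sequence and comparing the hits is one simple way to flag a motif as strain-specific: here the single-nucleotide change drops the second allele below the threshold.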

Supplementary Table 2 - Strain-specific PU.1-bound loci where NOD broke the C57BL6/BALBc haplotypes

Loci are shown in rows. The number of variants at each region is shown between C57BL/6J and BALB/cJ in column 4. The number of variants with alleles matching the binding pattern observed across NOD, C57BL/6J, and BALB/cJ are shown in column 5. (XLSX 44 kb)

Supplementary Table 3 - Strain-similar loci cloned for luciferase reporter assays

The genomic location, variant information, strain-specific motif information, and primer sequences used to clone strain-similar loci are shown in columns for the 9 loci tested (data in Extended Data Figure 10a). (XLSX 36 kb)

About this article

Cite this article

Heinz, S., Romanoski, C., Benner, C. et al. Effect of natural genetic variation on enhancer selection and function. Nature 503, 487–492 (2013). https://doi.org/10.1038/nature12615

  • DOI: https://doi.org/10.1038/nature12615
