Abstract
Precise control of protein synthesis by engineering sequence elements in 5′ untranslated regions (5′ UTRs) remains a fundamental challenge. To accelerate our understanding of the cis-regulatory code embedded in 5′ UTRs, we devised massively parallel reporter assays from a synthetic messenger RNA library composed of over one million 5′ UTR variants. A completely randomized 10-nucleotide sequence preceding an upstream open reading frame (uORF) and downstream GFP drives a broad range of translational outputs and mRNA stability in mammalian cells. While efficient translation protects mRNA from degradation, uORF translation triggers mRNA decay in a UPF1-dependent manner. We also identified translational inhibitory elements with G-quadruplexes as marks for mRNA decay in P-bodies. Unexpectedly, an unstructured A-rich element in 5′ UTRs destabilizes mRNAs in the absence of translation, although it enables cap-independent translation. Our results not only identify diverse sequence features of 5′ UTRs that control mRNA translatability, but they also reveal ribosome-dependent and ribosome-independent mRNA-surveillance pathways.
Data availability
Deep sequencing data that support the findings of this study have been deposited in the GEO database under accession code GSE145046. Source data are provided with this paper.
Code availability
All procedures except the cleaning of sequencing reads were carried out using custom Perl and R scripts, which are available upon reasonable request. Source data are provided with this paper.
Acknowledgements
We thank the Grimson lab for providing us with several shRNAs targeting mammalian decay factors. We are grateful to Cornell University Life Sciences Core Laboratory Center for sequencing, FACS, and confocal microscope support. This work was supported by US National Institutes of Health (R01GM1222814 and R21CA227917) and HHMI Faculty Scholar (55108556) to S.-B.Q.
Author information
Contributions
S.-B.Q. conceived the project and designed the experiments. L.J. performed the majority of experiments and Y.M. conducted the majority of data analysis. Q.J. contributed to the PABP1 knockdown experiments. D.D. and J.W.Y. helped with 25D1 reagents and HEK293-Kb cells. S.-B.Q. wrote the manuscript with comments from L.J. and Y.M. All authors discussed the results and edited the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Peer reviewer reports are available. Anke Sparmann was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Characterization of uORF reporters.
a, Basic design of the uORF reporter with the SIINFEKL sequence highlighted. Five additional amino acids (LEQLE) are present, permitting processing of SIINFEKL from the same flanking amino acids regardless of the TIS sequence. b, Bar graphs show the 25D1 or GFP mean fluorescence of HEK293-Kb cells transfected with plasmids of uORF reporters. n = 2 biological replicates. Error bars indicate SEM. c, Representative flow cytometry scatterplots of HEK293-Kb cells transfected with synthetic mRNA reporters with optimal or suboptimal AUG codons. d, A bar graph shows the ratio of 25D1/GFP in HEK293-Kb cells transfected with synthetic mRNA or plasmid DNA. Error bars, mean ± s.e.m.; n = 3 biological replicates.
Extended Data Fig. 2 Establishing massively parallel uORF reporters.
a, Schematic of generating a library of RNA-based uORF reporters by PCR amplification using primers composed of random 10-nt sequences upstream of the uORF. Pooled PCR products were utilized as templates for in vitro RNA synthesis followed by 5′ capping and 3′ polyadenylation. b, Comparison of sequence randomness for nucleotide oligos synthesized by different vendors. c, A histogram shows the distribution of read counts on individual unique random sequences. d, A scatter plot shows the correlation of read counts between two biological replicates of original oligo sequences (Rho = 0.79, P < 2.2 × 10⁻¹⁶).
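The replicate comparison in panel d can be illustrated with a rank correlation of per-variant read counts. The following is a minimal sketch, assuming that "Rho" denotes Spearman's rank correlation and that reads have already been trimmed to the 10-nt random region; the function names are hypothetical and this is not the authors' Perl or R code.

```python
from collections import Counter
from math import sqrt


def count_variants(reads):
    """Count each 10-nt variant; reads with other lengths or letters are skipped."""
    return Counter(r for r in reads if len(r) == 10 and set(r) <= set("ACGT"))


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)  # undefined (division by zero) if either input is constant


def replicate_rho(counts_a, counts_b):
    """Correlate read counts over the variants detected in both replicates."""
    shared = sorted(set(counts_a) & set(counts_b))
    return spearman_rho([counts_a[s] for s in shared],
                        [counts_b[s] for s in shared])
```

Restricting the comparison to variants seen in both replicates is a design choice of this sketch; treating missing variants as zero counts would give a different value.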
Extended Data Fig. 3 Characterizing massively parallel uORF reporters using flow cytometry.
a, Representative flow cytometry scatterplots of HEK293-Kb cells transfected with the GFP mRNA reporter or pooled uORF reporters. Relative GFP and 25D1 fluorescence intensity between GFP and uORF-GFP reporters are shown in histograms as well as bar graphs. Error bars, mean ± s.e.m. n = 3 biological replicates. b, HEK293-Kb cells were transfected with mRNA reporters followed by FACS sorting into GFPH and 25D1H populations. A bar plot (top) shows the ratio of triplet frequency within the random sequences enriched in the 25D1H population over the GFPH population. Only the top 10% of sequence variants ranked in the 25D1H and GFPH populations are used. The original frequency of triplets in different populations is shown as a heat map (bottom). c, Correlation of triplet frequencies within the sequence variants enriched in 25D1H or GFPH populations. All points are color-encoded based on the similarity to ATG.
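The triplet-frequency ratio described in panel b can be sketched as follows. This is an illustrative example only: it assumes that overlapping triplets are counted across each 10-nt sequence and adds a pseudocount to avoid division by zero. Neither choice is stated in the caption, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

# All 64 DNA triplets, in a fixed order.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]


def triplet_frequencies(seqs, pseudocount=1.0):
    """Relative frequency of each overlapping triplet across a set of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - 2):
            counts[s[i:i + 3]] += 1
    total = sum(counts[t] for t in TRIPLETS) + pseudocount * len(TRIPLETS)
    return {t: (counts[t] + pseudocount) / total for t in TRIPLETS}


def triplet_enrichment(seqs_a, seqs_b, pseudocount=1.0):
    """Ratio of triplet frequencies in population A over population B."""
    fa = triplet_frequencies(seqs_a, pseudocount)
    fb = triplet_frequencies(seqs_b, pseudocount)
    return {t: fa[t] / fb[t] for t in TRIPLETS}
```

With the top-ranked 25D1H variants as `seqs_a` and the top-ranked GFPH variants as `seqs_b`, a ratio above 1 marks a triplet that is more frequent in the 25D1H population.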
Extended Data Fig. 4 Identification of TIS sequence features in uORF reporters enriched in ribosome fractions.
a, A total of 14,381 sequences with all permutations of NNNNAUGNNN are ranked based on the ratio of frequency between monosome and polysome. Both the top and bottom hits are highlighted. b, Sequence logo of 10-nt random sequences with high (top) or low (bottom) M/P ratio. Note that the high M/P sequence is consistent with the Kozak consensus sequence. c, A violin plot shows the ratio of relative frequencies of inserted sequences in monosome over polysome when the ATG triplet is followed by different nucleotides. d, Representative flow cytometry scatterplots of HEK293-Kb cells transfected with mRNA reporters with sequence variants chosen from monosome (M1–M4) or polysome (P1–P4).
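The ranking in panel a can be sketched as a ratio of relative frequencies. A sequence of the form NNNNAUGNNN has seven free positions, so at most 4^7 = 16,384 such variants exist, of which the caption reports 14,381. The example below assumes read counts keyed by DNA-alphabet sequence (ATG rather than AUG) and a hypothetical `min_reads` filter; it is an illustration, not the authors' pipeline.

```python
import re

# Four random bases, a fixed ATG, then three random bases.
AUG_CONTEXT = re.compile(r"^[ACGT]{4}ATG[ACGT]{3}$")


def relative_frequencies(counts):
    """Convert raw read counts to fractions of the fraction's total reads."""
    total = sum(counts.values())
    return {seq: c / total for seq, c in counts.items()}


def rank_by_mp_ratio(mono_counts, poly_counts, min_reads=1):
    """Rank NNNNATGNNN variants by monosome/polysome relative frequency, highest first."""
    mono = relative_frequencies(mono_counts)
    poly = relative_frequencies(poly_counts)
    ratios = {}
    for seq in mono_counts.keys() & poly_counts.keys():
        if not AUG_CONTEXT.match(seq):
            continue
        if mono_counts[seq] < min_reads or poly_counts[seq] < min_reads:
            continue
        ratios[seq] = mono[seq] / poly[seq]
    return sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)
```

Normalizing each fraction by its own read total before taking the ratio keeps differences in sequencing depth between the monosome and polysome libraries from dominating the ranking.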
Extended Data Fig. 5 uORF translation triggers mRNA decay in a ribosome-dependent manner.
a, A scatter plot shows the correlation of triplet frequencies enriched in polysome and GFPH population from HEK293-Kb cells transfected with mRNA reporters. All points are color-encoded based on the similarity to ATG. b, A scatter plot shows the correlation of in vivo and in vitro half-lives of mRNA reporters. Half-life values were centered to medians. c, A boxplot shows a positive correlation between GFP intensities and half-lives of mRNA reporters. All random sequences were divided into five groups based on GFP intensity measured by flow cytometry. d, HEK293-Kb cells were transfected with DNA plasmids with or without optimal ATG codons (left) or representative hits from M and P fractions (right), followed by RT-qPCR at indicated time points (n = 3 biological replicates; t test). Error bars indicate SEM. ** P < 0.01; * P < 0.05. e, Representative flow cytometry scatterplots of HEK293-Kb cells transfected with mRNA reporters capped with ApppG with sequence variants chosen from monosome (M1) or polysome (P1).
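One common way to turn a decay time course into a half-life, and to center half-lives as in panel b, is sketched below. It assumes first-order decay, so that the fraction remaining is exp(-k·t) and the half-life is ln(2)/k, and it assumes that "centered to medians" means subtracting the median. The caption states neither, so both are assumptions of this example, and the function names are hypothetical.

```python
from math import log
from statistics import median


def half_life(times, fractions_remaining):
    """Fit ln(fraction) = intercept - k * t by least squares and return ln(2) / k."""
    ys = [log(f) for f in fractions_remaining]
    n = len(times)
    mean_t, mean_y = sum(times) / n, sum(ys) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
             / sum((t - mean_t) ** 2 for t in times))
    return log(2) / -slope  # slope is negative for a decaying transcript


def center_to_median(values):
    """Subtract the median so that two data sets share a common zero point."""
    m = median(values)
    return [v - m for v in values]
```

Centering in vivo and in vitro half-lives separately puts the two measurements on a comparable scale before they are plotted against each other.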
Extended Data Fig. 6 Identification of inhibitory elements in 5′ UTR that impair translation.
a, Flow cytometry scatterplots of HEK293-Kb cells transfected with mRNA reporters enriched in the ribosome-free fractions (N1–N8). b, HEK293-Kb cells were transfected with mRNA reporters or plasmids with sequence variants chosen from the ribosome-free fractions (N1–N8), followed by RT-qPCR at indicated time points (n = 3 biological replicates; t test). Error bars indicate SEM. ** P < 0.01. c, Flow cytometry scatterplots of HEK293-Kb cells transfected with mRNA reporters enriched in the ribosome-free fractions (N1–N8) and stained with QUMA-1.
Extended Data Fig. 7 RG4 in 5′ UTR triggers mRNA decay in the P-body.
a, HEK293-Kb cells with specific decay factor knockdown were transfected with the N1 mRNA reporter, followed by RT-qPCR at indicated time points (n = 3 biological replicates; t test). Error bars indicate SEM. * P < 0.05. b, Representative confocal images of HEK293-Kb cells transfected with P1 or N1 mRNA reporters. The mRNA reporters were synthesized in the presence of Alexa Fluor-UTP and the transfected cells were co-stained with a DCP2 antibody. DNA was counter-stained with Hoechst 33342. Arrowheads indicate typical mRNA foci. Bar, 10 μm. Images are representative of at least 50 cells.
Extended Data Fig. 8 RG4 in 5′ UTR derived from NRAS triggers mRNA decay.
a, The left panel shows the schematic of the mRNA reporter with a 5′ UTR derived from NRAS with (WT) or without (MT) RG4. The right panel shows the decay of mRNA reporters in transfected HEK293-Kb cells (n = 3 biological replicates; t test). Error bars indicate SEM. ** P < 0.01. b, The stability of endogenous NRAS was measured in HEK293-Kb cells with or without DHX36 knockdown (n = 3 biological replicates; t test). Error bars indicate SEM. * P < 0.05.
Extended Data Fig. 9 An A-rich element in 5′ UTR enables cap-independent translation.
a, A scatter plot shows the distribution of quadruplets between ribosome-free and ribosome-bound fractions. Each point represents a 4-mer sequence, and all points are color-encoded based on the ratio of A/C. b, Representative flow cytometry of HEK293-Kb cells transfected with 10C, 10A, 5A, or 4A mRNA reporters capped with non-functional ApppG. c, Representative flow cytometry of HEK293-Kb cells transfected with mRNA reporters containing AAAAAUGAAA capped with m7G or ApppG.
Extended Data Fig. 10 An A-rich element in 5′ UTR promotes translation-independent decay.
a, HEK293-Kb cells were transfected with 10A or 10C mRNA reporters capped with m7G (right) or ApppG (left), followed by RT-qPCR at indicated time points (n = 3 biological replicates; t test). Error bars indicate SEM. b, The left panel shows the distribution of in vitro half-lives of mRNA reporters. The most stable (top 10%) sequences are highlighted in red, and the most unstable sequences are highlighted in light blue. The right panel shows the heat map of base frequency at different positions of random sequences. c, A violin plot shows the half-lives of mRNA groups with different numbers of consecutive As in random sequences. d, A heat map shows the effect of A-cluster length and position on the in vitro half-life of mRNA reporters. e, The in vitro decay of mRNA reporters (10A, 5A, and 4A) in the lysates of HEK293-Kb cells was determined by RT-qPCR at indicated time points (n = 3 biological replicates; t test). Error bars indicate SEM. f, The in vitro decay of 10A mRNA reporters in the lysates of HEK293-Kb cells with or without UPF1 knockdown was determined by RT-qPCR at indicated time points (left). For the in vivo stability, HEK293-Kb cells with or without UPF1 knockdown were transfected with 10A mRNA reporters followed by RT-qPCR at indicated time points (right) (n = 3 biological replicates; t test). Error bars indicate SEM. g, The in vitro stability of mRNA reporters (10A, M1, and P1) in the lysates of HEK293-Kb cells with CNOT1, PARN, or PAN3 knockdown was determined by RT-qPCR at indicated time points (n = 3 biological replicates; t test). Error bars indicate SEM. ** P < 0.01.
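The grouping behind panel c can be sketched by scoring each random sequence by its longest run of consecutive As and summarizing half-lives per group. This is a minimal illustration with hypothetical function names. Summarizing each group by its median is a choice of this example; the violin plot in the figure shows the full distribution.

```python
from collections import defaultdict
from statistics import median


def longest_a_run(seq):
    """Length of the longest stretch of consecutive As in a sequence."""
    best = run = 0
    for base in seq:
        run = run + 1 if base == "A" else 0
        best = max(best, run)
    return best


def half_life_by_a_run(half_lives):
    """Group half-lives (keyed by sequence) by longest A run; return the median per group."""
    groups = defaultdict(list)
    for seq, t_half in half_lives.items():
        groups[longest_a_run(seq)].append(t_half)
    return {run: median(vals) for run, vals in sorted(groups.items())}
```

For example, `longest_a_run("AAAAAUGAAA")` is 5, so that sequence would fall into the group with five consecutive As.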
Supplementary information
Supplementary Table 1
List of all oligonucleotide sequences used
Source data
Source Data Fig. 1
Statistical Source Data
Source Data Fig. 2
Statistical Source Data
Source Data Fig. 3
Statistical Source Data
Source Data Fig. 4
Statistical Source Data
Source Data Fig. 5
Statistical Source Data
Source Data Fig. 6
Statistical Source Data
Source Data Extended Data Fig. 1
Statistical Source Data
Source Data Extended Data Fig. 3
Statistical Source Data
Source Data Extended Data Fig. 4
Statistical Source Data
Source Data Extended Data Fig. 5
Statistical Source Data
Source Data Extended Data Fig. 6
Statistical Source Data
Source Data Extended Data Fig. 7
Statistical Source Data
Source Data Extended Data Fig. 8
Statistical Source Data
Source Data Extended Data Fig. 9
Statistical Source Data
Source Data Extended Data Fig. 10
Statistical Source Data
About this article
Cite this article
Jia, L., Mao, Y., Ji, Q. et al. Decoding mRNA translatability and stability from the 5′ UTR. Nat Struct Mol Biol 27, 814–821 (2020). https://doi.org/10.1038/s41594-020-0465-x
DOI: https://doi.org/10.1038/s41594-020-0465-x