Noncanonical open reading frames encode functional proteins essential for cancer cell survival.


Journal

Nature biotechnology
ISSN: 1546-1696
Abbreviated title: Nat Biotechnol
Country: United States
NLM ID: 9604648

Publication information

Publication date:
06 2021
History:
received: 18 02 2020
accepted: 16 12 2020
pubmed: 30 1 2021
medline: 28 8 2021
entrez: 29 1 2021
Status: ppublish

Abstract

Although genomic analyses predict many noncanonical open reading frames (ORFs) in the human genome, it is unclear whether they encode biologically active proteins. Here we experimentally interrogated 553 candidates selected from noncanonical ORF datasets. Of these, 57 induced viability defects when knocked out in human cancer cell lines. Following ectopic expression, 257 showed evidence of protein expression and 401 induced gene expression changes. Clustered regularly interspaced short palindromic repeat (CRISPR) tiling and start codon mutagenesis indicated that their biological effects required translation as opposed to RNA-mediated effects. We found that one of these ORFs, G029442, renamed glycine-rich extracellular protein-1 (GREP1), encodes a secreted protein highly expressed in breast cancer, and its knockout in 263 cancer cell lines showed preferential essentiality in breast cancer-derived lines. The secretome of GREP1-expressing cells has an increased abundance of the oncogenic cytokine GDF15, and GDF15 supplementation mitigated the growth-inhibitory effect of GREP1 knockout. Our experiments suggest that noncanonical ORFs can express biologically active proteins that are potential therapeutic targets.

Identifiers

pubmed: 33510483
doi: 10.1038/s41587-020-00806-2
pii: 10.1038/s41587-020-00806-2
pmc: PMC8195866
mid: NIHMS1698134

Chemical substances

Neoplasm Proteins 0

Publication types

Letter
Research Support, N.I.H., Extramural

Languages

eng

Citation subsets

IM

Pagination

697-704

Grants

Agency: NICHD NIH HHS
ID: R01 HD091846
Country: United States
Agency: NIGMS NIH HHS
ID: R35 GM138192
Country: United States
Agency: NICHD NIH HHS
ID: R01 HD073104
Country: United States
Agency: NCI NIH HHS
ID: K12 CA090354
Country: United States
Agency: NCI NIH HHS
ID: R00 CA207865
Country: United States

Authors

John R Prensner (JR)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.
Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA, USA.
Division of Pediatric Hematology/Oncology, Boston Children's Hospital, Boston, MA, USA.

Oana M Enache (OM)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Victor Luria (V)

Department of Systems Biology, Harvard Medical School, Boston, MA, USA.

Karsten Krug (K)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Karl R Clauser (KR)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Joshua M Dempster (JM)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Amir Karger (A)

IT-Research Computing, Harvard Medical School, Boston, MA, USA.

Li Wang (L)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Karolina Stumbraite (K)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Vickie M Wang (VM)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Ginevra Botta (G)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Nicholas J Lyons (NJ)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Amy Goodale (A)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Zohra Kalani (Z)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Briana Fritchman (B)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Adam Brown (A)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Douglas Alan (D)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Thomas Green (T)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Xiaoping Yang (X)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Jacob D Jaffe (JD)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.
Inzen Therapeutics, Cambridge, MA, USA.

Jennifer A Roth (JA)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Federica Piccioni (F)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.
Merck Research Laboratories, Boston, MA, USA.

Marc W Kirschner (MW)

Department of Systems Biology, Harvard Medical School, Boston, MA, USA.

Zhe Ji (Z)

Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
Department of Biomedical Engineering, McCormick School of Engineering, Northwestern University, Evanston, IL, USA.

David E Root (DE)

Broad Institute of Harvard and MIT, Cambridge, MA, USA.

Todd R Golub (TR)

Broad Institute of Harvard and MIT, Cambridge, MA, USA. golub@broadinstitute.org.
Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA, USA. golub@broadinstitute.org.
Division of Pediatric Hematology/Oncology, Boston Children's Hospital, Boston, MA, USA. golub@broadinstitute.org.

MeSH classifications