Analysis of nested alternate open reading frames and their encoded proteins.


Journal

NAR genomics and bioinformatics
ISSN: 2631-9268
Abbreviated title: NAR Genom Bioinform
Country: England
NLM ID: 101756213

Publication information

Publication date:
Dec 2022
History:
received: 03 07 2022
revised: 14 08 2022
accepted: 27 09 2022
entrez: 21 10 2022
pubmed: 22 10 2022
medline: 22 10 2022
Status: epublish

Abstract

Transcriptional and post-transcriptional mechanisms diversify the proteome beyond gene number while maintaining a sequence relationship between original and altered proteins. A new mechanism breaks this paradigm, generating novel proteins by translating alternative open reading frames (Alt-ORFs) within canonical host mRNAs. Uniquely, 'alt-proteins' lack sequence homology with host ORF-derived proteins. We show that the global amino acid frequencies, and consequent biochemical characteristics, of Alt-ORFs nested within host ORFs (nAlt-ORFs) are genetically driven and are predicted by summation of the frequencies of hundreds of encompassing host codon-pairs. Analysis of 101 human nAlt-ORFs of length ≥150 codons confirms the theoretical predictions, revealing an extraordinarily high median isoelectric point (pI) of 11.68, due to anomalous levels of charged amino acids. In addition, nAlt-ORF proteins exhibit a >2-fold preference for reading frame 2 versus 3, predicted mitochondrial and nuclear localization, and an elevated codon adaptation index indicative of natural selection. Our results provide a theoretical and conceptual framework for exploration of these largely unannotated, but potentially significant, alternative ORFs and their encoded proteins.
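
To make the notion of a nested alternate ORF concrete, the following is a minimal Python sketch, not the authors' pipeline. It scans the two shifted reading frames of a host coding sequence for ATG-to-stop ORFs that lie entirely inside the host sequence. The function name `find_nested_alt_orfs`, the frame numbering (host frame = 1, one-nucleotide shift = frame 2, two-nucleotide shift = frame 3), the convention that the stop codon is not counted in the length, and the toy sequence are all assumptions made for illustration; only the ≥150-codon threshold comes from the abstract.

```python
# Illustrative sketch only: names, frame numbering and length convention
# are assumptions, not taken from the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons
START_CODON = "ATG"


def find_nested_alt_orfs(host_cds, min_codons=150):
    """Find ATG...stop ORFs in the two shifted frames of a host CDS.

    Returns a list of (frame, start, end, n_codons) tuples, where start/end
    are 0-based, end-exclusive nucleotide positions in host_cds (end includes
    the stop codon) and n_codons excludes the stop codon.
    """
    seq = host_cds.upper()
    hits = []
    for offset in (1, 2):  # assumed to correspond to frames 2 and 3
        start = None
        for i in range(offset, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # first in-frame ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    hits.append((offset + 1, start, i + 3, n_codons))
                start = None  # resume the search after this stop codon
    return hits


# Toy sequence (not a real transcript): a 4-codon ORF in the shifted frame.
toy = "A" + "ATG" + "GCC" * 3 + "TAA" + "AA"
print(find_nested_alt_orfs(toy, min_codons=4))  # [(2, 1, 16, 4)]
```

With the default `min_codons=150` the toy sequence yields no hits. A real analysis would also need transcript annotations and a check that each candidate lies within the host ORF rather than the untranslated regions, both of which this sketch leaves out.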

Identifiers

pubmed: 36267124
doi: 10.1093/nargab/lqac076
pii: lqac076
pmc: PMC9580016

Publication types

Journal Article

Languages

eng

Pagination

lqac076

Copyright information

© The Author(s) 2022. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.

Authors

Kommireddy Vasu (K)

Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA.

Debjit Khan (D)

Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA.

Iyappan Ramachandiran (I)

Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA.

Daniel Blankenberg (D)

Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA.

Paul L Fox (PL)

Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA.

MeSH classifications