Analysis of nested alternate open reading frames and their encoded proteins.
Journal
NAR genomics and bioinformatics
ISSN: 2631-9268
Abbreviated title: NAR Genom Bioinform
Country: England
NLM ID: 101756213
Publication information
Publication date:
Dec 2022
History:
received: 03 Jul 2022
revised: 14 Aug 2022
accepted: 27 Sep 2022
entrez: 21 Oct 2022
pubmed: 22 Oct 2022
medline: 22 Oct 2022
Status:
epublish
Abstract
Transcriptional and post-transcriptional mechanisms diversify the proteome beyond gene number, while maintaining a sequence relationship between original and altered proteins. A new mechanism breaks this paradigm, generating novel proteins by translating alternative open reading frames (Alt-ORFs) within canonical host mRNAs. Uniquely, 'alt-proteins' lack sequence homology with host ORF-derived proteins. We show that the global amino acid frequencies, and the consequent biochemical characteristics, of Alt-ORFs nested within host ORFs (nAlt-ORFs) are genetically driven and are predicted by summation of the frequencies of hundreds of encompassing host codon-pairs. Analysis of 101 human nAlt-ORFs of length ≥150 codons confirms the theoretical predictions, revealing an extraordinarily high median isoelectric point (pI) of 11.68, due to anomalous levels of charged amino acids. In addition, nAlt-ORF proteins exhibit a >2-fold preference for reading frame 2 versus 3, predicted mitochondrial and nuclear localization, and an elevated codon adaptation index indicative of natural selection. Our results provide a theoretical and conceptual framework for exploration of these largely unannotated, but potentially significant, alternative ORFs and their encoded proteins.
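To make the notion of a nested alternate ORF concrete, the following is a minimal sketch of how ATG-to-stop ORFs in reading frames 2 and 3 might be enumerated inside a host coding sequence. It is not the authors' pipeline: the function name `find_nested_alt_orfs`, the convention that frame 2 is the host frame shifted by one nucleotide and frame 3 by two, and the choice to start each ORF at the first in-frame ATG are all assumptions made for illustration. Only the 150-codon length threshold comes from the abstract.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_nested_alt_orfs(host_cds, min_codons=150):
    """Return (frame, start, end, n_codons) for ATG-to-stop ORFs in frames 2 and 3.

    host_cds is assumed to be a DNA coding sequence whose first base is in
    frame 1. Coordinates are 0-based; end is exclusive and includes the stop
    codon. n_codons counts sense codons only (the stop codon is excluded).
    """
    seq = host_cds.upper()
    hits = []
    # Assumed numbering: frame 2 = offset 1, frame 3 = offset 2.
    for frame, offset in ((2, 1), (3, 2)):
        orf_start = None
        for i in range(offset, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if orf_start is None and codon == "ATG":
                orf_start = i  # first in-frame ATG after the previous stop
            elif orf_start is not None and codon in STOP_CODONS:
                n_codons = (i - orf_start) // 3
                if n_codons >= min_codons:
                    hits.append((frame, orf_start, i + 3, n_codons))
                orf_start = None  # an ORF without a stop inside the host is not nested
    return hits


# Toy sequence (hypothetical, not real data) with one 5-codon ORF in frame 2.
toy = "A" + "ATG" + "GCC" * 4 + "TAA" + "GG"
print(find_nested_alt_orfs(toy, min_codons=5))  # [(2, 1, 19, 5)]
```

With the default `min_codons=150` the toy sequence yields no hits, mirroring the length filter described in the abstract. ORFs that run past the end of the host sequence without an in-frame stop are deliberately ignored, since they would not be fully nested.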
Identifiers
pubmed: 36267124
doi: 10.1093/nargab/lqac076
pii: lqac076
pmc: PMC9580016
Publication types
Journal Article
Languages
eng
Pagination
lqac076
Copyright information
© The Author(s) 2022. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.