Investigating Open Reading Frames in Known and Novel Transcripts using ORFanage.
Journal
bioRxiv : the preprint server for biology
Abbreviated title: bioRxiv
Country: United States
NLM ID: 101680187
Publication information
Publication date:
25 Mar 2023
History:
pubmed: 31 Mar 2023
medline: 31 Mar 2023
entrez: 30 Mar 2023
Status:
epublish
Abstract
ORFanage is a system designed to assign open reading frames (ORFs) to both known and novel gene transcripts while maximizing similarity to annotated proteins. The primary intended use of ORFanage is the identification of ORFs in the assembled results of RNA sequencing (RNA-seq) experiments, a capability that most transcriptome assembly methods do not have. Our experiments demonstrate how ORFanage can be used to find novel protein variants in RNA-seq datasets, and to improve the annotations of ORFs in tens of thousands of transcript models in the RefSeq and GENCODE human annotation databases. Through its implementation of a highly accurate and efficient pseudo-alignment algorithm, ORFanage is substantially faster than other ORF annotation methods, enabling its application to very large datasets. When used to analyze transcriptome assemblies, ORFanage can aid in the separation of signal from transcriptional noise and the identification of likely functional transcript variants, ultimately advancing our understanding of biology and medicine.
Identifiers
pubmed: 36993373
doi: 10.1101/2023.03.23.533704
pmc: PMC10055401
Publication types
Preprint
Languages
eng
Grants
Agency: NHGRI NIH HHS
ID: R01 HG006677
Country: United States
Agency: NIMH NIH HHS
ID: R01 MH123567
Country: United States
Agency: NIGMS NIH HHS
ID: R35 GM130151
Country: United States
Comments and corrections
Type: UpdateIn
Conflict of interest statement
Competing Interests: The authors have no conflicts of interest to declare.