LinAliFold and CentroidLinAliFold: fast RNA consensus secondary structure prediction for aligned sequences using beam search methods.
Journal
Bioinformatics Advances
ISSN: 2635-0041
Abbreviated title: Bioinform Adv
Country: England
NLM ID: 9918282081306676
Publication information
Publication date:
2022
History:
received: 2022-07-28
revised: 2022-10-13
accepted: 2022-10-21
entrez: 2023-01-26
pubmed: 2023-01-27
medline: 2023-01-27
Status:
epublish
Abstract
RNA consensus secondary structure prediction from aligned sequences is a powerful approach for improving secondary structure prediction accuracy. However, because the computational complexity of conventional prediction tools scales with the cube of the alignment length, applying them to long RNA sequences, such as viral RNAs or long non-coding RNAs, requires significant computational time. In this study, we developed LinAliFold and CentroidLinAliFold, fast RNA consensus secondary structure prediction tools based on the minimum free energy and maximum expected accuracy principles, respectively. We accelerated the software using beam search methods, which have previously been used successfully for fast secondary structure prediction from a single RNA sequence. Benchmark analyses showed that LinAliFold and CentroidLinAliFold were much faster than existing methods while preserving prediction accuracy. As an empirical application, we predicted the consensus secondary structure of coronaviruses of approximately 30 000 nt in 5 and 79 min using LinAliFold and CentroidLinAliFold, respectively. We confirmed that the predicted consensus secondary structure of coronaviruses was consistent with experimental results. The source code of LinAliFold and CentroidLinAliFold is freely available at https://github.com/fukunagatsu/LinAliFold-CentroidLinAliFold. Supplementary data are available at
Identifiers
pubmed: 36699418
doi: 10.1093/bioadv/vbac078
pii: vbac078
pmc: PMC9710674
Publication types
Journal Article
Languages
eng
Pagination
vbac078
Copyright information
© The Author(s) 2022. Published by Oxford University Press.