EL-RMLocNet: An explainable LSTM network for RNA-associated multi-compartment localization prediction.
Attention mechanism
Deep learning
Explainable
GeneticSeq2Vec
Human
LSTM
Mouse
Multi-class
Multi-label
Neural tricks
RNA subcellular localization prediction
Single or multi compartment
Journal
Computational and structural biotechnology journal
ISSN: 2001-0370
Abbreviated title: Comput Struct Biotechnol J
Country: Netherlands
NLM ID: 101585369
Publication information
Publication date: 2022
History:
received: 2022-04-02
revised: 2022-07-16
accepted: 2022-07-16
entrez: 2022-08-19
pubmed: 2022-08-20
medline: 2022-08-20
Status: epublish
Abstract
Subcellular localization of ribonucleic acid (RNA) molecules provides significant insight into RNA function and helps to explore the association of RNAs with various diseases. Most existing predictors are single-compartment localization predictors (SCLPs), which cannot explain RNA associations with the diverse biochemical and pathological processes that mainly arise from RNA co-localization in multiple compartments. The few available multi-compartment localization predictors (MCLPs) perform well only for a target RNA class of a particular sub-type. Furthermore, existing computational approaches have limited practical significance and limited potential to optimize therapeutics because their models are poorly explainable. This paper presents EL-RMLocNet, an explainable Long Short-Term Memory (LSTM) network whose predictive performance and interpretability are optimized using a novel GeneticSeq2Vec statistical representation learning scheme and an attention mechanism, for accurate multi-compartment localization prediction of different RNAs using only raw RNA sequences. GeneticSeq2Vec generates optimized statistical vectors of raw RNA sequences by capturing short- and long-range relations between nucleotide k-mers. From the sequence vectors generated by GeneticSeq2Vec, LSTM layers extract the most informative features, and an attention layer weights these features according to their discriminative potential for multi-compartment localization prediction. Through reverse engineering, the weights of the statistical feature space are mapped back to nucleotide k-mer patterns, making multi-compartment localization decisions transparent and explainable across RNA classes and species. Empirical evaluation indicates that EL-RMLocNet outperforms the state-of-the-art predictor for subcellular localization prediction of 4 different RNA classes by an average accuracy margin of 8% for Homo sapiens and 6% for Mus musculus.
EL-RMLocNet is freely available as a web server at (https://sds_genetic_analysis.opendfki.de/subcellular_loc/).
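The abstract describes splitting raw RNA sequences into nucleotide k-mers and mapping attention weights back to those k-mers for explanation. The following is a minimal illustrative sketch of that idea only, not the authors' GeneticSeq2Vec or EL-RMLocNet implementation. The function names (kmers, softmax, top_kmers), the choice of k = 3 and the score values are all assumptions made for illustration; in a trained model the scores would come from the attention layer.

```python
import math

def kmers(seq, k=3, stride=1):
    """Split an RNA sequence into overlapping k-mers (T is read as U)."""
    seq = seq.upper().replace("T", "U")
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_kmers(tokens, scores, n=2):
    """Rank k-mers by normalized weight, highest first."""
    weights = softmax(scores)
    ranked = sorted(zip(tokens, weights), key=lambda p: p[1], reverse=True)
    return ranked[:n]

tokens = kmers("AUGGCUA", k=3)
# Hypothetical per-k-mer scores standing in for attention-layer output.
scores = [0.1, 2.0, 0.3, 1.5, 0.2]
for token, weight in top_kmers(tokens, scores):
    print(token, round(weight, 3))
```

Ranking k-mers by normalized weight is one simple way to show which sequence patterns contributed most to a prediction; the paper's own mapping from the statistical feature space to k-mer patterns may differ.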
Identifiers
pubmed: 35983235
doi: 10.1016/j.csbj.2022.07.031
pii: S2001-0370(22)00311-7
pmc: PMC9356161
Publication types
Journal Article
Languages
eng
Pagination
3986-4002
Copyright information
© 2022 The Author(s).
Conflict of interest statement
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.