Deciphering RNA splicing logic with interpretable machine learning.
RNA splicing
artificial intelligence
interpretable machine learning
Journal
Proceedings of the National Academy of Sciences of the United States of America
ISSN: 1091-6490
Abbreviated title: Proc Natl Acad Sci U S A
Country: United States
NLM ID: 7505876
Publication information
Publication date:
10 10 2023
History:
medline: 23 10 2023
pubmed: 5 10 2023
entrez: 5 10 2023
Status:
ppublish
Abstract
Machine learning methods, particularly neural networks trained on large datasets, are transforming how scientists approach scientific discovery and experimental design. However, current state-of-the-art neural networks are limited by their uninterpretability: Despite their excellent accuracy, they cannot describe how they arrived at their predictions. Here, using an "interpretable-by-design" approach, we present a neural network model that provides insights into RNA splicing, a fundamental process in the transfer of genomic information into functional biochemical products. Although we designed our model to emphasize interpretability, its predictive accuracy is on par with state-of-the-art models. To demonstrate the model's interpretability, we introduce a visualization that, for any given exon, allows us to trace and quantify the entire decision process from input sequence to output splicing prediction. Importantly, the model revealed uncharacterized components of the splicing logic, which we experimentally validated. This study highlights how interpretable machine learning can advance scientific discovery.
Identifiers
pubmed: 37796983
doi: 10.1073/pnas.2221165120
pmc: PMC10576025
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
e2221165120