ABEILLE: a novel method for ABerrant Expression Identification empLoying machine LEarning from RNA-sequencing data.
Journal
Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944
Publication information
Publication date:
14 October 2022
History:
received: 12 July 2021
revised: 22 June 2022
accepted: 1 September 2022
pubmed: 6 September 2022
medline: 19 October 2022
entrez: 5 September 2022
Status:
ppublish
Abstract
Current advances in omics technologies are paving the way for the diagnosis of rare diseases by providing a complementary assay to identify the responsible gene. The use of transcriptomic data to identify aberrant gene expression (AGE) has been shown to reveal potentially pathogenic events. However, popular approaches for AGE identification are limited by their reliance on statistical tests, which require choosing an arbitrary significance cut-off, and on the availability of several replicates, which is not always possible in clinical contexts. Hence, we developed ABerrant Expression Identification empLoying machine LEarning from sequencing data (ABEILLE), a variational autoencoder (VAE)-based method for identifying AGEs from RNA-seq data without the need for replicates or a control group. ABEILLE combines a VAE, which can model any data without specific assumptions about their distribution, with a decision tree that classifies genes as AGE or non-AGE. An anomaly score is associated with each gene in order to stratify AGEs by the severity of the aberration. We tested ABEILLE on a semi-synthetic and an experimental dataset, demonstrating the importance of the flexibility of the VAE configuration for identifying potential pathogenic candidates. ABEILLE source code is freely available at: https://github.com/UCA-MSI/ABEILLE. Supplementary data are available at Bioinformatics online.
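The abstract does not say how the anomaly score is computed, so the following is only a hypothetical illustration of the general idea of scoring genes by how far observed expression departs from a model's reconstruction. It is not ABEILLE's scoring. The `reconstructed` matrix stands in for the output of any reconstruction model (such as an autoencoder), and the function name, the pseudocount, the robust z-score, and the `min_spread` floor are all assumptions made for this sketch.

```python
import math
from statistics import median


def anomaly_scores(observed, reconstructed, pseudocount=1.0, min_spread=0.1):
    """Hypothetical per-gene, per-sample anomaly score (not ABEILLE's own).

    observed, reconstructed: lists of rows, one row per gene and one
    column per sample. Returns a matrix of the same shape holding robust
    z-scores of the log2 ratio between observed and reconstructed values.
    """
    scores = []
    for obs_row, rec_row in zip(observed, reconstructed):
        # log2 ratio of observed to reconstructed expression for each sample
        resid = [
            math.log2((o + pseudocount) / (r + pseudocount))
            for o, r in zip(obs_row, rec_row)
        ]
        centre = median(resid)
        # Median absolute deviation, scaled to be comparable to a standard
        # deviation under normality; floored to avoid division by ~zero.
        spread = 1.4826 * median(abs(x - centre) for x in resid)
        spread = max(spread, min_spread)
        scores.append([(x - centre) / spread for x in resid])
    return scores


# Toy example: one gene, five samples, the last one strongly under-expressed
# relative to what the (assumed) model reconstructs.
toy_observed = [[100, 110, 95, 105, 10]]
toy_reconstructed = [[100, 100, 100, 100, 100]]
toy_scores = anomaly_scores(toy_observed, toy_reconstructed)
```

Under these assumptions, a large absolute score flags a sample in which the gene departs strongly from its reconstruction, and the sign gives the direction of the departure. The abstract's separate decision-tree classification step is not reproduced here.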
Identifiers
pubmed: 36063052
pii: 6692305
doi: 10.1093/bioinformatics/btac603
pmc: PMC9563686
Chemical substances
RNA
63231-63-0
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
4754-4761
Copyright information
© The Author(s) 2022. Published by Oxford University Press.