ABEILLE: a novel method for ABerrant Expression Identification empLoying machine LEarning from RNA-sequencing data.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
14 Oct 2022
History:
received: 12 Jul 2021
revised: 22 Jun 2022
accepted: 1 Sep 2022
pubmed: 6 Sep 2022
medline: 19 Oct 2022
entrez: 5 Sep 2022
Status: ppublish

Abstract

Current advances in omics technologies are paving the way for the diagnosis of rare diseases by providing a complementary assay to identify the responsible gene. The use of transcriptomic data to identify aberrant gene expression (AGE) has been shown to reveal potentially pathogenic events. However, popular approaches for AGE identification are limited in two ways: they rely on statistical tests that require an arbitrary cut-off for assessing significance, and they need several replicates, which are not always available in clinical contexts. Hence, we developed ABerrant Expression Identification empLoying machine LEarning from sequencing data (ABEILLE), a variational autoencoder (VAE)-based method that identifies AGEs from RNA-seq data without the need for replicates or a control group. ABEILLE combines a VAE, which can model any data without specific assumptions about their distribution, with a decision tree that classifies genes as AGE or non-AGE. An anomaly score is associated with each gene in order to stratify AGEs by the severity of the aberration. We tested ABEILLE on a semi-synthetic dataset and an experimental dataset, demonstrating the importance of the flexibility of the VAE configuration for identifying potential pathogenic candidates. ABEILLE source code is freely available at https://github.com/UCA-MSI/ABEILLE. Supplementary data are available at Bioinformatics online.
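
The idea of scoring each gene by how far its observed expression departs from a reconstruction can be illustrated with a small sketch. This is not the ABEILLE implementation: a per-gene median of log-transformed counts stands in for the VAE reconstruction, the decision-tree classification step is omitted, and the function names (`anomaly_scores`, `rank_candidates`) and the toy counts are hypothetical. It only shows how a residual-based anomaly score could be used to rank candidates by severity.

```python
import math
from statistics import median


def anomaly_scores(counts):
    """Score every (gene, sample) pair by its deviation from a reconstruction.

    counts: dict mapping gene name -> list of raw read counts, one per sample.
    Assumption: the gene's median log2 expression is used as a stand-in
    for the reconstruction that a trained VAE would provide.
    """
    scores = {}
    for gene, row in counts.items():
        logged = [math.log2(c + 1) for c in row]
        recon = median(logged)  # stand-in "reconstruction"
        residuals = [x - recon for x in logged]
        # Median absolute deviation as a robust spread estimate;
        # the small floor avoids division by zero for constant genes.
        mad = max(median([abs(r) for r in residuals]), 1e-6)
        scores[gene] = [r / mad for r in residuals]
    return scores


def rank_candidates(scores):
    """Return (gene, sample_index, score) tuples, most aberrant first."""
    flat = [(g, i, s) for g, row in scores.items() for i, s in enumerate(row)]
    return sorted(flat, key=lambda t: abs(t[2]), reverse=True)


# Toy data (hypothetical): GENE_A is strongly under-expressed in sample 4.
counts = {
    "GENE_A": [100, 110, 95, 105, 5],
    "GENE_B": [50, 52, 48, 51, 49],
}
ranking = rank_candidates(anomaly_scores(counts))
top_gene, top_sample, top_score = ranking[0]
print(top_gene, top_sample, round(top_score, 1))
```

Ranking by score rather than applying a fixed threshold mirrors the abstract's point about avoiding arbitrary cut-offs; in the published method, the final AGE or non-AGE decision is made by a decision tree, which this sketch does not reproduce.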

Identifiers

pubmed: 36063052
pii: 6692305
doi: 10.1093/bioinformatics/btac603
pmc: PMC9563686

Chemical substances

RNA 63231-63-0

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

4754-4761

Copyright information

© The Author(s) 2022. Published by Oxford University Press.

Authors

Justine Labory (J)

Université Côte d'Azur, Center of Modeling, Simulation and Interactions, Nice 06000, France.
Université Côte d'Azur, Inserm U1081, CNRS UMR 7284, Institute for Research on Cancer and Aging, Nice (IRCAN), Centre Hospitalier Universitaire (CHU) de Nice, Nice 06200, France.

Gwendal Le Bideau (G)

Université Côte d'Azur, Inserm U1081, CNRS UMR 7284, Institute for Research on Cancer and Aging, Nice (IRCAN), Centre Hospitalier Universitaire (CHU) de Nice, Nice 06200, France.

David Pratella (D)

Université Côte d'Azur, Center of Modeling, Simulation and Interactions, Nice 06000, France.

Jean-Elisée Yao (JE)

Université Côte d'Azur, Center of Modeling, Simulation and Interactions, Nice 06000, France.

Samira Ait-El-Mkadem Saadi (S)

Université Côte d'Azur, Inserm U1081, CNRS UMR 7284, Institute for Research on Cancer and Aging, Nice (IRCAN), Centre Hospitalier Universitaire (CHU) de Nice, Nice 06200, France.

Sylvie Bannwarth (S)

Université Côte d'Azur, Inserm U1081, CNRS UMR 7284, Institute for Research on Cancer and Aging, Nice (IRCAN), Centre Hospitalier Universitaire (CHU) de Nice, Nice 06200, France.

Loubna El-Hami (L)

Université Côte d'Azur, Center of Modeling, Simulation and Interactions, Nice 06000, France.
Université Côte d'Azur, Inserm U1081, CNRS UMR 7284, Institute for Research on Cancer and Aging, Nice (IRCAN), Centre Hospitalier Universitaire (CHU) de Nice, Nice 06200, France.

Véronique Paquis-Fluckinger (V)

Université Côte d'Azur, Inserm U1081, CNRS UMR 7284, Institute for Research on Cancer and Aging, Nice (IRCAN), Centre Hospitalier Universitaire (CHU) de Nice, Nice 06200, France.

Silvia Bottini (S)

Université Côte d'Azur, Center of Modeling, Simulation and Interactions, Nice 06000, France.
