Comprehensive evaluation of the implementation of episignatures for diagnosis of neurodevelopmental disorders (NDDs).


Journal

Human genetics
ISSN: 1432-1203
Abbreviated title: Hum Genet
Country: Germany
NLM ID: 7613873

Publication information

Publication date:
Dec 2023
History:
received: 28 07 2023
accepted: 10 10 2023
medline: 27 11 2023
pubmed: 27 10 2023
entrez: 27 10 2023
Status: ppublish

Abstract

Episignatures are popular tools for the diagnosis of rare neurodevelopmental disorders. They are commonly based on a set of differentially methylated CpGs used in combination with a support vector machine model. DNA methylation (DNAm) data often include missing values due to changes in data generation technology and batch effects. While many normalization methods exist for DNAm data, their impact on episignature performance has never been assessed. In addition, technologies to quantify DNAm evolve quickly, which may lead to poor transfer of existing episignatures, generated on deprecated array versions, to new ones. Indeed, probe removal between array versions, between technologies or during preprocessing leads to missing values. Thus, the effect of missing data on episignature performance must also be carefully evaluated and addressed, either through imputation or through an innovative approach to episignature design.

In this paper, we used data from patients with Kabuki and Sotos syndromes to evaluate the influence of normalization methods, classification models and missing data on the prediction performance of two existing episignatures. We compare how six popular normalization methods for methylation array data affect episignature classification performance in Kabuki and Sotos syndromes and provide best-practice suggestions for building new episignatures.

In this setting, we show that the Illumina, Noob and Funnorm normalization methods achieved higher classification performance on the test sets than the Quantile, Raw and SWAN normalization methods. We further show that penalized logistic regression and support vector machines perform best in the classification of Kabuki and Sotos syndrome patients. We then describe a new paradigm for building episignatures based on the detection of differentially methylated regions (DMRs) and evaluate their performance against classical episignatures based on differentially methylated cytosines (DMCs) in the presence of missing data. We show that the performance of classical DMC-based episignatures suffers more from missing data than that of the DMR-based approach.

We present a comprehensive evaluation of how the normalization of DNA methylation data affects episignature performance, using three popular classification models. We further evaluate how missing data affect those models' predictions. Finally, we propose a novel methodology for developing episignatures based on the identification of differentially methylated regions and show that this method slightly outperforms classical episignatures in the presence of missing data.
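The intuition for why region-level features may tolerate missing probes better than single-CpG features can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: it assumes a DMR feature is the mean beta value of the probes still observed in the region, and the probe IDs, region names and beta values are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional

Beta = Optional[float]  # None marks a probe with no measurement


def region_features(sample: Dict[str, Beta],
                    regions: Dict[str, List[str]]) -> Dict[str, Beta]:
    """Average the observed beta values of each region.

    A region feature is None only if every probe in the region is missing.
    Averaging is an assumption made for this sketch.
    """
    features: Dict[str, Beta] = {}
    for name, probes in regions.items():
        observed = [sample[p] for p in probes if sample.get(p) is not None]
        features[name] = mean(observed) if observed else None
    return features


def missing_fraction(features: Dict[str, Beta]) -> float:
    """Fraction of features that have no value."""
    values = list(features.values())
    return sum(v is None for v in values) / len(values)


# Hypothetical regions and one hypothetical sample with two missing probes.
regions = {"dmr1": ["cg01", "cg02", "cg03"], "dmr2": ["cg04", "cg05"]}
sample = {"cg01": 0.82, "cg02": None, "cg03": 0.78, "cg04": None, "cg05": 0.21}

cpg_missing = missing_fraction(sample)   # missingness at the CpG level
dmr = region_features(sample, regions)   # region-level features
dmr_missing = missing_fraction(dmr)      # missingness at the region level
```

In this toy sample, two of the five CpG-level features would need imputation before classification, whereas both region-level features remain defined because each region keeps at least one observed probe.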

Identifiers

pubmed: 37889307
doi: 10.1007/s00439-023-02609-2
pii: 10.1007/s00439-023-02609-2
pmc: PMC10676303

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

1721-1735

Grants

Agency: Fonds De La Recherche Scientifique - FNRS
ID: 1.E.013.20F
Agency: Innoviris Foundation
ID: PFS-11e IgenCare

Copyright information

© 2023. The Author(s).

Authors

Edoardo Giuili (E)

Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium.

Robin Grolaux (R)

Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium.

Catarina Z N M Macedo (CZNM)

Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium.

Laurence Desmyter (L)

Center for Human Genetics, Hôpital Erasme, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium.

Bruno Pichon (B)

Center for Human Genetics, Hôpital Erasme, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium.

Sebastian Neuens (S)

Center for Human Genetics, Hôpital Erasme, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium.
Department of Genetics, Hôpital Universitaire Des Enfants Reine Fabiola, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium.

Catheline Vilain (C)

Center for Human Genetics, Hôpital Erasme, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium.
Department of Genetics, Hôpital Universitaire Des Enfants Reine Fabiola, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium.

Catharina Olsen (C)

Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium.
Clinical Sciences, Research Group Reproduction and Genetics, Brussels Interuniversity Genomics High Throughput Core (BRIGHTcore), Vrije Universiteit Brussel (VUB), Universitair Ziekenhuis Brussel (UZ Brussel), Brussels, Belgium.
Clinical Sciences, Research Group Reproduction and Genetics, Centre for Medical Genetics, Vrije Universiteit Brussel (VUB), Universitair Ziekenhuis Brussel (UZ Brussel), Brussels, Belgium.

Sonia Van Dooren (S)

Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium.
Clinical Sciences, Research Group Reproduction and Genetics, Brussels Interuniversity Genomics High Throughput Core (BRIGHTcore), Vrije Universiteit Brussel (VUB), Universitair Ziekenhuis Brussel (UZ Brussel), Brussels, Belgium.
Clinical Sciences, Research Group Reproduction and Genetics, Centre for Medical Genetics, Vrije Universiteit Brussel (VUB), Universitair Ziekenhuis Brussel (UZ Brussel), Brussels, Belgium.

Guillaume Smits (G)

Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium.
Center for Human Genetics, Hôpital Erasme, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium.
Department of Genetics, Hôpital Universitaire Des Enfants Reine Fabiola, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium.

Matthieu Defrance (M)

Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium. matthieu.defrance@ulb.be.
