A supervised learning method for classifying methylation disorders.
Angelman syndrome
Beckwith–Wiedemann syndrome
Congenital disease
Diagnosis
Machine learning
Methylation
Prader–Willi syndrome
Russell–Silver syndrome
Silver–Russell syndrome
Journal
BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194
Publication information
Publication date:
12 Feb 2024
History:
received: 20 Sep 2023
accepted: 24 Jan 2024
medline: 13 Feb 2024
pubmed: 13 Feb 2024
entrez: 12 Feb 2024
Status:
epublish
Abstract
BACKGROUND
DNA methylation is one of the most stable and well-characterized epigenetic alterations in humans. Accordingly, it has already found clinical utility as a molecular biomarker in a variety of disease contexts. Existing methods for clinical diagnosis of methylation-related disorders focus on outlier detection in a small number of CpG sites, using standardized cutoffs that differentiate healthy from abnormal methylation levels. These standardized cutoff values do not account for methylation patterns that are known to differ between the sexes and with age.
RESULTS
Here we profile genome-wide DNA methylation in blood samples drawn from a cohort of healthy controls of different ages and sexes alongside patients with Prader-Willi syndrome (PWS), Beckwith-Wiedemann syndrome, Fragile-X syndrome, Angelman syndrome, and Silver-Russell syndrome. We propose a Generalized Additive Model to perform age- and sex-adjusted outlier analysis of around 700,000 CpG sites throughout the human genome. Using per-site z-scores computed across the cohort, we deployed an ensemble-based machine learning pipeline and achieved a combined prediction accuracy of 0.96 (binomial 95% confidence interval 0.868 to 0.995).
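As an illustration of how a binomial confidence interval for a classification accuracy can be obtained, the following is a minimal sketch of an exact (Clopper-Pearson) interval using only the Python standard library. The record does not state which interval method the authors used or how many samples were in the evaluation set, so the method choice, the function names, and the counts (48 correct out of 50) are assumptions made purely for illustration.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_increasing(g, target: float, iters: int = 60) -> float:
    """Bisection on [0, 1] for g(p) = target, where g is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion k/n."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _root_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _root_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


# Hypothetical counts, NOT taken from the paper.
correct, total = 48, 50
ci_low, ci_high = clopper_pearson(correct, total)
print(f"accuracy={correct / total:.2f}, 95% CI=({ci_low:.3f}, {ci_high:.3f})")
```

With small evaluation sets, an exact interval of this kind is noticeably asymmetric around the point estimate, which is consistent with the shape of the interval reported above.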
CONCLUSIONS
We demonstrate a method for age- and sex-adjusted outlier detection of differentially methylated loci based on a large cohort of healthy individuals. We present a custom machine learning pipeline that uses this outlier analysis to classify samples for potential methylation-associated congenital disorders. Combined with machine learning, these methods classify abnormal methylation patterns with high accuracy.
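The idea of an age- and sex-adjusted z-score at a single CpG site can be sketched as follows. This is not the authors' Generalized Additive Model: as a simplifying assumption it substitutes an ordinary least-squares fit with a linear age term and a 0/1 sex indicator, fitted on healthy controls only. All function names and the synthetic control data are hypothetical.

```python
import math


def solve_linear_system(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_control_model(ages, sexes, betas):
    """Fit beta ~ b0 + b1*age + b2*sex on controls; return (coefficients, residual SD)."""
    rows = [[1.0, float(a), float(s)] for a, s in zip(ages, sexes)]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, betas)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    resid = [y - sum(c * v for c, v in zip(coef, r)) for r, y in zip(rows, betas)]
    sd = math.sqrt(sum(e * e for e in resid) / (len(rows) - k))
    return coef, sd


def adjusted_z(coef, sd, age, sex, beta):
    """Z-score of an observed methylation level against the age/sex expectation."""
    expected = coef[0] + coef[1] * age + coef[2] * sex
    return (beta - expected) / sd


# Synthetic controls for one site (hypothetical): methylation drifts with age
# and differs slightly by sex, plus small deterministic noise.
ages = list(range(1, 41))
sexes = [i % 2 for i in ages]
betas = [0.50 + 0.001 * a + 0.02 * s + 0.01 * math.sin(a) for a, s in zip(ages, sexes)]

coef, sd = fit_control_model(ages, sexes, betas)
z_typical = adjusted_z(coef, sd, age=10, sex=1, beta=0.53)  # near expectation
z_outlier = adjusted_z(coef, sd, age=10, sex=1, beta=0.25)  # strongly hypomethylated
print(f"z_typical={z_typical:.2f}, z_outlier={z_outlier:.2f}")
```

In a genome-wide setting the same fit would be repeated independently for each site, and the resulting per-sample vector of z-scores would serve as the input features for a downstream classifier.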
Identifiers
pubmed: 38347515
doi: 10.1186/s12859-024-05673-1
pii: 10.1186/s12859-024-05673-1
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
66
Copyright information
© 2024. The Author(s).