Machine learning for predicting neurodegenerative diseases in the general older population: a cohort study.


Journal

BMC medical research methodology
ISSN: 1471-2288
Abbreviated title: BMC Med Res Methodol
Country: England
NLM ID: 100968545

Publication information

Publication date:
11 01 2023
History:
received: 14 06 2022
accepted: 06 01 2023
entrez: 11 1 2023
pubmed: 12 1 2023
medline: 14 1 2023
Status: epublish

Abstract

BACKGROUND
In the older general population, neurodegenerative diseases (NDs) are associated with increased disability and decreased physical and cognitive function. Detecting risk factors can help implement prevention measures. Deep neural networks (DNNs), a class of machine-learning algorithms, could be an alternative to Cox regression in tabular datasets with many predictive features. We aimed to compare the performance of different types of DNNs with that of regularized Cox proportional hazards models in predicting NDs in the older general population.
METHODS
We performed a longitudinal analysis with participants of the English Longitudinal Study of Ageing. We included men and women aged 60 years and older with no NDs at baseline, assessed every 2 years from 2004-2005 (wave 2) to 2016-2017 (wave 8). The features were a set of 91 epidemiological and clinical baseline variables. The outcome was a new event of Parkinson's disease, Alzheimer's disease, or dementia. After applying multiple imputation, we trained three DNN algorithms: Feedforward, TabTransformer, and Dense Convolutional (Densenet). In addition, we trained two algorithms based on Cox models: Elastic Net regularization (CoxEn) and selected features (CoxSf).
RESULTS
In wave 2, 5433 participants were included. During follow-up, 12.7% of participants developed NDs. Although all five models predicted ND events, discriminative ability was highest with TabTransformer (Uno's C-statistic (95% confidence interval): 0.757 (0.702, 0.805)). TabTransformer also showed higher time-dependent balanced accuracy (0.834 (0.779, 0.889)) and specificity (0.855 (0.773, 0.909)) than the other models. In the CoxSf model (hazard ratio (95% confidence interval)), age (10.0 (6.9, 14.7)), poor hearing (1.3 (1.1, 1.5)), and weight loss (1.3 (1.1, 1.6)) were associated with a higher ND risk. In contrast, executive function (0.3 (0.2, 0.6)), memory (0.0 (0.0, 0.1)), increased gait speed (0.2 (0.1, 0.4)), vigorous physical activity (0.7 (0.6, 0.9)), and higher BMI (0.4 (0.2, 0.8)) were associated with a lower ND risk.
CONCLUSION
TabTransformer is promising for predicting NDs from heterogeneous tabular datasets with numerous features. Moreover, it can handle censored data. However, Cox models perform well and are easier to interpret than DNNs. Therefore, they remain a good choice for predicting NDs.
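
The discrimination measure reported in the abstract, Uno's C-statistic, weights each comparable pair of participants by the inverse squared probability of remaining uncensored, so that heavy censoring late in follow-up does not bias the estimate. The following is a minimal pure-Python sketch of that idea, not the authors' code: the function names (`censoring_km`, `g_before`, `uno_c`), the toy data, and the handling of ties (tied risk scores count as half-concordant; the censoring survival is evaluated just before each event time) are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def censoring_km(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate of the censoring distribution G(t) = P(C > t).

    Censored observations (event == 0) play the role of 'events' here.
    Returns (time, G just after that time) steps in increasing time order.
    """
    censor_times = sorted({t for t, e in zip(times, events) if e == 0})
    steps = []
    g = 1.0
    for u in censor_times:
        at_risk = sum(1 for t in times if t >= u)
        censored = sum(1 for t, e in zip(times, events) if t == u and e == 0)
        g *= 1.0 - censored / at_risk
        steps.append((u, g))
    return steps


def g_before(steps: List[Tuple[float, float]], t: float) -> float:
    """Left limit G(t-): product over censoring times strictly before t."""
    g = 1.0
    for u, value in steps:
        if u >= t:
            break
        g = value
    return g


def uno_c(times: Sequence[float], events: Sequence[int],
          risk: Sequence[float], tau: float = float("inf")) -> float:
    """Inverse-probability-of-censoring weighted concordance, truncated at tau.

    A pair (i, j) is comparable when i has an observed event before tau and
    j is still under observation after that event time. Higher risk scores
    are expected to go with earlier events.
    """
    steps = censoring_km(times, events)
    num = 0.0
    den = 0.0
    for i in range(len(times)):
        if events[i] != 1 or times[i] >= tau:
            continue
        weight = 1.0 / g_before(steps, times[i]) ** 2
        for j in range(len(times)):
            if times[j] > times[i]:
                den += weight
                if risk[i] > risk[j]:
                    num += weight
                elif risk[i] == risk[j]:
                    num += 0.5 * weight  # assumption: ties count as half
    if den == 0.0:
        raise ValueError("no comparable pairs")
    return num / den


# Hypothetical toy data: follow-up time in years, event flag, predicted risk.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 0, 1, 1, 0]
toy_risk = [0.9, 0.3, 0.2, 0.7, 0.1]
print(f"Uno's C on toy data: {uno_c(toy_times, toy_events, toy_risk):.3f}")
```

With no censoring, every weight equals 1 and the estimate reduces to the ordinary proportion of concordant pairs. For real analyses, a maintained implementation with proper variance estimation would be preferable to this sketch.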

Identifiers

pubmed: 36631766
doi: 10.1186/s12874-023-01837-4
pii: 10.1186/s12874-023-01837-4
pmc: PMC9832793

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

8

Comments and corrections

Type: ErratumIn

Copyright information

© 2023. The Author(s).

Authors

Gloria A Aguayo (GA)

Deep Digital Phenotyping Research Unit, Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg. gloria.aguayo@lih.lu.

Lu Zhang (L)

Bioinformatics Platform, Luxembourg Institute of Health, Strassen, Luxembourg.

Michel Vaillant (M)

Competence Center for Methodology and Statistics, Translational Medicine Operations Hub, Luxembourg Institute of Health, Strassen, Luxembourg.

Moses Ngari (M)

Competence Center for Methodology and Statistics, Translational Medicine Operations Hub, Luxembourg Institute of Health, Strassen, Luxembourg.
KEMRI/Wellcome Trust Research Programme, Kilifi, Kenya.

Magali Perquin (M)

Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg.

Valerie Moran (V)

Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg.
Living Conditions Department, Luxembourg Institute of Socio-Economic Research, Esch-Sur-Alzette, Luxembourg.

Laetitia Huiart (L)

Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg.

Rejko Krüger (R)

LCSB, Luxembourg Centre for System Biomedicine, University of Luxembourg, Esch-Sur-Alzette, Luxembourg.
Parkinson Research Clinic, Centre Hospitalier de Luxembourg, Luxembourg, Luxembourg.
Transversal Translational Medicine, Luxembourg Institute of Health, Strassen, Luxembourg.

Francisco Azuaje (F)

Bioinformatics Platform, Luxembourg Institute of Health, Strassen, Luxembourg.
Genomics England, London, UK.

Cyril Ferdynus (C)

Methodological Support Unit, Félix Guyon University Hospital Center, Saint-Denis, La Réunion, France.

Guy Fagherazzi (G)

Deep Digital Phenotyping Research Unit, Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg.
