Predicting the onset of Alzheimer's disease and related dementia using electronic health records: findings from the Cache County Study on Memory in Aging (1995-2008).
Alzheimer’s disease
Dementia
Diagnosis
Machine learning
Medical records
Prospective cohort
Journal
BMC Medical Informatics and Decision Making
ISSN: 1472-6947
Abbreviated title: BMC Med Inform Decis Mak
Country: England
NLM ID: 101088682
Publication information
Publication date:
28 Oct 2024
History:
received: 13 May 2024
accepted: 17 Oct 2024
medline: 29 Oct 2024
pubmed: 29 Oct 2024
entrez: 29 Oct 2024
Status:
epublish
Abstract
Clinical notes, biomarkers, and neuroimaging have proven valuable in dementia prediction models. Whether commonly available structured clinical data can predict dementia is an emerging area of research. We aimed to predict gold-standard, research-based diagnoses of dementia, including Alzheimer's disease (AD) and/or Alzheimer's disease-related dementias (ADRD), as well as ICD-based AD and/or ADRD diagnoses, in a well-phenotyped, population-based cohort using a machine learning approach. Administrative healthcare data (k = 163 diagnostic features) and census/vital record sociodemographic data (k = 6 features) were linked to the Cache County Study (CCS, 1995-2008). Among participants successfully linked between the Utah Population Database (UPDB) and the CCS (n = 4206), 522 (12.4%) had incident dementia (AD alone, AD comorbid with ADRD, or ADRD alone) according to the CCS "gold-standard" assessments. Random Forest models with a 1-year prediction window performed best, with an area under the curve (AUC) of 0.67. Performance declined for dementia subtypes: AD/ADRD (AUC = 0.65) and ADRD (AUC = 0.49). Performance improved when ICD-based dementia diagnoses were used as the outcome (AUC = 0.77). Commonly available structured clinical data (without labs, notes, or prescription information) demonstrate a modest ability to predict "gold-standard" research-based AD/ADRD diagnoses, consistent with prior research. Using ICD diagnostic codes to identify dementia, as is done in the majority of machine learning dementia prediction models, can yield higher accuracy than using "gold-standard" dementia diagnoses, but whether these models are predicting true dementia warrants further research.
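For readers unfamiliar with the AUC values reported in the abstract, the following is a minimal sketch of how an area under the ROC curve can be computed from predicted risk scores. It is not the authors' code or pipeline. The function name `roc_auc` and the toy labels and scores are made up for illustration; only the counts 522 and 4206 come from the abstract. The sketch uses the rank interpretation of AUC: the probability that a randomly chosen case receives a higher score than a randomly chosen non-case, with ties counted as one half.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outranks a random non-case.

    labels: 1 for a case (e.g. incident dementia), 0 for a non-case.
    scores: model-predicted risk, higher meaning more likely to be a case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above non-case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy, made-up data (NOT from the study): three cases, three non-cases.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]
auc = roc_auc(labels, scores)

# Proportion of incident dementia reported in the abstract: 522 of 4206.
incidence_pct = round(100 * 522 / 4206, 1)
```

On this scale, 0.5 corresponds to ranking cases and non-cases no better than chance and 1.0 to perfect separation, which helps place the reported values (0.49 to 0.77) in context.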
Identifiers
pubmed: 39468568
doi: 10.1186/s12911-024-02728-4
pii: 10.1186/s12911-024-02728-4
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
316
Grants
Agency: NIA NIH HHS
ID: K01AG058781
Country: United States
Agency: NIA NIH HHS
ID: AG-11380, AG-18712 and AG-031272
Country: United States
Agency: National Institute on Aging
ID: R01AG022095
Copyright information
© 2024. The Author(s).