Influence of medical domain knowledge on deep learning for Alzheimer's disease prediction.

Alzheimer's disease prediction; Cognitive impairment; Deep learning; Electronic medical records; Recurrent Neural Networks

Journal

Computer methods and programs in biomedicine
ISSN: 1872-7565
Abbreviated title: Comput Methods Programs Biomed
Country: Ireland
NLM ID: 8506513

Publication information

Publication date:
Dec 2020
History:
received: 2020-04-23
accepted: 2020-09-16
pubmed: 2020-10-05
medline: 2021-05-15
entrez: 2020-10-04
Status: ppublish

Abstract

BACKGROUND AND OBJECTIVE
Alzheimer's disease (AD) is the most common type of dementia and can seriously affect a person's ability to perform daily activities. Estimates indicate that AD may rank third as a cause of death for older people, after heart disease and cancer. Identifying individuals at risk of developing AD is imperative for testing therapeutic interventions. The objective of the study was to determine whether diagnosis of AD from electronic medical record (EMR) data alone (without relying on diagnostic imaging) could be significantly improved by applying clinical domain knowledge in data preprocessing and positive dataset selection rather than setting naïve filters.
METHODS
Data were extracted from a repository of heterogeneous ambulatory EMR data collected from primary care medical offices across the U.S. Medical domain knowledge was applied to build a positive dataset from data relevant to AD. Selected Clinically Relevant Positive (SCRP) datasets were used as inputs to a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) deep learning model to predict whether a patient will develop AD.
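For readers unfamiliar with LSTM cells, one commonly used formulation of a single time step is sketched below. The abstract does not state which LSTM variant, input encoding, or hyperparameters were used, so the symbols here are generic notation and not the authors' specification: $x_t$ is the input at step $t$, $h_t$ the hidden state, $c_t$ the cell state, $\sigma$ the logistic sigmoid, $\odot$ element-wise multiplication, and $W$, $U$, $b$ learned parameters.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

In a typical design of this kind, $x_t$ would encode one event or visit in a patient's record and a final layer would map the last hidden state to a risk score. Both points are assumptions about common practice, not details taken from the study.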
RESULTS
Risk score prediction of AD using drug domain information in an SCRP AD dataset of 2,324 patients achieved a high out-of-sample score, 0.98-0.99 Area Under the Precision-Recall Curve (AUPRC), when 90% of the SCRP dataset was used for training. AUPRC dropped to 0.89 when the model was trained on fewer than 1,500 cases from the SCRP dataset, which was still significantly better than with naïve dataset selection.
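The abstract reports AUPRC but does not say how it was computed. As a rough illustration only, the Python sketch below implements one common step-wise estimator of the area under the precision-recall curve (average precision). The function name `average_precision` and the toy labels and scores are assumptions made for this example; they are not the study's code or data.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    labels: 1 for a positive case, 0 for a negative case.
    scores: predicted risk scores; higher means more likely positive.
    Tied scores are not treated specially in this sketch.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")

    # Rank cases from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_positives = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            # Precision at the cutoff that just admits this positive case.
            precision_sum += true_positives / rank
    return precision_sum / total_positives


# Toy data only; these are not values from the study.
toy_labels = [1, 0, 1, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.2]
print(f"toy AUPRC: {average_precision(toy_labels, toy_scores):.3f}")
```

Unlike accuracy, this metric depends on how well positives are ranked ahead of negatives, which is why it is often preferred when positive cases are rare.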
CONCLUSIONS
The LSTM RNN method performed significantly better when learning from the SCRP dataset of AD-relevant data than when datasets were selected naïvely. Integrating qualitative medical knowledge for dataset selection with deep learning provided a mechanism for significantly improving AD prediction. Accurate and early prediction of AD is important for identifying patients for clinical trials, which could lead to the discovery of new drugs for treating AD. The proposed predictions also support better selection of patients who need imaging diagnostics to differentiate AD from other degenerative brain disorders.

Identifiers

pubmed: 33011665
pii: S0169-2607(20)31598-4
doi: 10.1016/j.cmpb.2020.105765
pmc: PMC7502243

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

105765

Copyright information

Copyright © 2020. Published by Elsevier B.V.

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Branimir Ljubic (B)

Center for Data Analytics and Biomedical Informatics (DABI), Temple University, 1925 N 12th Street, SERC 035-02, Philadelphia, PA 19122, USA.

Shoumik Roychoudhury (S)

Center for Data Analytics and Biomedical Informatics (DABI), Temple University, 1925 N 12th Street, SERC 035-02, Philadelphia, PA 19122, USA.

Xi Hang Cao (XH)

Center for Data Analytics and Biomedical Informatics (DABI), Temple University, 1925 N 12th Street, SERC 035-02, Philadelphia, PA 19122, USA.

Martin Pavlovski (M)

Center for Data Analytics and Biomedical Informatics (DABI), Temple University, 1925 N 12th Street, SERC 035-02, Philadelphia, PA 19122, USA.

Stefan Obradovic (S)

Department of Computer Science, Brendan Iribe Center for Computer Science and Engineering, University of Maryland, 8125 Paint Branch Drive, College Park, MD 20742, USA.

Richard Nair (R)

IQVIA, Plymouth Meeting, PA 19462, USA.

Lucas Glass (L)

IQVIA, Plymouth Meeting, PA 19462, USA.

Zoran Obradovic (Z)

Center for Data Analytics and Biomedical Informatics (DABI), Temple University, 1925 N 12th Street, SERC 035-02, Philadelphia, PA 19122, USA. Electronic address: zoran.obradovic@temple.edu.
