Identifying undetected dementia in UK primary care patients: a retrospective case-control study comparing machine-learning and standard epidemiological approaches.


Journal

BMC medical informatics and decision making
ISSN: 1472-6947
Abbreviated title: BMC Med Inform Decis Mak
Country: England
NLM ID: 101088682

Publication information

Publication date:
2 December 2019
History:
received: 5 June 2019
accepted: 21 November 2019
entrez: 4 December 2019
pubmed: 4 December 2019
medline: 14 April 2020
Status: epublish

Abstract sections

BACKGROUND
Identifying dementia early using real-world data is a public health challenge. Only two-thirds of people with dementia ultimately receive a formal diagnosis in United Kingdom health systems, and many receive it late in the disease process, so there is ample room for improvement. The policy of the UK government and National Health Service (NHS) is to increase rates of timely dementia diagnosis. We used data from general practice (GP) patient records to create a machine-learning model to identify patients who have or are developing dementia but whose condition has not yet been detected by their GP.
METHODS
We used electronic patient records from the Clinical Practice Research Datalink (CPRD). Using a case-control design, we selected patients aged over 65 years with a diagnosis of dementia (cases) and matched them 1:1 by sex and age to patients with no evidence of dementia (controls). We developed a list of 70 clinical entities related to the onset of dementia that were recorded in the 5 years before diagnosis. After creating binary features, we trialled machine-learning classifiers to discriminate between cases and controls (logistic regression, naïve Bayes, support vector machines, random forest and neural networks). We examined the most important features contributing to discrimination.
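The matching and feature-construction steps described in the methods can be illustrated with a minimal sketch. It is not the authors' code: the record layout (`id`, `sex`, `age`), the greedy exact-match strategy, the 5 x 365-day approximation of the look-back window, and all patient data below are hypothetical assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import date, timedelta

# Assumption: the 5-year look-back is approximated as 5 * 365 days.
LOOKBACK_DAYS = 5 * 365


def match_one_to_one(cases, pool):
    """Greedy exact 1:1 matching on (sex, age in whole years).

    `cases` and `pool` are lists of dicts with hypothetical keys
    'id', 'sex' and 'age'. Each control is used at most once.
    """
    available = defaultdict(list)
    for person in pool:
        available[(person["sex"], person["age"])].append(person["id"])

    pairs = []
    for case in cases:
        candidates = available[(case["sex"], case["age"])]
        if candidates:  # cases without an exact match stay unmatched
            pairs.append((case["id"], candidates.pop(0)))
    return pairs


def binary_features(events, index_date, entities):
    """Return a 0/1 vector, one entry per clinical entity.

    An entry is 1 if the entity was recorded at least once in the
    look-back window that ends just before `index_date`.
    """
    start = index_date - timedelta(days=LOOKBACK_DAYS)
    seen = {name for name, when in events if start <= when < index_date}
    return [1 if name in seen else 0 for name in entities]


# Hypothetical toy data, not taken from the study.
cases = [{"id": "c1", "sex": "F", "age": 82},
         {"id": "c2", "sex": "M", "age": 75}]
pool = [{"id": "p1", "sex": "F", "age": 82},
        {"id": "p2", "sex": "M", "age": 70},
        {"id": "p3", "sex": "F", "age": 82}]
pairs = match_one_to_one(cases, pool)

entities = ["behaviour change", "self-neglect", "schizophrenia"]
events = [("behaviour change", date(2015, 3, 1)),  # inside the window
          ("self-neglect", date(2009, 1, 1)),      # too early
          ("schizophrenia", date(2016, 6, 1))]     # after the index date
vector = binary_features(events, date(2016, 1, 1), entities)

print(pairs)
print(vector)
```

In this sketch the index date of a case is the diagnosis date. Which date a control would use (for example, the diagnosis date of its matched case) is a further assumption that the abstract does not specify.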
RESULTS
The final analysis included data on 93,120 patients, with a median age of 82.6 years; 64.8% were female. The naïve Bayes model performed least well. The logistic regression, support vector machine, neural network and random forest models performed very similarly, with an area under the receiver operating characteristic curve (AUROC) of 0.74. The top features retained in the logistic regression model were disorientation and wandering, behaviour change, schizophrenia, self-neglect, and difficulty managing.
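To make the reported metric concrete, the following sketch computes an AUROC as the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with ties counted as one half. The function name `auroc` and the labels and scores are hypothetical and are not study data; the pairwise loop is chosen for clarity and would be too slow for a cohort of this size, where a rank-based computation would be preferable.

```python
def auroc(labels, scores):
    """Pairwise AUROC: P(score of a case > score of a control).

    `labels` holds 1 for cases and 0 for controls; `scores` holds the
    model output for each patient. Ties contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for three cases and three controls.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
value = auroc(labels, scores)
print(round(value, 3))
```

Under this reading, an AUROC of 0.74 means that in roughly three out of four case-control pairs the model ranks the case above the control; 0.5 corresponds to chance and 1.0 to perfect separation.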
CONCLUSIONS
Our model could aid GPs or health service planners with the early detection of dementia. Future work could improve the model by exploring the longitudinal nature of patient data and modelling decline in function over time.

Identifiers

pubmed: 31791325
doi: 10.1186/s12911-019-0991-9
pii: 10.1186/s12911-019-0991-9
pmc: PMC6889642

Publication types

Comparative Study; Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

248

Grants

Agency: Wellcome Trust (GB)
ID: 202133/Z/16/Z
Country: International

Authors

Elizabeth Ford (E)

Department of Primary Care and Public Health, Brighton and Sussex Medical School, Watson Building, Village Way, Falmer, Brighton, BN1 9PH, England. e.m.ford@bsms.ac.uk.

Philip Rooney (P)

Department of Physics and Astronomy, University of Sussex, Brighton, BN1 9RQ, England.

Seb Oliver (S)

Department of Physics and Astronomy, University of Sussex, Brighton, BN1 9RQ, England.

Richard Hoile (R)

Department of Primary Care and Public Health, Brighton and Sussex Medical School, Watson Building, Village Way, Falmer, Brighton, BN1 9PH, England.

Peter Hurley (P)

Department of Physics and Astronomy, University of Sussex, Brighton, BN1 9RQ, England.

Sube Banerjee (S)

Faculty of Health, University of Plymouth, Plymouth, PL4 8AA, England.

Harm van Marwijk (H)

Department of Primary Care and Public Health, Brighton and Sussex Medical School, Watson Building, Village Way, Falmer, Brighton, BN1 9PH, England.

Jackie Cassell (J)

Department of Primary Care and Public Health, Brighton and Sussex Medical School, Watson Building, Village Way, Falmer, Brighton, BN1 9PH, England.
