Prediction of progression from pre-diabetes to diabetes: Development and validation of a machine learning model.
Adult
Aged
Aged, 80 and over
Canada / epidemiology
Cohort Studies
Databases, Factual
Diabetes Mellitus / diagnosis
Disease Progression
Electronic Health Records / statistics & numerical data
Female
Follow-Up Studies
Humans
Israel / epidemiology
Machine Learning
Male
Middle Aged
Patient Selection
Prediabetic State / physiopathology
Prognosis
Risk Assessment / methods
Risk Factors
Time Factors
United Kingdom / epidemiology
electronic medical records
machine learning
pre-diabetes
Journal
Diabetes/metabolism research and reviews
ISSN: 1520-7560
Abbreviated title: Diabetes Metab Res Rev
Country: England
NLM ID: 100883450
Publication information
Publication date: February 2020
History:
received: 24 July 2019
revised: 17 November 2019
accepted: 19 November 2019
pubmed: 17 January 2020
medline: 2 December 2020
entrez: 17 January 2020
Status:
ppublish
Abstract
Identifying in advance those at high risk of progression from pre-diabetes to diabetes may enable targeted delivery of interventional programmes while sparing those at low risk the burden of prevention and treatment. We studied whether a machine-learning model can improve the prediction of incident diabetes using patient data from electronic medical records. A gradient boosted trees model was developed to predict progression from pre-diabetes to diabetes. The model was trained on data from The Health Improvement Network (THIN) database cohort, internally validated on THIN data not used for training, and externally validated on the Canadian AppleTree and the Israeli Maccabi Health Services (MHS) data sets. Its predictive ability was compared with that of a logistic-regression model within each data set. A cohort of 852 454 individuals with pre-diabetes (glucose ≥ 100 mg/dL and/or HbA1c ≥ 5.7%) was used for model training, comprising 4.9 million time points and 900 features. The full model was eventually implemented using 69 variables generated from 11 basic signals. The machine-learning model outperformed the logistic-regression model at all sensitivity levels. AUC [95% CI] for the machine-learning vs logistic-regression model was 0.865 [0.860, 0.869] vs 0.778 [0.773, 0.784] (P < .05) in the THIN data set, 0.907 [0.896, 0.919] vs 0.880 [0.867, 0.894] (P < .05) in the AppleTree data set, and 0.925 [0.923, 0.927] vs 0.876 [0.872, 0.879] (P < .05) in the MHS data set. Machine-learning models preserve their performance across populations in diabetes prediction and can be integrated into large clinical systems, leading to judicious selection of persons for interventional programmes.
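The cohort entry threshold and the AUC metric reported in the abstract can be illustrated with a minimal sketch. The function names, the handling of missing measurements, the assumption that HbA1c is expressed in percent, and the toy labels and scores are all hypothetical and chosen only for illustration; they are not taken from the study, whose model was a gradient boosted trees model trained on electronic medical record data.

```python
from typing import Optional, Sequence


def is_prediabetic(glucose_mg_dl: Optional[float], hba1c_pct: Optional[float]) -> bool:
    """Entry threshold quoted in the abstract: glucose >= 100 mg/dL and/or HbA1c >= 5.7.

    Assumptions: HbA1c is in percent, and a missing measurement (None) does not
    satisfy its threshold. Upper bounds separating pre-diabetes from diabetes
    are not given in the abstract and are not modelled here.
    """
    glucose_hit = glucose_mg_dl is not None and glucose_mg_dl >= 100.0
    hba1c_hit = hba1c_pct is not None and hba1c_pct >= 5.7
    return glucose_hit or hba1c_hit


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical): 1 = progressed to diabetes, 0 = did not.
labels = [0, 0, 1, 0, 1, 1]
model_a_scores = [0.10, 0.20, 0.80, 0.30, 0.70, 0.90]  # ranks every positive above every negative
model_b_scores = [0.10, 0.60, 0.80, 0.30, 0.50, 0.90]  # ranks one positive below one negative

print(is_prediabetic(105.0, None))
print(auc(labels, model_a_scores))
print(auc(labels, model_b_scores))
```

Comparing two models on the same labels in this way mirrors the within-data-set comparison described above, although the study also reported confidence intervals and P values, which this sketch does not compute.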
Publication types
Journal Article
Validation Study
Languages
eng
Citation subsets
IM
Pagination
e3252
Copyright information
© 2020 John Wiley & Sons, Ltd.