Can Hyperparameter Tuning Improve the Performance of a Super Learner?: A Case Study.


Journal

Epidemiology (Cambridge, Mass.)
ISSN: 1531-5487
Abbreviated title: Epidemiology
Country: United States
NLM ID: 9009644

Publication information

Publication date:
July 2019
History:
pubmed: 16 April 2019
medline: 18 July 2020
entrez: 16 April 2019
Status: ppublish

Abstract

Super learning is an ensemble machine learning approach used increasingly as an alternative to classical prediction techniques. When implementing super learning, however, failing to tune the hyperparameters of its component algorithms may adversely affect the performance of the super learner. In this case study, we used data from a Canadian electronic prescribing system to predict when primary care physicians prescribed antidepressants for indications other than depression. The analysis included 73,576 antidepressant prescriptions and 373 candidate predictors. We derived two super learners: one using tuned hyperparameter values for each machine learning algorithm, identified through an iterative grid search procedure, and the other using the default values. We compared the performance of the tuned super learner to that of the super learner using default values ("untuned") and a carefully constructed logistic regression model from a previous analysis. The tuned super learner had a scaled Brier score (R²) of 0.322 (95% confidence interval [CI] = 0.267, 0.362). In comparison, the untuned super learner had a scaled Brier score of 0.309 (95% CI = 0.256, 0.353), corresponding to an efficiency loss of 4% (relative efficiency 0.96; 95% CI = 0.93, 0.99). The previously derived logistic regression model had a scaled Brier score of 0.307 (95% CI = 0.245, 0.360), corresponding to an efficiency loss of 5% relative to the tuned super learner (relative efficiency 0.95; 95% CI = 0.88, 1.01). In this case study, hyperparameter tuning produced a super learner that performed slightly better than an untuned super learner. Tuning the hyperparameters of individual algorithms in a super learner may help optimize performance.
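
The reported efficiency figures can be checked with a short calculation. The sketch below assumes the conventional definition of the scaled Brier score (one minus the ratio of the model's Brier score to that of a null model predicting the outcome prevalence for everyone) and assumes that relative efficiency is the ratio of two scaled Brier scores; the abstract does not state either definition, and the function names and toy data are illustrative only.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def scaled_brier_score(y_true, y_prob):
    """Assumed definition: 1 - Brier(model) / Brier(null model).

    The null model predicts the observed outcome prevalence for every record.
    """
    prevalence = sum(y_true) / len(y_true)
    brier_null = brier_score(y_true, [prevalence] * len(y_true))
    return 1.0 - brier_score(y_true, y_prob) / brier_null


def relative_efficiency(scaled_brier_candidate, scaled_brier_reference):
    """Assumed definition: ratio of the candidate's to the reference's scaled Brier score."""
    return scaled_brier_candidate / scaled_brier_reference


# Toy data (hypothetical, not from the study) to exercise the scoring functions.
toy_outcomes = [1, 0, 1, 0]
toy_predictions = [0.8, 0.2, 0.6, 0.4]
toy_scaled_brier = scaled_brier_score(toy_outcomes, toy_predictions)

# Point estimates reported in the abstract, with the tuned super learner as reference.
untuned_vs_tuned = relative_efficiency(0.309, 0.322)
logistic_vs_tuned = relative_efficiency(0.307, 0.322)

print(round(untuned_vs_tuned, 2))   # 0.96
print(round(logistic_vs_tuned, 2))  # 0.95
```

Under these assumptions the ratios reproduce the reported relative efficiencies of 0.96 and 0.95, that is, efficiency losses of about 4% and 5%. The confidence intervals in the abstract cannot be recovered from the point estimates alone.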

Identifiers

pubmed: 30985529
doi: 10.1097/EDE.0000000000001027
pmc: PMC6553550

Chemical substances

Antidepressive Agents 0

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

521-531

Authors

Jenna Wong (J)

Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Canada.

Travis Manderson (T)

School of Computer Science, McGill University, Montreal, Canada.

Michal Abrahamowicz (M)

Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Canada.

David L Buckeridge (DL)

Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Canada.

Robyn Tamblyn (R)

Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Canada.
