A Bayesian learning model to predict the risk for cannabis use disorder.
Bayesian methods
Cannabis use disorder
Machine learning
Model validation
Prediction model
Journal
Drug and alcohol dependence
ISSN: 1879-0046
Abbreviated title: Drug Alcohol Depend
Country: Ireland
NLM ID: 7513587
Publication information
Publication date:
01 07 2022
History:
received: 28 12 2021
revised: 19 04 2022
accepted: 23 04 2022
pubmed: 20 05 2022
medline: 18 06 2022
entrez: 19 05 2022
Status:
ppublish
Abstract
Abstract sections
BACKGROUND
The prevalence of cannabis use disorder (CUD) has been increasing recently and is expected to increase further due to the rising trend of cannabis legalization. To help stem this public health concern, a model is needed that predicts for an adolescent or young adult cannabis user their personalized risk of developing CUD in adulthood. However, there exists no such model that is built using nationally representative longitudinal data.
METHODS
We use a novel Bayesian learning approach and data from Add Health (n = 8712), a nationally representative longitudinal study, to build logistic regression models using four different regularization priors: lasso, ridge, horseshoe, and t. The models are compared by their prediction performance on unseen data via 5-fold cross-validation (CV). We assess model discrimination using the area under the receiver operating characteristic curve (AUC) and calibration by comparing the expected (E) and observed (O) numbers of CUD cases. We also externally validate the final model on independent test data from Add Health (n = 570).
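The two evaluation measures named above can be illustrated with a minimal sketch. This is not the authors' code: the function names `auc`, `expected_observed_ratio`, and `k_fold_indices` are hypothetical, the fold assignment is a simple shuffled split (the record does not say whether the original folds were stratified), and the toy data are invented purely for illustration. The sketch assumes E is the sum of predicted risks and O is the count of observed cases.

```python
import random


def auc(y_true, y_score):
    """Rank-based AUC: the probability that a randomly chosen case
    receives a higher score than a randomly chosen non-case (ties count 1/2)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def expected_observed_ratio(y_true, y_prob):
    """E/O: sum of predicted risks divided by the number of observed cases.
    Values near 1 indicate good calibration-in-the-large."""
    observed = sum(y_true)
    if observed == 0:
        raise ValueError("E/O is undefined when no cases are observed")
    return sum(y_prob) / observed


def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds
    (assumption: plain random folds, no stratification)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


if __name__ == "__main__":
    # Invented toy outcomes and predicted risks, for illustration only.
    y = [0, 0, 1, 1, 0, 1]
    p = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
    print("AUC:", round(auc(y, p), 3))
    print("E/O:", round(expected_observed_ratio(y, p), 3))
    print("folds:", k_fold_indices(10, k=5))
```

In a cross-validation setting, each fold would be held out in turn, the model fitted on the remaining folds, and both measures computed on the held-out predictions.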
RESULTS
Our final model is based on the lasso prior and has seven predictors: biological sex; scores on personality traits of neuroticism, openness, and conscientiousness; and measures of adverse childhood experiences, delinquency, and peer cannabis use. It has good discrimination and calibration performance, as reflected by an AUC and E/O of 0.69 and 0.95, respectively, in 5-fold CV and 0.71 and 1.10 on the validation data.
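For orientation, a logistic regression with a lasso-type prior is commonly written as below. This is a generic sketch, not the paper's exact specification: it assumes each of the seven predictors enters as a single term, uses the standard Laplace (double-exponential) form of the Bayesian lasso prior, and introduces the symbols \(\beta_j\), \(\lambda\), and \(\hat{p}_i\) for illustration. The abstract does not state how the shrinkage parameter was chosen.

```latex
\operatorname{logit}\,\Pr(Y_i = 1 \mid x_i) = \beta_0 + \sum_{j=1}^{7} \beta_j x_{ij},
\qquad
p(\beta_j \mid \lambda) = \frac{\lambda}{2}\exp\!\left(-\lambda\,\lvert \beta_j \rvert\right),
\quad j = 1, \dots, 7,

\frac{E}{O} = \frac{\sum_{i} \hat{p}_i}{\sum_{i} y_i}.
```

Under this definition, an E/O of 0.95 means the model expects about 5% fewer CUD cases than were observed, and an E/O of 1.10 means it expects about 10% more.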
CONCLUSION
This externally validated model may help in identifying adolescent or young adult cannabis users at high risk of developing CUD in adulthood.
Identifiers
pubmed: 35588608
pii: S0376-8716(22)00213-7
doi: 10.1016/j.drugalcdep.2022.109476
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
109476
Copyright information
Copyright © 2022 Elsevier B.V. All rights reserved.