Use of electronic health record data and machine learning to identify candidates for HIV pre-exposure prophylaxis: a modelling study.
Journal
The lancet. HIV
ISSN: 2352-3018
Abbreviated title: Lancet HIV
Country: Netherlands
NLM ID: 101645355
Publication information
Publication date: Oct 2019
History:
received: 22 Jan 2019
revised: 28 Mar 2019
accepted: 11 Apr 2019
pubmed: 10 Jul 2019
medline: 10 Jun 2020
entrez: 10 Jul 2019
Status: ppublish
Abstract
Abstract sections
BACKGROUND
The limitations of existing HIV risk prediction tools are a barrier to implementation of pre-exposure prophylaxis (PrEP). We developed and validated an HIV prediction model to identify potential PrEP candidates in a large health-care system.
METHODS
Our study population was HIV-uninfected adult members of Kaiser Permanente Northern California, a large integrated health-care system, who were not yet using PrEP and had at least 2 years of previous health plan enrolment with at least one outpatient visit from Jan 1, 2007, to Dec 31, 2017. Using 81 electronic health record (EHR) variables, we applied least absolute shrinkage and selection operator (LASSO) regression to predict incident HIV diagnosis within 3 years on a subset of patients who entered the cohort in 2007-14 (development dataset), assessing ten-fold cross-validated area under the receiver operating characteristic curve (AUC) and 95% CIs. We compared the full model to simpler models including only men who have sex with men (MSM) status and sexually transmitted infection (STI) positivity, testing, and treatment. Models were validated prospectively with data from an independent set of patients who entered the cohort in 2015-17. We computed predicted probabilities of incident HIV diagnosis within 3 years (risk scores), categorised as low risk (<0·05%), moderate risk (0·05% to <0·20%), high risk (0·20% to <1·0%), and very high risk (≥1·0%), for all patients in the validation dataset.
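The risk-score categories described above can be sketched as a small lookup function. This is a minimal illustration, not the authors' implementation: the names `risk_category` and `is_flagged` are hypothetical, and the sketch assumes the predicted probability is passed as a fraction (0.002 for 0·20%) rather than as a percentage. Only the cut-offs come from the abstract.

```python
def risk_category(probability: float) -> str:
    """Map a predicted 3-year probability of incident HIV diagnosis
    (a fraction between 0 and 1) to the categories named in the abstract."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < 0.0005:   # below 0.05%
        return "low"
    if probability < 0.0020:   # 0.05% to below 0.20%
        return "moderate"
    if probability < 0.0100:   # 0.20% to below 1.0%
        return "high"
    return "very high"         # 1.0% or above


def is_flagged(probability: float) -> bool:
    """Hypothetical helper: True for the high and very high categories,
    the group the abstract describes as flagged."""
    return risk_category(probability) in {"high", "very high"}
```

Each lower bound is inclusive and each upper bound exclusive, matching the "0·05% to <0·20%" wording, so a score of exactly 0·20% falls in the high category.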
FINDINGS
Of 3 750 664 patients in 2007-17 (3 143 963 in the development dataset and 606 701 in the validation dataset), there were 784 incident HIV cases within 3 years of baseline. The LASSO procedure retained 44 predictors in the full model, with an AUC of 0·86 (95% CI 0·85-0·87) for incident HIV cases in 2007-14. Model performance remained high in the validation dataset (AUC 0·84, 0·80-0·89). The full model outperformed simpler models including only MSM status and STI positivity. For the full model, flagging 13 463 (2·2%) patients with high or very high HIV risk scores in the validation dataset identified 32 (38·6%) of the 83 incident HIV cases, including 32 (46·4%) of 69 male cases and none of the 14 female cases. The full model had equivalent sensitivity by race whereas simpler models identified fewer black than white HIV cases.
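The percentages reported above follow directly from the stated counts. The short check below recomputes them; the variable names and the `percent` helper are illustrative assumptions, while every count is taken from the abstract.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


# Counts as reported in the abstract
development_patients = 3_143_963
validation_patients = 606_701
total_patients = 3_750_664
flagged_patients = 13_463      # high or very high risk scores, validation dataset
incident_cases = 83            # incident HIV cases, validation dataset
identified_cases = 32          # incident cases among flagged patients
male_cases = 69
female_cases = 14

# Internal consistency of the reported totals
assert development_patients + validation_patients == total_patients
assert male_cases + female_cases == incident_cases

flagged_share = percent(flagged_patients, validation_patients)   # share of patients flagged
overall_sensitivity = percent(identified_cases, incident_cases)  # share of all cases identified
male_sensitivity = percent(identified_cases, male_cases)         # all identified cases were male
female_sensitivity = percent(0, female_cases)                    # no female cases identified

print(f"flagged: {flagged_share:.1f}%")
print(f"sensitivity overall: {overall_sensitivity:.1f}%")
print(f"sensitivity in men: {male_sensitivity:.1f}%")
print(f"sensitivity in women: {female_sensitivity:.1f}%")
```

Because all 32 identified cases were men, the male sensitivity uses the same numerator as the overall figure with a smaller denominator.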
INTERPRETATION
Prediction models using EHR data can identify patients at high risk of HIV acquisition who could benefit from PrEP. Future studies should optimise EHR-based HIV risk prediction tools and evaluate their effect on prescription of PrEP.
FUNDING
Kaiser Permanente Community Benefit Research Program and the US National Institutes of Health.
Identifiers
pubmed: 31285183
pii: S2352-3018(19)30137-7
doi: 10.1016/S2352-3018(19)30137-7
pmc: PMC7152802
mid: NIHMS1567249
Chemical substances
Anti-HIV Agents
0
Publication types
Evaluation Study
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
e688-e695
Grants
Agency: NICHD NIH HHS
ID: DP2 HD084070
Country: United States
Agency: NIAID NIH HHS
ID: K01 AI122853
Country: United States
Agency: NIMH NIH HHS
ID: K23 MH098795
Country: United States
Agency: NIAID NIH HHS
ID: P30 AI060354
Country: United States
Comments and corrections
Type: CommentIn
Copyright information
Copyright © 2019 Elsevier Ltd. All rights reserved.