A Machine Learning Model to Successfully Predict Future Diagnosis of Chronic Myelogenous Leukemia With Retrospective Electronic Health Records Data.
Chronic myelogenous leukemia
Decision support techniques
Decision trees
Logistic regression
Machine learning
Prediction model studies
Predictions and projections
Statistical data analyses
Journal
American journal of clinical pathology
ISSN: 1943-7722
Abbreviated title: Am J Clin Pathol
Country: England
NLM ID: 0370470
Publication information
Publication date:
08 Nov 2021
History:
pubmed: 30 Jun 2021
medline: 07 Jan 2022
entrez: 29 Jun 2021
Status:
ppublish
Abstract
Chronic myelogenous leukemia (CML) is a clonal stem cell disorder accounting for 15% of adult leukemias. We aimed to determine if machine learning models could predict CML using blood cell counts prior to diagnosis. We identified patients with a diagnostic test for CML (BCR-ABL1) and at least 6 consecutive prior years of differential blood cell counts between 1999 and 2020 in the largest integrated health care system in the United States. Blood cell counts from different time periods prior to CML diagnostic testing were used to train, validate, and test machine learning models. The sample included 1,623 patients with a BCR-ABL1 positivity rate of 6.2%. The predictive ability of machine learning models improved when trained with blood cell counts closer to the time of diagnosis: 2 to 5 years before diagnosis, area under the curve (AUC) 0.59 to 0.67; 0.5 to 1 year before diagnosis, AUC 0.75 to 0.80; at diagnosis, AUC 0.87 to 0.92. Blood cell counts collected up to 5 years prior to diagnostic workup of CML successfully predicted the BCR-ABL1 test result. These findings suggest a machine learning model trained with blood cell counts could lead to diagnosis of CML earlier in the disease course compared to usual medical care.
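The abstract reports model performance as AUC. As a reading aid, the sketch below shows one common way to compute that metric: the fraction of (positive, negative) patient pairs in which the positive patient receives the higher score, with ties counted as half. It is not the authors' pipeline. The function name `roc_auc`, the labels, and the scores are all hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Return the area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = positive test result).
    scores: iterable of model scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
labels = [0, 0, 1, 0, 1, 0, 0, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05, 0.50, 0.70]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported ranges (0.59 to 0.67 at 2 to 5 years, rising to 0.87 to 0.92 at diagnosis) should be read.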
Identifiers
pubmed: 34184028
pii: 6310949
doi: 10.1093/ajcp/aqab086
Chemical substances
Fusion Proteins, bcr-abl
EC 2.7.10.2
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
1142-1148
Copyright information
© American Society for Clinical Pathology, 2021.