Using the Textual Content of Radiological Reports to Detect Emerging Diseases: A Proof-of-Concept Study of COVID-19.
Computed tomography
Coronavirus disease 2019
Natural language processing
Radiological reports
SARS-CoV-2
Time series analysis
Unsupervised clustering
Journal
Journal of imaging informatics in medicine
ISSN: 2948-2933
Abbreviated title: J Imaging Inform Med
Country: Switzerland
NLM ID: 9918663679206676
Publication information
Publication date:
12 Jan 2024
History:
received: 15/08/2023
accepted: 04/10/2023
revised: 02/10/2023
medline: 12/2/2024
pubmed: 12/2/2024
entrez: 12/2/2024
Status:
aheadofprint
Abstract
Changes in the content of radiological reports at the population level could help detect emerging diseases. Herein, we developed a method to quantify similarities in consecutive temporal groupings of radiological reports using natural language processing, and we investigated whether the appearance of dissimilarities between consecutive periods correlated with the beginning of the COVID-19 pandemic in France. CT reports from 67,368 consecutive adults across 62 emergency departments throughout France between October 2019 and March 2020 were collected. Reports were vectorized using term frequency-inverse document frequency (TF-IDF) analysis on one-grams. For each successive 2-week period, we performed unsupervised clustering of the reports based on TF-IDF values and partition-around-medoids. Next, we assessed the similarities between this clustering and a clustering from two weeks before according to the average adjusted Rand index (AARI). Statistical analyses included (1) cross-correlation functions (CCFs) with the number of positive SARS-CoV-2 tests and the advanced sanitary index for flu syndromes (ASI-flu, from an open-source dataset), and (2) linear regressions of time series at different lags to understand the variations of AARI over time. Overall, 13,235 chest CT reports were analyzed. AARI was correlated with ASI-flu at lag = +1, +5, and +6 weeks (P = 0.0454, 0.0121, and 0.0042, respectively) and with SARS-CoV-2-positive tests at lag = -1 and 0 weeks (P = 0.0057 and 0.0001, respectively). In the best fit, AARI correlated with the ASI-flu with a lag of 2 weeks (P = 0.0026), SARS-CoV-2-positive tests in the same week (P < 0.0001) and their interaction (P < 0.0001) (adjusted R
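The two building blocks named in the abstract, TF-IDF weighting of one-grams and the adjusted Rand index (Hubert and Arabie, 1985), can be illustrated with a minimal standard-library Python sketch. This is not the authors' implementation: the function names `tfidf_unigrams` and `adjusted_rand_index`, the whitespace tokenizer, and the `tf * log(N / df)` weighting variant are assumptions made for illustration. The partition-around-medoids step is omitted, and the sketch simply assumes that two cluster labelings of the same set of reports are already available, since the abstract does not state how clusterings from different periods are aligned.

```python
import math
from collections import Counter


def tfidf_unigrams(reports):
    """Return one {term: weight} dict per report, with weight = tf * log(N / df).

    Assumption: lowercase whitespace tokenization, no stop-word removal.
    """
    tokenized = [report.lower().split() for report in reports]
    n_docs = len(tokenized)
    # Document frequency: number of reports containing each term at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to compare
    # Pair counts from the contingency table and its row/column margins.
    sum_cells = sum(math.comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_rows = sum(math.comb(c, 2) for c in Counter(labels_a).values())
    sum_cols = sum(math.comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_rows * sum_cols / math.comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. every item in its own cluster in both partitions
    return (sum_cells - expected) / (max_index - expected)
```

An index of 1 means the two partitions agree exactly (up to relabeling), while values near 0 indicate chance-level agreement. In the setting described above, a drop in this index between consecutive periods would signal that the reports are grouping differently than before.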
Identifiers
pubmed: 38343242
doi: 10.1007/s10278-023-00949-z
pii: 10.1007/s10278-023-00949-z
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Copyright information
© 2024. The Author(s).