AssistMED project: Transforming cardiology cohort characterisation from electronic health records through natural language processing - Algorithm design, preliminary results, and field prospects.

Cardiology; Epidemiology; NLP; Natural language processing; Text-mining

Journal

International journal of medical informatics
ISSN: 1872-8243
Abbreviated title: Int J Med Inform
Country: Ireland
NLM ID: 9711057

Publication information

Date of publication:
19 Feb 2024
History:
received: 12 10 2023
revised: 15 02 2024
accepted: 16 02 2024
medline: 7 3 2024
pubmed: 7 3 2024
entrez: 6 3 2024
Status: aheadofprint

Abstract

Electronic health records (EHRs) are of great value for clinical research. However, EHRs consist primarily of unstructured text, which must be analysed by a human and coded into a database before data analysis, a time-consuming and costly process that limits research efficiency. Natural language processing (NLP) can facilitate data retrieval from unstructured text. During the AssistMED project, we developed a practical NLP tool, tailored to clinical researchers' needs, that automatically provides comprehensive clinical characteristics of patients from EHRs. AssistMED retrieves patient characteristics regarding clinical conditions, medications with dosage, and echocardiographic parameters in a clinically oriented data structure and provides researcher-friendly database output. We validate the algorithm's performance against manual data retrieval and provide a critical quantitative and qualitative analysis. AssistMED analysed the presence of 56 clinical conditions, medications from 16 drug groups with dosage, and 15 numeric echocardiographic parameters in a sample of 400 patients hospitalized in the cardiology unit. No statistically significant differences between algorithm and human retrieval were noted. Qualitative analysis revealed that disagreements with manual annotation were primarily attributable to random algorithm errors, erroneous human annotation, and the lack of advanced context awareness in our tool. Current NLP approaches can feasibly acquire accurate and detailed patient characteristics, tailored to clinical researchers' needs, from EHRs. We present an in-depth description of the algorithm development and validation process, discuss obstacles, and pinpoint potential solutions, including opportunities arising from recent advancements in the field of NLP, such as large language models.

Identifiers

pubmed: 38447318
pii: S1386-5056(24)00043-1
doi: 10.1016/j.ijmedinf.2024.105380

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

105380

Copyright information

Copyright © 2024 Elsevier B.V. All rights reserved.

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Cezary Maciejewski (C)

1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland; Doctoral School, Medical University of Warsaw, 02-091 Warszawa, Poland; Department of Medical Informatics and Telemedicine, Medical University of Warsaw, 02-091 Warszawa, Poland.

Krzysztof Ozierański (K)

1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland. Electronic address: krzysztof.ozieranski@wum.edu.pl.

Adam Barwiołek (A)

Codifive sp. z o.o., Lindleya 16, 02-013 Warszawa, Poland.

Mikołaj Basza (M)

Medical University of Silesia in Katowice, 40-055 Katowice, Poland.

Aleksandra Bożym (A)

1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland.

Michalina Ciurla (M)

1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland.

Maciej Janusz Krajsman (M)

Department of Medical Informatics and Telemedicine, Medical University of Warsaw, 02-091 Warszawa, Poland.

Magdalena Maciejewska (M)

Doctoral School, Medical University of Warsaw, 02-091 Warszawa, Poland.

Piotr Lodziński (P)

1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland.

Grzegorz Opolski (G)

1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland.

Marcin Grabowski (M)

1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland.

Andrzej Cacko (A)

1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland; Department of Medical Informatics and Telemedicine, Medical University of Warsaw, 02-091 Warszawa, Poland.

Paweł Balsam (P)

1st Chair and Department of Cardiology, Medical University of Warsaw, 02-091 Warszawa, Poland.

MeSH classifications