Evaluation of Natural Language Processing for the Identification of Crohn Disease-Related Variables in Spanish Electronic Health Records: A Validation Study for the PREMONITION-CD Project.
Crohn disease
artificial intelligence
electronic health records
inflammatory bowel disease
linguistic validation
natural language processing
Journal
JMIR medical informatics
ISSN: 2291-9694
Abbreviated title: JMIR Med Inform
Country: Canada
NLM ID: 101645109
Publication information
Publication date:
18 02 2022
History:
received: 11 05 2021
revised: 22 07 2021
accepted: 02 01 2022
entrez: 18 2 2022
pubmed: 19 2 2022
medline: 19 2 2022
Status:
epublish
Abstract
BACKGROUND
The exploration of clinically relevant information in the free text of electronic health records (EHRs) holds the potential to positively impact clinical practice as well as knowledge regarding Crohn disease (CD), an inflammatory bowel disease that may affect any segment of the gastrointestinal tract. The EHRead technology, a clinical natural language processing (cNLP) system, was designed to detect and extract clinical information from narratives in the clinical notes contained in EHRs.
OBJECTIVE
The aim of this study is to validate the performance of the EHRead technology in identifying information about patients with CD.
METHODS
We used the EHRead technology to explore and extract CD-related clinical information from EHRs. To validate this tool, we compared the output of the EHRead technology with a manually curated gold standard to assess the quality of our cNLP system in detecting records containing any reference to CD and its related variables.
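The comparison against a gold standard described above can be illustrated with a minimal sketch. It assumes that both the system output and the manual annotation reduce to sets of record identifiers flagged for a given variable; the function name `evaluate_detection` and the record IDs are hypothetical and are not taken from the study or from the EHRead technology.

```python
def evaluate_detection(predicted_ids, gold_ids):
    """Compare system-flagged records with a manually curated gold standard.

    Both arguments are iterables of record identifiers (hypothetical here).
    Returns (precision, recall, f1) as floats.
    """
    predicted, gold = set(predicted_ids), set(gold_ids)
    true_pos = len(predicted & gold)    # flagged by the system and by annotators
    false_pos = len(predicted - gold)   # flagged by the system only
    false_neg = len(gold - predicted)   # missed by the system

    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    recall = true_pos / (true_pos + false_neg) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative data only: 4 records flagged by the system, 5 in the gold standard.
system_output = {"r1", "r2", "r3", "r4"}
gold_standard = {"r2", "r3", "r4", "r5", "r6"}
precision, recall, f1 = evaluate_detection(system_output, gold_standard)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy example three records are shared, one is a false positive, and two are missed, so precision is 3/4 and recall is 3/5.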
RESULTS
For the main variable (CD), the EHRead technology achieved a precision of 0.88, a recall of 0.98, and an F1 score of 0.93. For the secondary variables, it achieved a precision of 0.91, a recall of 0.71, and an F1 score of 0.80 for CD flare, and a precision of 0.86, a recall of 0.94, and an F1 score of 0.90 for vedolizumab (treatment).
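The reported F1 scores are consistent with the standard definition of F1 as the harmonic mean of precision and recall. The short check below recomputes them from the reported precision and recall values; the helper `f1_score` and the dictionary layout are illustrative assumptions, not part of the study's tooling.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# (precision, recall) pairs as reported in the abstract.
reported = {
    "CD": (0.88, 0.98),
    "CD flare": (0.91, 0.71),
    "vedolizumab": (0.86, 0.94),
}

for variable, (precision, recall) in reported.items():
    print(f"{variable}: F1 = {f1_score(precision, recall):.2f}")
```

Rounded to two decimals, the recomputed values match the reported F1 scores of 0.93, 0.80, and 0.90.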
CONCLUSIONS
This evaluation demonstrates the ability of the EHRead technology to identify patients with CD and their related variables from the free text of EHRs. To the best of our knowledge, this study is the first to use a cNLP system for the identification of CD in EHRs written in Spanish.
Identifiers
pubmed: 35179507
pii: v10i2e30345
doi: 10.2196/30345
pmc: PMC8900906
Publication types
Journal Article
Languages
eng
Pagination
e30345
Investigators
Carlos Castaño (C)
Ángel Ponferrada Díaz (ÁP)
María Chaparro (M)
María José Casanova (MJ)
Felipe Ramos Zabala (FR)
Almudena Calvache (A)
Fernando Bermejo (F)
Noemí Manceñido (N)
Marta Calvo Moya (MC)
Copyright information
©Carmen Montoto, Javier P Gisbert, Iván Guerra, Rocío Plaza, Ramón Pajares Villarroya, Luis Moreno Almazán, María Del Carmen López Martín, Mercedes Domínguez Antonaya, Isabel Vera Mendoza, Jesús Aparicio, Vicente Martínez, Ignacio Tagarro, Alonso Fernandez-Nistal, Lea Canales, Sebastian Menke, Fernando Gomollón, PREMONITION-CD Study Group. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 18.02.2022.