Extracting seizure frequency from epilepsy clinic notes: a machine reading approach to natural language processing.
electronic medical record
epilepsy
natural language processing
question-answering
Journal
Journal of the American Medical Informatics Association : JAMIA
ISSN: 1527-974X
Abbreviated title: J Am Med Inform Assoc
Country: England
NLM ID: 9430800
Publication information
Publication date: 13 04 2022
History:
received: 26 11 2021
revised: 11 01 2022
accepted: 08 02 2022
pubmed: 23 2 2022
medline: 16 4 2022
entrez: 22 2 2022
Status: ppublish
Abstract
Seizure frequency and seizure freedom are among the most important outcome measures for patients with epilepsy. In this study, we aimed to automatically extract this clinical information from unstructured text in clinical notes. If successful, this could improve clinical decision-making in epilepsy patients and allow for rapid, large-scale retrospective research.
We developed a finetuning pipeline for pretrained neural models to classify patients as being seizure-free and to extract text containing their seizure frequency and date of last seizure from clinical notes. We annotated 1000 notes for use as training and testing data and determined how well 3 pretrained neural models, BERT, RoBERTa, and Bio_ClinicalBERT, could identify and extract the desired information after finetuning.
The finetuned models (BERTFT, Bio_ClinicalBERTFT, and RoBERTaFT) achieved near-human performance when classifying patients as seizure-free, with BERTFT and Bio_ClinicalBERTFT achieving accuracy scores over 80%. All 3 models also achieved human performance when extracting seizure frequency and date of last seizure, with overall F1 scores over 0.80. The best combination of models was Bio_ClinicalBERTFT for classification, and RoBERTaFT for text extraction. Most of the gains in performance due to finetuning required roughly 70 annotated notes.
Our novel machine reading approach to extracting important clinical outcomes performed at or near human performance on several tasks. This approach opens new possibilities to support clinical practice and conduct large-scale retrospective clinical research.
Future studies can use our finetuning pipeline with minimal training annotations to answer new clinical questions.
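The abstract reports F1 scores for the text-extraction tasks without saying how F1 is computed. As an illustration only, the sketch below shows one common way to score an extracted span against an annotated span: token-overlap F1, as used in many extractive question-answering benchmarks. Whether the authors used exactly this metric is an assumption; the function name `token_f1` and the whitespace tokenization are hypothetical choices made for this example.

```python
from collections import Counter


def token_f1(predicted: str, reference: str) -> float:
    """Token-overlap F1 between a predicted span and a reference span.

    Assumption: lowercase, whitespace-split tokens. The scoring used in
    the paper may normalize text differently.
    """
    pred_tokens = predicted.lower().split()
    ref_tokens = reference.lower().split()
    # Two empty spans count as agreement (a "no answer" case);
    # exactly one empty span counts as a complete miss.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection: each shared token is counted at most as
    # often as it occurs in both spans.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, scoring the prediction "two seizures per month" against the annotation "about two seizures per month" gives a precision of 4/4 and a recall of 4/5, so a partially correct span earns partial credit rather than zero.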
Identifiers
pubmed: 35190834
pii: 6534112
doi: 10.1093/jamia/ocac018
pmc: PMC9006692
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
873-881
Grants
Agency: NINDS NIH HHS
ID: K23 NS121520
Country: United States
Agency: NINDS NIH HHS
ID: 1DP1 OD029758
Country: United States
Agency: NINDS NIH HHS
ID: DP1 NS122038
Country: United States
Copyright information
© The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association.