A comprehensive study of mobility functioning information in clinical notes: Entity hierarchy, corpus annotation, and sequence labeling.
Clinical notes
Functioning information
Mobility
Named entity recognition
Natural language processing
Text mining
Journal
International journal of medical informatics
ISSN: 1872-8243
Abbreviated title: Int J Med Inform
Country: Ireland
NLM ID: 9711057
Publication information
Date of publication:
03 2021
History:
received: 2020-03-11
revised: 2020-08-10
accepted: 2020-11-22
pubmed: 2021-01-06
medline: 2021-04-22
entrez: 2021-01-05
Status:
ppublish
Abstract
Secondary use of Electronic Health Records (EHRs) has mostly focused on health conditions (diseases and drugs). Function is an important health indicator in addition to morbidity and mortality. Nevertheless, function has been overlooked in assessing patients' health status. The World Health Organization (WHO)'s International Classification of Functioning, Disability and Health (ICF) is considered the international standard for describing and coding function and health states. We present the first comprehensive analysis and identification of functioning concepts in the Mobility domain of the ICF. Using physical therapy notes at the National Institutes of Health's Clinical Center, we induced a hierarchical order of mobility-related entities including 5 entity types, 3 relations, 8 attributes, and 33 attribute values. Two domain experts manually curated a gold standard corpus of 14,281 nested entity mentions from 400 clinical notes. Inter-annotator agreement (IAA) of exact matching averaged 92.3% F1-score on mention text spans, and 96.6% Cohen's kappa on attribute assignments. A high-performance ensemble machine learning model for named entity recognition (NER) was trained and evaluated using the gold standard corpus. The average F1-score on exact entity matching of our ensemble method (84.90%) outperformed popular NER methods: Conditional Random Field (80.4%), Recurrent Neural Network (81.82%), and Bidirectional Encoder Representations from Transformers (82.33%). The results of this study show that mobility functioning information can be reliably captured from clinical notes once adequate resources are provided for sequence labeling methods. We expect that functioning concepts in other domains of the ICF can be identified in a similar fashion.
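The two agreement measures named in the abstract, exact-match F1 on mention spans and Cohen's kappa on attribute assignments, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the span representation `(start, end, entity_type)`, the function names, and the example labels are all assumptions made for illustration.

```python
from collections import Counter


def exact_match_f1(spans_a, spans_b):
    """F1 between two annotators' span sets under exact matching.

    Each span is a hypothetical (start, end, entity_type) tuple; a span
    counts as a match only if all three fields are identical.
    """
    a, b = set(spans_a), set(spans_b)
    if not a and not b:
        return 1.0  # both annotators marked nothing: treat as full agreement
    matched = len(a & b)
    precision = matched / len(b) if b else 0.0
    recall = matched / len(a) if a else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical annotations: the third span differs by one character offset,
# so it does not count as an exact match.
annotator_1 = [(0, 5, "TYPE_A"), (10, 20, "TYPE_B"), (25, 30, "TYPE_C")]
annotator_2 = [(0, 5, "TYPE_A"), (10, 20, "TYPE_B"), (26, 30, "TYPE_C")]
span_f1 = exact_match_f1(annotator_1, annotator_2)

# Hypothetical attribute values assigned to four matched mentions.
kappa = cohen_kappa(["x", "x", "y", "y"], ["x", "x", "y", "x"])
```

Under exact matching a boundary shift of a single character turns an agreement into both a false positive and a false negative, which is why this measure is stricter than overlap-based alternatives.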
Identifiers
pubmed: 33401169
pii: S1386-5056(20)31887-6
doi: 10.1016/j.ijmedinf.2020.104351
pmc: PMC8104034
mid: NIHMS1661619
Publication types
Journal Article
Research Support, N.I.H., Intramural
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
104351
Grants
Agency: Intramural NIH HHS
ID: ZIA CL060065
Country: United States
Copyright information
Copyright © 2020 The Authors. Published by Elsevier B.V. All rights reserved.