Quantitative analysis of manual annotation of clinical text samples.


Journal

International journal of medical informatics
ISSN: 1872-8243
Titre abrégé: Int J Med Inform
Pays: Ireland
ID NLM: 9711057

Informations de publication

Date de publication:
03 2019
Historique:
received: 05 06 2018
revised: 09 10 2018
accepted: 30 12 2018
entrez: 19 1 2019
pubmed: 19 1 2019
medline: 20 7 2019
Statut: ppublish

Résumé

Semantic interoperability of eHealth services within and across countries has been the main topic in several research projects. It is a key consideration for the European Commission to overcome the complexity of making different health information systems work together. This paper describes a study within the EU-funded project ASSESS CT, which focuses on assessing the potential of SNOMED CT as core reference terminology for semantic interoperability at European level. This paper presents a quantitative analysis of the results obtained in ASSESS CT to determine the fitness of SNOMED CT for semantic interoperability. The quantitative analysis consists of concept coverage, term coverage and inter-annotator agreement analysis of the annotation experiments related to six European languages (English, Swedish, French, Dutch, German and Finnish) and three scenarios: (i) ADOPT, where only SNOMED CT was used by the annotators; (ii) ALTERNATIVE, where a fixed set of terminologies from UMLS, excluding SNOMED CT, was used; and (iii) ABSTAIN, where any terminologies available in the current national infrastructure of the annotators' country were used. For each language and each scenario, we configured the different terminology settings of the annotation experiments. There was a positive correlation between the number of concepts in each terminology setting and their concept and term coverage values. Inter-annotator agreement is low, irrespective of the terminology setting. No significant differences were found between the analyses for the three scenarios, but availability of SNOMED CT for the assessed language is associated with increased concept coverage. Terminology setting size and concept and term coverage correlate positively up to a limit where more concepts do not significantly impact the coverage values. The results did not confirm the hypothesis of an inverse correlation between concept coverage and IAA due to a lower amount of choices available. The overall low IAA results pose a challenge for interoperability and indicate the need for further research to assess whether consistent terminology implementation is possible across Europe, e.g., improving term coverage by adding localized versions of the selected terminologies, analysing causes of low inter-annotator agreement, and improving tooling and guidance for annotators. The much lower term coverage for the Swedish version of SNOMED CT compared to English together with the similarly high concept coverage obtained with English and Swedish SNOMED CT reflects its relevance as a hub to connect user interface terminologies and serving a variety of user needs.

Sections du résumé

BACKGROUND
Semantic interoperability of eHealth services within and across countries has been the main topic in several research projects. It is a key consideration for the European Commission to overcome the complexity of making different health information systems work together. This paper describes a study within the EU-funded project ASSESS CT, which focuses on assessing the potential of SNOMED CT as core reference terminology for semantic interoperability at European level.
OBJECTIVE
This paper presents a quantitative analysis of the results obtained in ASSESS CT to determine the fitness of SNOMED CT for semantic interoperability.
METHODS
The quantitative analysis consists of concept coverage, term coverage and inter-annotator agreement analysis of the annotation experiments related to six European languages (English, Swedish, French, Dutch, German and Finnish) and three scenarios: (i) ADOPT, where only SNOMED CT was used by the annotators; (ii) ALTERNATIVE, where a fixed set of terminologies from UMLS, excluding SNOMED CT, was used; and (iii) ABSTAIN, where any terminologies available in the current national infrastructure of the annotators' country were used. For each language and each scenario, we configured the different terminology settings of the annotation experiments.
RESULTS
There was a positive correlation between the number of concepts in each terminology setting and their concept and term coverage values. Inter-annotator agreement is low, irrespective of the terminology setting.
CONCLUSIONS
No significant differences were found between the analyses for the three scenarios, but availability of SNOMED CT for the assessed language is associated with increased concept coverage. Terminology setting size and concept and term coverage correlate positively up to a limit where more concepts do not significantly impact the coverage values. The results did not confirm the hypothesis of an inverse correlation between concept coverage and IAA due to a lower amount of choices available. The overall low IAA results pose a challenge for interoperability and indicate the need for further research to assess whether consistent terminology implementation is possible across Europe, e.g., improving term coverage by adding localized versions of the selected terminologies, analysing causes of low inter-annotator agreement, and improving tooling and guidance for annotators. The much lower term coverage for the Swedish version of SNOMED CT compared to English together with the similarly high concept coverage obtained with English and Swedish SNOMED CT reflects its relevance as a hub to connect user interface terminologies and serving a variety of user needs.

Identifiants

pubmed: 30654902
pii: S1386-5056(18)30544-6
doi: 10.1016/j.ijmedinf.2018.12.011
pii:
doi:

Types de publication

Journal Article Research Support, Non-U.S. Gov't

Langues

eng

Sous-ensembles de citation

IM

Pagination

37-48

Informations de copyright

Copyright © 2019 The Authors. Published by Elsevier B.V. All rights reserved.

Auteurs

Jose A Miñarro-Giménez (JA)

Institute of Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria. Electronic address: jose.minarro-gimenez@medunigraz.at.

Ronald Cornet (R)

Department of Medical Informatics, Amsterdam Public Health Research Institute, The Netherlands.

M C Jaulent (MC)

Laboratoire d'Informatique Médicale et Ingénierie des Connaissances en eSanté, Institut National de la Santé et de la Recherche Médicale, Sorbonne Université, Université Paris 13, France.

Heike Dewenter (H)

University of Applied Sciences Niederrhein, Krefeld, Germany.

Sylvia Thun (S)

Charité Universitätsmedizin, Berlin Institute of Health, Germany.

Kirstine Rosenbeck Gøeg (KR)

Department of Health Science and Technology, Aalborg University, Denmark.

Daniel Karlsson (D)

Department for Knowledge-Based Policy of Social Services, eHealth and Structured Information Unit, The National Board of Health and Welfare, Sweden.

Stefan Schulz (S)

Institute of Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria.

Articles similaires

[Redispensing of expensive oral anticancer medicines: a practical application].

Lisanne N van Merendonk, Kübra Akgöl, Bastiaan Nuijen
1.00
Humans Antineoplastic Agents Administration, Oral Drug Costs Counterfeit Drugs

Smoking Cessation and Incident Cardiovascular Disease.

Jun Hwan Cho, Seung Yong Shin, Hoseob Kim et al.
1.00
Humans Male Smoking Cessation Cardiovascular Diseases Female
Humans United States Aged Cross-Sectional Studies Medicare Part C
1.00
Humans Yoga Low Back Pain Female Male

Classifications MeSH