Definition and validation of SNOMED CT subsets using the expression constraint language.

Expression constraint language Expression constraint simplification Graph database SNOMED CT Subset visualization

Journal

Journal of biomedical informatics
ISSN: 1532-0480
Titre abrégé: J Biomed Inform
Pays: United States
ID NLM: 100970413

Informations de publication

Date de publication:
05 2021
Historique:
received: 11 12 2020
revised: 05 03 2021
accepted: 06 03 2021
pubmed: 24 3 2021
medline: 28 7 2021
entrez: 23 3 2021
Statut: ppublish

Résumé

SNOMED CT Expression Constraint Language (ECL) is a declarative language developed by SNOMED International for the definition of SNOMED CT Expression Constraints (ECs). ECs are executable expressions that define intensional subsets of clinical meanings by stating constraints over the logic definition of concepts. The execution of an EC on some SNOMED CT substrate yields the intended subset, and it requires an execution engine able to receive an EC as input, execute it, and return the matching concepts. An important issue regarding subsets of clinical concepts is their use in terminology binding between clinical information models and terminologies for defining the set of valid values of codified data. To define and implement methods for the simplification, semantic validation and execution of ECs over a graph-oriented SNOMED CT database, and to provide a method for the visual representation of subsets in order to explore, understand and validate its content, as well as to develop an EC execution platform, called SNQuery, which makes use of these methods. Since SNOMED CT is a directed and acyclic graph, we have used a graph-oriented database to represent the content of SNOMED CT, where the schema and instances are represented as graphs and the data manipulation is expressed by graph-oriented operations. For the execution of ECs over the graph database, it is performed a translation process in which ECs are translated into a set of Cypher Query Language queries. We have defined some EC simplification methods that leverage the logic structure underlying SNOMED CT. The purpose of these methods is to reduce the complexity of ECs and, in turn, its execution time, as well as to validate them from a SNOMED CT Concept Model and logical definition points of view. We also have developed a graphic representation based on the circle packing geometrical concept, which allows validating subsets, as well as pre-defined refsets and the terminology itself. We have developed SNQuery, a platform for the definition of intensional subsets of SNOMED CT concepts by means of the execution of ECs over a graph-oriented SNOMED CT database. Additionally, we have incorporated methods for the simplification and semantic validation of ECs, as well as for the visualization of subsets as a mechanism to understand and validate them. SNQuery has been evaluated in terms of EC execution times. In this paper, we provide methods to simplify, semantically validate and execute ECs over a graph-oriented database. We also offer a method to visualize the intensional subsets obtained by executing ECs to explore, understand and validate them, as well as refsets and the terminology itself. The definition of intensional subsets is useful to bind content between clinical information models and clinical terminologies, which is a necessary step to achieve semantic interoperability between EHR systems.

Sections du résumé

BACKGROUND
SNOMED CT Expression Constraint Language (ECL) is a declarative language developed by SNOMED International for the definition of SNOMED CT Expression Constraints (ECs). ECs are executable expressions that define intensional subsets of clinical meanings by stating constraints over the logic definition of concepts. The execution of an EC on some SNOMED CT substrate yields the intended subset, and it requires an execution engine able to receive an EC as input, execute it, and return the matching concepts. An important issue regarding subsets of clinical concepts is their use in terminology binding between clinical information models and terminologies for defining the set of valid values of codified data.
OBJECTIVE
To define and implement methods for the simplification, semantic validation and execution of ECs over a graph-oriented SNOMED CT database, and to provide a method for the visual representation of subsets in order to explore, understand and validate its content, as well as to develop an EC execution platform, called SNQuery, which makes use of these methods.
METHODS
Since SNOMED CT is a directed and acyclic graph, we have used a graph-oriented database to represent the content of SNOMED CT, where the schema and instances are represented as graphs and the data manipulation is expressed by graph-oriented operations. For the execution of ECs over the graph database, it is performed a translation process in which ECs are translated into a set of Cypher Query Language queries. We have defined some EC simplification methods that leverage the logic structure underlying SNOMED CT. The purpose of these methods is to reduce the complexity of ECs and, in turn, its execution time, as well as to validate them from a SNOMED CT Concept Model and logical definition points of view. We also have developed a graphic representation based on the circle packing geometrical concept, which allows validating subsets, as well as pre-defined refsets and the terminology itself.
RESULTS
We have developed SNQuery, a platform for the definition of intensional subsets of SNOMED CT concepts by means of the execution of ECs over a graph-oriented SNOMED CT database. Additionally, we have incorporated methods for the simplification and semantic validation of ECs, as well as for the visualization of subsets as a mechanism to understand and validate them. SNQuery has been evaluated in terms of EC execution times.
CONCLUSION
In this paper, we provide methods to simplify, semantically validate and execute ECs over a graph-oriented database. We also offer a method to visualize the intensional subsets obtained by executing ECs to explore, understand and validate them, as well as refsets and the terminology itself. The definition of intensional subsets is useful to bind content between clinical information models and clinical terminologies, which is a necessary step to achieve semantic interoperability between EHR systems.

Identifiants

pubmed: 33753269
pii: S1532-0464(21)00076-9
doi: 10.1016/j.jbi.2021.103747
pii:
doi:

Types de publication

Journal Article Research Support, Non-U.S. Gov't

Langues

eng

Sous-ensembles de citation

IM

Pagination

103747

Informations de copyright

Copyright © 2021 Elsevier Inc. All rights reserved.

Auteurs

V M Giménez-Solano (VM)

VeraTech for Health, Valencia, Spain. Electronic address: vigiso@veratech.es.

J A Maldonado (JA)

VeraTech for Health, Valencia, Spain.

D Boscá (D)

VeraTech for Health, Valencia, Spain.

S Salas-García (S)

VeraTech for Health, Valencia, Spain.

M Robles (M)

VeraTech for Health, Valencia, Spain.

Articles similaires

Humans Middle Aged Female Male Surveys and Questionnaires
Humans Recurrence Male Female Middle Aged

Real world data on cervical cancer treatment patterns, healthcare access and resource utilization in the Brazilian public healthcare system.

Thabata Martins Ferreira Campuzano, Maria Amelia Carlos Souto Maior Borba, Paula de Mendonça Batista et al.
1.00
Humans Female Uterine Cervical Neoplasms Brazil Middle Aged
Humans Linguistics Language China Semantics

Classifications MeSH