An ontology-based approach for developing a harmonised data-validation tool for European cancer registration.

Cancer registry; Data federation; Data harmonisation; Data validation; Ontology; Semantic web

Journal

Journal of biomedical semantics
ISSN: 2041-1480
Abbreviated title: J Biomed Semantics
Country: England
NLM ID: 101531992

Publication information

Publication date:
6 January 2021
History:
received: 11 October 2019
accepted: 15 November 2020
entrez: 7 January 2021
pubmed: 8 January 2021
medline: 29 October 2021
Status: epublish

Abstract

BACKGROUND
Population-based cancer registries constitute an important information source in cancer epidemiology. Studies collating and comparing data across regional and national boundaries have proved important for deploying and evaluating effective cancer-control strategies. Comparing cancer indicators correctly across these boundaries depends on a good and harmonised level of data quality, which is a primary motivator for the centralised collection of pseudonymised data. The European Union's General Data Protection Regulation (GDPR), recently introduced, imposes stricter conditions on the collection, processing, and sharing of personal data, and it treats pseudonymised data as personal data. The regulation therefore motivates solutions that allow the processes leading to harmonised European cancer-registry data to continue smoothly. One such element would be a data-validation software tool based on a formalised depiction of the harmonised data-validation rules, which would allow the data-validation process eventually to be devolved to the local level.
RESULTS
A semantic data model was derived from the data-validation rules for harmonising cancer-data variables at the European level. The data model was encapsulated in an ontology developed using the Web Ontology Language (OWL), with the data-model entities forming the main OWL classes. The data-validation rules were added as axioms in the ontology. The reasoning function of the resulting ontology was able to trap registry-coding errors and, in some instances, to correct them.
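The idea of attaching validation rules directly to a data model can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' ontology: it uses ordinary Python predicates rather than OWL axioms and a reasoner, and the record fields, the sex coding, and the two example rules are hypothetical stand-ins for the harmonised European rules.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List


@dataclass
class CancerCase:
    """Hypothetical, heavily simplified record (not the registry core data set)."""
    sex: str              # "M" or "F"; illustrative coding only
    topography: str       # ICD-O-3 topography code, e.g. "C61.9"
    birth_date: date
    incidence_date: date


@dataclass(frozen=True)
class Rule:
    """A named check; `violated` returns True when the record breaks the rule."""
    name: str
    violated: Callable[[CancerCase], bool]


# Illustrative rules only; the real rule set is defined by the ontology's axioms.
RULES: List[Rule] = [
    Rule("incidence date precedes birth date",
         lambda c: c.incidence_date < c.birth_date),
    Rule("prostate topography recorded for a female patient",
         lambda c: c.sex == "F" and c.topography.startswith("C61")),
]


def validate(case: CancerCase) -> List[str]:
    """Return the names of all rules that the record violates."""
    return [rule.name for rule in RULES if rule.violated(case)]


if __name__ == "__main__":
    consistent = CancerCase("M", "C61.9", date(1950, 1, 1), date(2015, 6, 1))
    miscoded = CancerCase("F", "C61.9", date(1950, 1, 1), date(1940, 1, 1))
    print(validate(consistent))  # []
    print(validate(miscoded))    # both rule names, in the order listed in RULES
```

Keeping the rules in one declarative list next to the record definition mirrors, loosely, the point that the checks travel with the data model. An OWL reasoner goes further than this sketch, since it can also infer facts from the axioms rather than only flag violations.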
CONCLUSIONS
Describing the European cancer-registry core data set as an OWL ontology provides a tool, based on a formalised set of axioms, for validating a cancer registry's data set against harmonised, supra-national rules. Because the data checks are inherently linked to the data model, maintenance overhead should be lower and versions can be synchronised automatically, which is important for distributed data-quality checking processes.

Identifiers

pubmed: 33407816
doi: 10.1186/s13326-020-00233-x
pii: 10.1186/s13326-020-00233-x
pmc: PMC7789225

Publication types

Journal Article; Research Support, N.I.H., Extramural

Languages

eng

Citation subsets

IM

Pagination

1

Authors

Nicholas Charles Nicholson (NC)

European Commission, Joint Research Centre, Via E. Fermi 2749, I-21027, Ispra, VA, Italy. nicholas.nicholson@ec.europa.eu.

Francesco Giusti (F)

European Commission, Joint Research Centre, Via E. Fermi 2749, I-21027, Ispra, VA, Italy.

Manola Bettio (M)

European Commission, Joint Research Centre, Via E. Fermi 2749, I-21027, Ispra, VA, Italy.

Raquel Negrao Carvalho (R)

European Commission, Joint Research Centre, Via E. Fermi 2749, I-21027, Ispra, VA, Italy.

Nadya Dimitrova (N)

European Commission, Joint Research Centre, Via E. Fermi 2749, I-21027, Ispra, VA, Italy.

Tadeusz Dyba (T)

European Commission, Joint Research Centre, Via E. Fermi 2749, I-21027, Ispra, VA, Italy.

Manuela Flego (M)

European Commission, Joint Research Centre, Via E. Fermi 2749, I-21027, Ispra, VA, Italy.

Luciana Neamtiu (L)

European Commission, Joint Research Centre, Via E. Fermi 2749, I-21027, Ispra, VA, Italy.

Giorgia Randi (G)

European Commission, Joint Research Centre, Via E. Fermi 2749, I-21027, Ispra, VA, Italy.

Carmen Martos (C)

European Commission, Joint Research Centre, Via E. Fermi 2749, I-21027, Ispra, VA, Italy.
