Linked Entity Attribute Pair (LEAP): A Harmonization Framework for Data Pooling.


Journal

JCO clinical cancer informatics
ISSN: 2473-4276
Abbreviated title: JCO Clin Cancer Inform
Country: United States
NLM ID: 101708809

Publication information

Publication date:
08 2020
History:
entrez: 7 8 2020
pubmed: 7 8 2020
medline: 1 9 2021
Status: ppublish

Abstract

As data-sharing projects become increasingly frequent, so does the need to map data elements between multiple classification systems. A generic, robust, shareable architecture will result in increased efficiency and transparency of the mapping process, while upholding the integrity of the data. The American Association for Cancer Research's Genomics Evidence Neoplasia Information Exchange (GENIE) collects clinical and genomic data for precision cancer medicine. As part of its commitment to open science, GENIE has partnered with the National Cancer Institute's Genomic Data Commons (GDC) as a secondary repository. After initial efforts to submit data from GENIE to GDC failed, we realized the need for a solution to allow for the iterative mapping of data elements between dynamic classification systems. We developed the Linked Entity Attribute Pair (LEAP) database framework to store and manage the term mappings used to submit data from GENIE to GDC. After creating and populating the LEAP framework, we identified 195 mappings from GENIE to GDC requiring remediation and observed a 28% reduction in effort to resolve these issues, as well as a reduction in inadvertent errors. These results led to a decrease in the time to map between OncoTree, the cancer type ontology used by GENIE, and International Classification of Diseases for Oncology, 3rd Edition, used by GDC, from several months to less than 1 week. The LEAP framework provides a streamlined mapping process among various classification systems and allows for reusability so that efforts to create or adjust mappings are straightforward. The ability of the framework to track changes over time streamlines the process to map data elements across various dynamic classification systems.
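
The abstract describes a database that stores term mappings between classification systems and tracks how they change over time. The following is a minimal sketch of that general idea only, not the LEAP schema itself, which this record does not describe. The table name `term_mapping`, its columns, the helper functions, and the placeholder terms `TERM_A`, `CODE_1`, and `CODE_2` are all assumptions made for illustration.

```python
import sqlite3


def create_store():
    """Create an in-memory store for versioned term mappings (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE term_mapping (
            id INTEGER PRIMARY KEY,
            source_system TEXT NOT NULL,
            source_term TEXT NOT NULL,
            target_system TEXT NOT NULL,
            target_term TEXT NOT NULL,
            version INTEGER NOT NULL,
            is_current INTEGER NOT NULL DEFAULT 1
        )
        """
    )
    return conn


def set_mapping(conn, source_system, source_term, target_system, target_term):
    """Record a mapping; any earlier mapping is retired, not deleted."""
    key = (source_system, source_term, target_system)
    row = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM term_mapping "
        "WHERE source_system = ? AND source_term = ? AND target_system = ?",
        key,
    ).fetchone()
    next_version = row[0] + 1
    # Keep the old row so the change history stays auditable.
    conn.execute(
        "UPDATE term_mapping SET is_current = 0 "
        "WHERE source_system = ? AND source_term = ? AND target_system = ?",
        key,
    )
    conn.execute(
        "INSERT INTO term_mapping "
        "(source_system, source_term, target_system, target_term, version) "
        "VALUES (?, ?, ?, ?, ?)",
        key + (target_term, next_version),
    )
    conn.commit()
    return next_version


def lookup(conn, source_system, source_term, target_system):
    """Return the current target term, or None if no mapping exists."""
    row = conn.execute(
        "SELECT target_term FROM term_mapping "
        "WHERE source_system = ? AND source_term = ? AND target_system = ? "
        "AND is_current = 1",
        (source_system, source_term, target_system),
    ).fetchone()
    return row[0] if row else None


def history(conn, source_system, source_term, target_system):
    """Return every (version, target_term) recorded for a term, oldest first."""
    rows = conn.execute(
        "SELECT version, target_term FROM term_mapping "
        "WHERE source_system = ? AND source_term = ? AND target_system = ? "
        "ORDER BY version",
        (source_system, source_term, target_system),
    ).fetchall()
    return [(version, term) for version, term in rows]


if __name__ == "__main__":
    store = create_store()
    # Placeholder terms; these are not real OncoTree or ICD-O-3 codes.
    set_mapping(store, "OncoTree", "TERM_A", "ICD-O-3", "CODE_1")
    set_mapping(store, "OncoTree", "TERM_A", "ICD-O-3", "CODE_2")
    print(lookup(store, "OncoTree", "TERM_A", "ICD-O-3"))
    print(history(store, "OncoTree", "TERM_A", "ICD-O-3"))
```

Retiring superseded rows instead of overwriting them is one simple way to let a mapping be revised iteratively while keeping a record of earlier versions; the framework in the paper may handle this differently.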

Identifiers

pubmed: 32755461
doi: 10.1200/CCI.20.00037
pmc: PMC7469618

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

691-699

Grants

Agency: NCI NIH HHS
ID: P30 CA008748
Country: United States
Agency: NCI NIH HHS
ID: P30 CA012197
Country: United States

References

Cancer Discov. 2017 Aug;7(8):818-831
pubmed: 28572459
Blood. 2017 Jul 27;130(4):453-459
pubmed: 28600341
J Am Med Inform Assoc. 2007 Jan-Feb;14(1):86-93
pubmed: 17068350
Int J Med Inform. 2007 Nov-Dec;76(11-12):769-79
pubmed: 17098467
JCO Clin Cancer Inform. 2019 Nov;3:1-11
pubmed: 31834820
Am J Epidemiol. 2015 Dec 15;182(12):1033-8
pubmed: 26589709
Nat Genet. 2012 Jan 27;44(2):127-30
pubmed: 22281773

Authors

Stacy Thomas (S)

Memorial Sloan Kettering Cancer Center, New York, NY.

Tara Lichtenberg (T)

Center for Translational Data Science, University of Chicago, Chicago, IL.

Kristen Dang (K)

Sage Bionetworks, Seattle, WA.

Michael Fitzsimons (M)

Center for Translational Data Science, University of Chicago, Chicago, IL.
University of Illinois at Chicago, Chicago, IL.

Robert L Grossman (RL)

Center for Translational Data Science, University of Chicago, Chicago, IL.

Ritika Kundra (R)

Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY.

Jessica A Lavery (JA)

Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY.

Michele L Lenoue-Newton (ML)

Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN.

Katherine S Panageas (KS)

Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY.

Charles Sawyers (C)

Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY.

Nikolaus D Schultz (ND)

Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY.

Sahussapont J Sirintrapun (SJ)

Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY.

Umit Topaloglu (U)

Cancer Biology, Wake Forest University School of Medicine, Winston Salem, NC.

Angelica Welch (A)

Information Systems, Memorial Sloan Kettering Cancer Center, New York, NY.

Thomas Yu (T)

Sage Bionetworks, Seattle, WA.

Ahmet Zehir (A)

Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY.

Stuart Gardos (S)

Information Systems, Memorial Sloan Kettering Cancer Center, New York, NY.
