Seeing the primary tumor because of all the trees: Cancer type prediction on low-dimensional data.

Cancer of Unknown Primary; classification; oncology; prediction; real-world data

Journal

Frontiers in medicine
ISSN: 2296-858X
Abbreviated title: Front Med (Lausanne)
Country: Switzerland
NLM ID: 101648047

Publication information

Publication date:
2024
History:
received: 05 03 2024
accepted: 06 08 2024
medline: 11 9 2024
pubmed: 11 9 2024
entrez: 11 9 2024
Status: epublish

Abstract

The Cancer of Unknown Primary (CUP) syndrome is characterized by identifiable metastases while the primary tumor remains hidden. In recent years, various data-driven approaches have been proposed to predict the location of the primary tumor (LOP) in CUP patients, promising improved diagnosis and outcomes. These LOP prediction approaches use high-dimensional input data such as images or genetic data. However, leveraging such data is challenging and resource-intensive, and is therefore a potential translational barrier. Instead of using high-dimensional data, we analyzed the LOP prediction performance of low-dimensional data from routine medical care. Our findings show that such low-dimensional routine clinical information suffices as input data for tree-based LOP prediction models. The best model reached a mean accuracy of 94% and a mean Matthews correlation coefficient (MCC) of 0.92 in 10-fold nested cross-validation (NCV) when distinguishing four types of cancer. When considering eight types of cancer, this model achieved a mean accuracy of 85% and a mean MCC of 0.81. This is comparable to the performance achieved by approaches using high-dimensional input data. Additionally, the distribution pattern of metastases appears to be important information for predicting the LOP.
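The two metrics reported in the abstract can be illustrated with a minimal sketch. It assumes the standard multi-class generalization of the MCC computed from label counts; the record does not state which implementation the authors used. The function names, the four cancer-type labels, and the toy predictions are hypothetical and serve only to show how accuracy and MCC are derived from true and predicted labels for a single evaluation fold.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient from label counts.

    MCC = (c*s - sum_k t_k*p_k)
          / sqrt((s^2 - sum_k t_k^2) * (s^2 - sum_k p_k^2))
    with s = number of samples, c = number of correct predictions,
    t_k = how often class k truly occurs, p_k = how often it is predicted.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_k = Counter(y_true)
    p_k = Counter(y_pred)
    labels = set(t_k) | set(p_k)
    numerator = c * s - sum(t_k[k] * p_k[k] for k in labels)
    spread_true = s * s - sum(t_k[k] ** 2 for k in labels)
    spread_pred = s * s - sum(p_k[k] ** 2 for k in labels)
    denominator = sqrt(spread_true * spread_pred)
    # Convention: return 0.0 when the score is undefined, for example
    # when every sample receives the same predicted label.
    return numerator / denominator if denominator else 0.0


# Hypothetical toy fold with four cancer types (labels are assumptions,
# not the types studied in the paper); one of eight samples is misclassified.
y_true = ["lung", "lung", "breast", "breast",
          "colon", "colon", "pancreas", "pancreas"]
y_pred = ["lung", "lung", "breast", "breast",
          "colon", "colon", "pancreas", "colon"]

fold_accuracy = accuracy(y_true, y_pred)
fold_mcc = multiclass_mcc(y_true, y_pred)
print(f"accuracy={fold_accuracy:.3f}, MCC={fold_mcc:.3f}")
```

In a 10-fold nested cross-validation, scores of this kind would be computed on each outer test fold and then averaged, which is how mean values such as those reported above are typically obtained. Unlike accuracy, the MCC accounts for how predictions are distributed across classes, so it is less easily inflated when the classes are imbalanced.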

Identifiers

pubmed: 39257886
doi: 10.3389/fmed.2024.1396459
pmc: PMC11385615

Publication types

Journal Article

Languages

eng

Pagination

1396459

Copyright information

Copyright © 2024 Gehrmann, Soenarto, Hidayat, Beyer, Quakulinski, Alkarkoukly, Berressem, Gundert, Butler, Grönke, Lennartz, Persigehl, Zander and Beyan.

Conflict of interest statement

SL received author and speaker royalties from Amboss GmbH. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Authors

Julia Gehrmann (J)

Institute for Biomedical Informatics, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.

Devina Johanna Soenarto (DJ)

Institute for Biomedical Informatics, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.

Kevin Hidayat (K)

Institute for Biomedical Informatics, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.

Maria Beyer (M)

Institute for Biomedical Informatics, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.

Lars Quakulinski (L)

Institute for Biomedical Informatics, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.

Samer Alkarkoukly (S)

Institute for Biomedical Informatics, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
Medical Data Integration Center (MeDIC), Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.

Scarlett Berressem (S)

Department of Internal Medicine, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf (CIO ABCD), Aachen, Germany.

Anna Gundert (A)

Department of Internal Medicine, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf (CIO ABCD), Aachen, Germany.

Michael Butler (M)

Department of Internal Medicine, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf (CIO ABCD), Aachen, Germany.

Ana Grönke (A)

Medical Data Integration Center (MeDIC), Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.

Simon Lennartz (S)

Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.

Thorsten Persigehl (T)

Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.

Thomas Zander (T)

Department of Internal Medicine, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf (CIO ABCD), Aachen, Germany.

Oya Beyan (O)

Institute for Biomedical Informatics, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
Medical Data Integration Center (MeDIC), Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
Department of Data Science and Artificial Intelligence, Fraunhofer FIT, Sankt Augustin, Germany.

MeSH classifications