A survey on single and multi omics data mining methods in cancer data classification.

Cancer classification; Data integration; Gene selection; High dimensional datasets; Single and multi omics data

Journal

Journal of biomedical informatics
ISSN: 1532-0480
Abbreviated title: J Biomed Inform
Country: United States
NLM ID: 100970413

Publication information

Publication date:
07 2020
History:
received: 11 12 2019
revised: 01 05 2020
accepted: 31 05 2020
pubmed: 12 6 2020
medline: 29 7 2021
entrez: 12 6 2020
Status: ppublish

Abstract

Data analytics is routinely used to support biomedical research in all areas, with particular focus on the most relevant clinical conditions, such as cancer. Bioinformatics approaches, in particular, have been used to characterize the molecular aspects of diseases. In recent years, numerous studies have been performed on cancer based upon single and multi-omics data. For example, single-omics studies have employed a diverse set of data types, such as gene expression, DNA methylation, or miRNA, to name only a few. Despite this variety, a significant part of the literature reports studies on gene expression with microarray datasets. Single-omics data have high numbers of attributes and very low sample counts. This characteristic makes them paradigmatic of an under-sampled, small-n large-p machine learning problem. An important goal of single-omics data analysis is to find, among the available data, the genes most relevant for potential use in clinics and research. This problem is addressed by gene selection, one of the pre-processing steps in data mining. Analyses that use only one type of data (single-omics) often miss the complexity of the landscape of molecular phenomena underlying the disease. As a result, they provide limited and sometimes poorly reliable information about the disease mechanisms. Therefore, in recent years, researchers have been eager to build more complex models that obtain more reliable results using multi-omics data. However, the most important challenge in achieving this is data integration. In this paper, we provide a comprehensive overview of the challenges in single and multi-omics data analysis of cancer data, focusing on gene selection and data integration methods.
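
As an illustration of gene selection as a pre-processing step in the small-n large-p setting described above, the following is a minimal sketch of one simple filter approach: rank each gene by the absolute Welch t-statistic between two classes and keep the top k. This is an assumed example, not a method taken from the survey; the function names `welch_t` and `select_top_genes` and the toy expression matrix are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t-statistic between two lists of expression values."""
    denom = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if denom == 0.0:
        # Both groups are constant: perfectly separated if the means differ.
        return float("inf") if mean(a) != mean(b) else 0.0
    return abs(mean(a) - mean(b)) / denom


def select_top_genes(samples, labels, k):
    """Return the column indices of the k genes with the largest |t|.

    samples: list of rows (one per sample), each a list of gene values.
    labels:  list of 0/1 class labels, one per sample.
    """
    n_genes = len(samples[0])
    scores = []
    for g in range(n_genes):
        class0 = [row[g] for row, y in zip(samples, labels) if y == 0]
        class1 = [row[g] for row, y in zip(samples, labels) if y == 1]
        scores.append((welch_t(class0, class1), g))
    # Highest score first; ties broken by gene index for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [g for _, g in scores[:k]]


# Hypothetical toy data: 6 samples x 4 genes; only gene 1 differs by class.
samples = [
    [1.0, 5.0, 0.2, 3.0],
    [1.2, 5.1, 0.1, 3.1],
    [0.9, 4.9, 0.3, 2.9],
    [1.1, 9.0, 0.2, 3.0],
    [1.0, 9.2, 0.1, 3.2],
    [1.05, 8.8, 0.3, 2.8],
]
labels = [0, 0, 0, 1, 1, 1]

selected = select_top_genes(samples, labels, k=1)
print(selected)
```

With so few samples relative to genes, a filter like this would normally be applied inside each cross-validation fold rather than once on the full dataset, since selecting genes on all samples before evaluation tends to give optimistic accuracy estimates.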

Identifiers

pubmed: 32525020
pii: S1532-0464(20)30093-9
doi: 10.1016/j.jbi.2020.103466

Publication types

Journal Article; Research Support, Non-U.S. Gov't; Review

Languages

eng

Citation subsets

IM

Pagination

103466

Copyright information

Copyright © 2020 Elsevier Inc. All rights reserved.

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Zahra Momeni (Z)

Faculty of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran.

Esmail Hassanzadeh (E)

Faculty of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran.

Mohammad Saniee Abadeh (M)

Faculty of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran; School of Computer Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran. Electronic address: saniee@modares.ac.ir.

Riccardo Bellazzi (R)

Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy; IRCCS ICS Maugeri, Pavia, Italy.
