A survey on single and multi omics data mining methods in cancer data classification.

Computational Biology; Data Mining; Genomics; Humans; Machine Learning; Neoplasms / genetics

Cancer classification; Data integration; Gene selection; High dimensional datasets; Single and multi omics data

Journal

Journal of biomedical informatics

ISSN: 1532-0480

Abbreviated title: J Biomed Inform

Country: United States

NLM ID: 100970413

Publication information

Publication date:
07 2020

History:

received: 11 12 2019

revised: 01 05 2020

accepted: 31 05 2020

pubmed: 12 6 2020

medline: 29 7 2021

entrez: 12 6 2020

Status: ppublish

Abstract

Data analytics is routinely used to support biomedical research in all areas, with particular focus on the most relevant clinical conditions, such as cancer. Bioinformatics approaches, in particular, have been used to characterize the molecular aspects of diseases. In recent years, numerous studies of cancer have been based on single- and multi-omics data. Single-omics studies, for example, have employed diverse data types, such as gene expression, DNA methylation, or miRNA, to name only a few. Nevertheless, a significant part of the literature reports studies on gene expression with microarray datasets. Single-omics data have high numbers of attributes and very low sample counts. This characteristic makes them paradigmatic of an under-sampled, small-n large-p machine learning problem. An important goal of single-omics data analysis is to find the most relevant genes, in terms of their potential use in clinics and research, in the batch of available data. This problem has been addressed by gene selection, one of the pre-processing steps in data mining. Analyses that use only one type of data (single-omics) often miss the complexity of the landscape of molecular phenomena underlying the disease. As a result, they provide limited and sometimes poorly reliable information about the disease mechanisms. Therefore, in recent years, researchers have been eager to build more complex models that obtain more reliable results using multi-omics data. However, to achieve this, the most important challenge is data integration. In this paper, we provide a comprehensive overview of the challenges in single and multi-omics data analysis of cancer data, focusing on gene selection and data integration methods.

Identifiers

DOI: 10.1016/j.jbi.2020.103466 PMID: 32525020

pubmed: 32525020

pii: S1532-0464(20)30093-9

doi: 10.1016/j.jbi.2020.103466

Publication types

Journal Article; Research Support, Non-U.S. Gov't; Review

Languages

eng

Pagination

103466

Conflict of interest statement

Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Zahra Momeni (Z)

Esmail Hassanzadeh (E)

Mohammad Saniee Abadeh (M)

Riccardo Bellazzi (R)
