SMRT: Randomized Data Transformation for Cancer Subtyping and Big Data Analysis.

CRAN package; cancer subtyping; multi-omics integration; survival analysis; web application

Journal

Frontiers in oncology
ISSN: 2234-943X
Abbreviated title: Front Oncol
Country: Switzerland
NLM ID: 101568867

Publication information

Publication date:
2021
History:
received: 14 06 2021
accepted: 28 09 2021
entrez: 8 11 2021
pubmed: 9 11 2021
medline: 9 11 2021
Status: epublish

Abstract

Cancer is an umbrella term that covers a range of disorders, from fast-growing, lethal diseases to indolent lesions with low or delayed potential for progression to death. Treatment options, as well as treatment success, depend heavily on the correct subtyping of individual patients. With the advancement of high-throughput platforms, we have the opportunity to differentiate among cancer subtypes from a holistic perspective that takes into consideration phenomena at different molecular levels (mRNA, methylation, etc.). This demands powerful integrative methods that leverage large multi-omics datasets for better subtyping. Here we introduce Subtyping Multi-omics using a Randomized Transformation (SMRT), a new method for multi-omics integration and cancer subtyping. SMRT offers the following advantages over existing approaches: (i) a scalable analysis pipeline that allows researchers to integrate multi-omics data and analyze hundreds of thousands of samples in minutes, (ii) the ability to integrate data types with different numbers of patients, (iii) the ability to analyze unmatched data of different types, and (iv) a convenient data analysis pipeline offered to users through a web application. We also improve the efficiency of our ensemble-based perturbation clustering to support analysis on machines with memory constraints. In an extensive analysis, we compare SMRT with eight state-of-the-art subtyping methods using 37 TCGA and two METABRIC datasets comprising a total of almost 12,000 patient samples from 28 different types of cancer. We also performed a number of simulation studies. We demonstrate that SMRT outperforms other methods in identifying subtypes with significantly different survival profiles. In addition, SMRT is extremely fast, analyzing hundreds of thousands of samples in minutes. The web application is available at http://SMRT.tinnguyen-lab.com. The R package will be deposited to CRAN as part of our PINSPlus software suite.

Identifiers

pubmed: 34745946
doi: 10.3389/fonc.2021.725133
pmc: PMC8563705

Publication types

Journal Article

Languages

eng

Pagination

725133

Grants

Agency: NIGMS NIH HHS
ID: P20 GM103440
Country: United States

Copyright information

Copyright © 2021 Nguyen, Tran, Tran, Roy, Cassell, Dascalu, Draghici and Nguyen.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Authors

Hung Nguyen (H)

Department of Computer Science and Engineering, University of Nevada Reno, Reno, NV, United States.

Duc Tran (D)

Department of Computer Science and Engineering, University of Nevada Reno, Reno, NV, United States.

Bang Tran (B)

Department of Computer Science and Engineering, University of Nevada Reno, Reno, NV, United States.

Monikrishna Roy (M)

Department of Computer Science and Engineering, University of Nevada Reno, Reno, NV, United States.

Adam Cassell (A)

Department of Computer Science and Engineering, University of Nevada Reno, Reno, NV, United States.

Sergiu Dascalu (S)

Department of Computer Science and Engineering, University of Nevada Reno, Reno, NV, United States.

Sorin Draghici (S)

Department of Computer Science, Wayne State University, Detroit, MI, United States.

Tin Nguyen (T)

Department of Computer Science and Engineering, University of Nevada Reno, Reno, NV, United States.
