Integrated multi-omics analysis of ovarian cancer using variational autoencoders.


Journal

Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
18 03 2021
History:
received: 19 08 2020
accepted: 28 02 2021
entrez: 19 3 2021
pubmed: 20 3 2021
medline: 28 10 2021
Status: epublish

Abstract

Cancer is a complex disease that deregulates cellular functions at various molecular levels (e.g., DNA, RNA, and proteins). Integrated multi-omics analysis of data from these levels is necessary to understand the aberrant cellular functions that drive cancer and its development. In recent years, Deep Learning (DL) approaches have become a useful tool in integrated multi-omics analysis of cancer data. However, high-dimensional multi-omics data are generally imbalanced, with too many molecular features and relatively few patient samples. This imbalance makes DL-based integrated multi-omics analysis difficult. DL-based dimensionality reduction techniques, including the variational autoencoder (VAE), are a potential solution for balancing high-dimensional multi-omics data. However, there are few VAE-based integrated multi-omics analyses, and they are limited to pan-cancer studies. In this work, we performed an integrated multi-omics analysis of ovarian cancer using the compressed features learned through VAE and an improved version of VAE, namely Maximum Mean Discrepancy VAE (MMD-VAE). First, we designed and developed a DL architecture for VAE and MMD-VAE. Then we used the architecture for mono-omics, integrated di-omics, and tri-omics data analysis of ovarian cancer through cancer sample identification, molecular subtype clustering and classification, and survival analysis. The results show that MMD-VAE-based and VAE-based compressed features can classify the transcriptional subtypes of the TCGA datasets with accuracies in the ranges of 93.2-95.5% and 87.1-95.7%, respectively. Survival analysis results also show that VAE-based and MMD-VAE-based compressed representations of omics data can be used in cancer prognosis.
Based on the results, we conclude that (i) VAE and MMD-VAE outperform existing dimensionality reduction techniques, (ii) integrated multi-omics analyses perform better than or similarly to their mono-omics counterparts, and (iii) MMD-VAE performs better than VAE on most omics datasets.
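
To illustrate the quantity that gives MMD-VAE its name, the sketch below estimates the squared Maximum Mean Discrepancy between two sets of samples. In an MMD-VAE this kind of term is generally used to compare encoded latent codes against samples from the prior. The abstract does not state the kernel, bandwidth, latent dimension, or loss weighting, so the Gaussian kernel, the bandwidth of 1.0, the two-dimensional toy data, and all function names here are assumptions for illustration only, not the authors' implementation.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y) with x in xs, y in ys."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased sample estimate of the squared MMD between xs and ys."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


def gaussian_sample(n, dim, mean, rng):
    """Draw n vectors of length dim from a unit-variance Gaussian."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
# Stand-in for samples from a standard normal prior (assumed prior).
prior = gaussian_sample(100, 2, 0.0, rng)
# Hypothetical latent codes: one set near the prior, one shifted away.
codes_close = gaussian_sample(100, 2, 0.0, rng)
codes_far = gaussian_sample(100, 2, 3.0, rng)

mmd_close = mmd_squared(codes_close, prior)
mmd_far = mmd_squared(codes_far, prior)
print(f"MMD^2 near prior: {mmd_close:.4f}")
print(f"MMD^2 far from prior: {mmd_far:.4f}")
```

The estimate is small when the two sample sets come from the same distribution and grows as they diverge, which is why it can serve as a penalty that pulls the distribution of latent codes toward the prior.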

Identifiers

pubmed: 33737557
doi: 10.1038/s41598-021-85285-4
pii: 10.1038/s41598-021-85285-4
pmc: PMC7973750

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

6265

Comments and corrections

Type: ErratumIn

Authors

Muta Tah Hira (MT)

School of Health and Life Sciences, Teesside University, Middlesbrough, TS4 3BX, UK.

M A Razzaque (MA)

School of Computing, Eng. & Digital Tech., Teesside University, Middlesbrough, TS4 3BX, UK. m.razzaque@tees.ac.uk.

Claudio Angione (C)

School of Computing, Eng. & Digital Tech., Teesside University, Middlesbrough, TS4 3BX, UK.

James Scrivens (J)

School of Health and Life Sciences, Teesside University, Middlesbrough, TS4 3BX, UK.

Saladin Sawan (S)

The James Cook University Hospital, Middlesbrough, TS4 3BW, UK.

Mosharraf Sarker (M)

School of Health and Life Sciences, Teesside University, Middlesbrough, TS4 3BX, UK.
