MOGSA: Integrative Single Sample Gene-set Analysis of Multiple Omics Data.
Bioinformatics software
Computational Biology
Mass Spectrometry
Metabolomics
Multi-omics integration
RNA SEQ
gene set analysis
single sample
tumor subtype
Journal
Molecular & cellular proteomics : MCP
ISSN: 1535-9484
Abbreviated title: Mol Cell Proteomics
Country: United States
NLM ID: 101125647
Publication information
Publication date:
09 08 2019
History:
received: 30 11 2018
revised: 26 06 2019
pubmed: 28 6 2019
medline: 17 6 2020
entrez: 28 6 2019
Status:
ppublish
Abstract
Gene-set analysis (GSA) summarizes individual molecular measurements into more interpretable pathways or gene-sets and has become an indispensable step in the interpretation of large-scale omics data. However, GSA methods are limited to the analysis of single omics data. Here, we introduce a new computational method termed multi-omics gene-set analysis (MOGSA), a multivariate single sample gene-set analysis method that integrates multiple experimental and molecular data types measured over the same set of samples. The method learns a low-dimensional representation of the most variant correlated features (genes, proteins, etc.) across multiple omics data sets, transforms the features onto the same scale, and calculates an integrated gene-set score from the most informative features in each data type. MOGSA does not require filtering data to the intersection of features (gene IDs); therefore, all molecular features, including those that lack annotation, may be included in the analysis. Using simulated data, we demonstrate that integrating multiple diverse sources of molecular data increases the power to discover subtle changes in gene-sets and may reduce the impact of unreliable information in any single data type. Using real experimental data, we demonstrate three use cases of MOGSA. First, we show how to remove a source of noise (technical or biological) in integrative MOGSA of NCI60 transcriptome and proteome data. Second, we apply MOGSA to discover similarities and differences in mRNA, protein, and phosphorylation profiles of a small study of stem cell lines and assess the influence of each data type or feature on the total gene-set score. Finally, we apply MOGSA to cluster analysis and show that three molecular subtypes are robustly discovered when copy number variation and mRNA data of 308 bladder cancers from The Cancer Genome Atlas are integrated using MOGSA. MOGSA is available in the Bioconductor R package "mogsa".
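The scoring idea summarized in the abstract can be illustrated with a small, dependency-free sketch. This is not the mogsa implementation, and it makes several simplifying assumptions that the abstract does not state: each data block is centered per feature and scaled by its Frobenius norm to put blocks on a common scale, only a single latent component is kept (found by power iteration on the stacked blocks), and a gene-set score is taken as the sum of the rank-1 reconstructed values of the features in the set. The function names (`center_and_scale`, `leading_component`, `gene_set_scores`) and the toy data are hypothetical.

```python
import math
import random


def center_and_scale(block):
    """Center each feature (row), then divide the block by its Frobenius norm."""
    centered = []
    for row in block:
        mean = sum(row) / len(row)
        centered.append([v - mean for v in row])
    norm = math.sqrt(sum(v * v for row in centered for v in row)) or 1.0
    return [[v / norm for v in row] for row in centered]


def leading_component(rows, n_iter=200):
    """Power iteration on the stacked feature-by-sample matrix X.

    Returns unit-length sample scores t and feature loadings q = X t.
    """
    n = len(rows[0])
    rng = random.Random(0)
    t = [rng.random() - 0.5 for _ in range(n)]
    for _ in range(n_iter):
        q = [sum(r[j] * t[j] for j in range(n)) for r in rows]  # q = X t
        t = [sum(q[i] * rows[i][j] for i in range(len(rows))) for j in range(n)]  # t = X^T q
        length = math.sqrt(sum(v * v for v in t)) or 1.0
        t = [v / length for v in t]
    q = [sum(r[j] * t[j] for j in range(n)) for r in rows]
    return t, q


def gene_set_scores(blocks, gene_set):
    """Per-sample gene-set scores and the contribution of each data block.

    blocks: {block_name: {feature_id: [value per sample]}}
    gene_set: set of feature ids; features need not appear in every block.
    """
    owners, ids, rows = [], [], []
    for name, table in blocks.items():
        features = sorted(table)
        scaled = center_and_scale([table[f] for f in features])
        for feature, row in zip(features, scaled):
            owners.append(name)
            ids.append(feature)
            rows.append(row)
    t, q = leading_component(rows)
    n = len(t)
    contrib = {name: [0.0] * n for name in blocks}
    for name, feature, loading in zip(owners, ids, q):
        if feature in gene_set:
            for j in range(n):
                # Rank-1 reconstruction of this feature in sample j.
                contrib[name][j] += loading * t[j]
    total = [sum(contrib[name][j] for name in blocks) for j in range(n)]
    return total, contrib


# Toy data: four samples; the two blocks share only geneA.
blocks = {
    "mrna": {
        "geneA": [5.0, 6.0, 1.0, 2.0],
        "geneB": [4.0, 5.0, 1.0, 1.0],
        "geneC": [2.0, 2.0, 2.0, 3.0],
    },
    "protein": {
        "geneA": [3.0, 3.5, 1.0, 0.5],
        "geneD": [1.0, 2.0, 1.0, 2.0],
    },
}
total, contrib = gene_set_scores(blocks, {"geneA", "geneB"})
for j, score in enumerate(total):
    print(f"sample {j}: score={score:.3f}")
```

Because the per-block contributions are kept separately, the sketch also shows in miniature how the influence of each data type on the total score could be inspected, and the blocks are never filtered to their shared feature IDs.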
Identifiers
pubmed: 31243065
pii: S1535-9476(20)32768-7
doi: 10.1074/mcp.TIR118.001251
pmc: PMC6692785
Chemical substances
RNA, Messenger
0
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
S153-S168
Copyright information
© 2019 Meng et al.