A Topic Modeling Analysis of TCGA Breast and Lung Cancer Transcriptomic Data.

gene expression; network theory; network-based cancer data analysis; stochastic block modeling; topic modeling

Journal

Cancers
ISSN: 2072-6694
Abbreviated title: Cancers (Basel)
Country: Switzerland
NLM ID: 101526829

Publication information

Publication date:
16 Dec 2020
History:
received: 19 10 2020
revised: 07 12 2020
accepted: 11 12 2020
entrez: 19 12 2020
pubmed: 20 12 2020
medline: 20 12 2020
Status: epublish

Abstract

Topic modeling is a widely used technique for extracting relevant information from large arrays of data. The problem of finding a topic structure in a dataset was recently recognized to be analogous to the community detection problem in network theory. Leveraging this analogy, a new class of topic modeling strategies has been introduced to overcome some of the limitations of classical methods. This paper applies these recent ideas to TCGA transcriptomic data on breast and lung cancer. The inferred latent topic structure reconstructs the established cancer subtype organization well. Moreover, we identify specific topics that are enriched in genes known to play a role in the corresponding disease and that are strongly related to patient survival probability. Finally, we show that a simple neural network classifier operating in the low-dimensional topic space can predict the cancer subtype of a test expression sample with high accuracy.
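
The analogy between topic modeling and community detection can be illustrated with a minimal sketch. It is not the authors' pipeline: it only shows how an expression table can be read as a weighted bipartite sample-gene network, and how a grouping of genes (a "topic") yields a low-dimensional description of each sample. The function names, the toy counts, and the gene-to-topic assignment are all hypothetical; in an actual analysis the grouping would be inferred by a community detection method such as a stochastic block model.

```python
from collections import defaultdict


def bipartite_edges(counts):
    """Turn a {sample: {gene: count}} table into weighted sample-gene edges."""
    return [
        (sample, gene, count)
        for sample, genes in counts.items()
        for gene, count in genes.items()
        if count > 0  # a zero count means no edge in the network
    ]


def topic_mixture(edges, gene_to_topic):
    """Fraction of each sample's total edge weight that falls in each topic."""
    totals = defaultdict(float)
    per_topic = defaultdict(lambda: defaultdict(float))
    for sample, gene, weight in edges:
        totals[sample] += weight
        per_topic[sample][gene_to_topic[gene]] += weight
    return {
        sample: {topic: w / totals[sample] for topic, w in topics.items()}
        for sample, topics in per_topic.items()
    }


# Toy counts invented for illustration only (not TCGA values).
counts = {
    "sample_A": {"gene1": 8, "gene2": 2, "gene3": 0},
    "sample_B": {"gene1": 1, "gene2": 1, "gene3": 6},
}

# Hypothetical gene grouping; here it is assumed rather than inferred.
gene_to_topic = {"gene1": "topic_0", "gene2": "topic_0", "gene3": "topic_1"}

edges = bipartite_edges(counts)
mixture = topic_mixture(edges, gene_to_topic)
```

Under these assumptions, each sample is summarized by a short vector of topic weights instead of one value per gene, which is the kind of low-dimensional representation a downstream classifier could take as input.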

Identifiers

pubmed: 33339347
pii: cancers12123799
doi: 10.3390/cancers12123799
pmc: PMC7766023

Publication types

Journal Article

Languages

eng

Grants

Agency: Ministero dell'Istruzione, dell'Università e della Ricerca
ID: Departments of Excellence 2018-2022

Authors

Filippo Valle (F)

Physics Department, University of Turin and INFN, via P. Giuria 1, 10125 Turin, Italy.

Matteo Osella (M)

Physics Department, University of Turin and INFN, via P. Giuria 1, 10125 Turin, Italy.

Michele Caselle (M)

Physics Department, University of Turin and INFN, via P. Giuria 1, 10125 Turin, Italy.

MeSH classifications