Widespread redundancy in -omics profiles of cancer mutation states.


Journal

Genome biology
ISSN: 1474-760X
Abbreviated title: Genome Biol
Country: England
NLM ID: 100960660

Publication information

Publication date:
27 06 2022
History:
received: 15 11 2021
accepted: 14 06 2022
entrez: 27 6 2022
pubmed: 28 6 2022
medline: 30 6 2022
Status: epublish

Abstract

In studies of cellular function in cancer, researchers are increasingly able to choose from many -omics assays as functional readouts. Choosing the correct readout for a given study can be difficult, and it remains unclear which layer of cellular function is most suitable to capture the relevant signal. We consider prediction of cancer mutation status (presence or absence) from functional -omics data as a representative problem that presents an opportunity to quantify and compare the ability of different -omics readouts to capture signals of dysregulation in cancer. From the TCGA Pan-Cancer Atlas, which contains genetic alteration data, we focus on RNA sequencing, DNA methylation arrays, reverse phase protein arrays (RPPA), microRNA, and somatic mutational signatures as -omics readouts. Across a collection of genes recurrently mutated in cancer, RNA sequencing tends to be the most effective predictor of mutation state. For many of these genes, however, one or more other data types are approximately equally effective predictors. Performance varies more between mutations than between data types for the same mutation, and there is little difference between the top data types. We also find that combining data types into a single multi-omics model provides little or no improvement in predictive ability over the best individual data type. Based on our results, for the design of studies focused on the functional outcomes of cancer mutations, there are often multiple -omics types that can serve as effective readouts, although gene expression seems to be a reasonable default option.

Identifiers

pubmed: 35761387
doi: 10.1186/s13059-022-02705-y
pii: 10.1186/s13059-022-02705-y
pmc: PMC9238138

Chemical substances

MicroRNAs 0

Publication types

Journal Article; Research Support, N.I.H., Extramural

Languages

eng

Citation subsets

IM

Pagination

137

Grants

Agency: NCI NIH HHS
ID: R01 CA216265
Country: United States
Agency: NCI NIH HHS
ID: R01 CA253976
Country: United States
Agency: NCI NIH HHS
ID: R01 CA237170
Country: United States
Agency: NHGRI NIH HHS
ID: R01 HG010067
Country: United States
Agency: NCI NIH HHS
ID: R50 CA221675
Country: United States

Copyright information

© 2022. The Author(s).

Authors

Jake Crawford (J)

Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

Brock C Christensen (BC)

Department of Epidemiology, Geisel School of Medicine, Dartmouth College, Lebanon, NH, USA.
Department of Molecular and Systems Biology, Geisel School of Medicine, Dartmouth College, Lebanon, NH, USA.

Maria Chikina (M)

Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.

Casey S Greene (CS)

Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO, USA. casey.s.greene@cuanschutz.edu.
Center for Health AI, University of Colorado School of Medicine, Aurora, CO, USA. casey.s.greene@cuanschutz.edu.
