Community Assessment of the Predictability of Cancer Protein and Phosphoprotein Levels from Genomics and Transcriptomics.
cancer
crowdsourcing
genomics
machine learning
protein regulation
proteogenomics
proteomics
Journal
Cell systems
ISSN: 2405-4720
Abbreviated title: Cell Syst
Country: United States
NLM ID: 101656080
Publication information
Publication date:
26 08 2020
History:
received: 22 08 2019
revised: 12 03 2020
accepted: 29 06 2020
pubmed: 28 7 2020
medline: 29 9 2021
entrez: 26 7 2020
Status:
ppublish
Abstract
Cancer is driven by genomic alterations, but the processes causing this disease are largely performed by proteins. However, proteins are harder and more expensive to measure than genes and transcripts. To catalyze developments of methods to infer protein levels from other omics measurements, we leveraged crowdsourcing via the NCI-CPTAC DREAM proteogenomic challenge. We asked for methods to predict protein and phosphorylation levels from genomic and transcriptomic data in cancer patients. The best performance was achieved by an ensemble of models, including as predictors transcript level of the corresponding genes, interaction between genes, conservation across tumor types, and phosphosite proximity for phosphorylation prediction. Proteins from metabolic pathways and complexes were the best and worst predicted, respectively. The performance of even the best-performing model was modest, suggesting that many proteins are strongly regulated through translational control and degradation. Our results set a reference for the limitations of computational inference in proteogenomics. A record of this paper's transparent peer review process is included in the Supplemental Information.
Identifiers
pubmed: 32710834
pii: S2405-4712(20)30242-8
doi: 10.1016/j.cels.2020.06.013
Chemical substances
Phosphoproteins
0
Proteins
0
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
186-195.e9
Grants
Agency: NCI NIH HHS
ID: U24 CA210972
Country: United States
Investigators
Tunde Aderinwale
(T)
Ebrahim Afyounian
(E)
Piyush Agrawal
(P)
Mehreen Ali
(M)
Alicia Amadoz
(A)
Francisco Azuaje
(F)
John Bachman
(J)
Seohui Bae
(S)
Sherry Bhalla
(S)
José Carbonell-Caballero
(J)
Priyanka Chakraborty
(P)
Kumardeep Chaudhary
(K)
Yonghwa Choi
(Y)
Yoonjung Choi
(Y)
Cankut Çubuk
(C)
Sandeep Kumar Dhanda
(SK)
Joaquín Dopazo
(J)
Laura L Elo
(LL)
Ábel Fóthi
(Á)
Olivier Gevaert
(O)
Kirsi Granberg
(K)
Russell Greiner
(R)
Eunji Heo
(E)
Marta R Hidalgo
(MR)
Vivek Jayaswal
(V)
Hwisang Jeon
(H)
Minji Jeon
(M)
Sunil V Kalmady
(SV)
Yasuhiro Kambara
(Y)
Jaewoo Kang
(J)
Keunsoo Kang
(K)
Tony Kaoma
(T)
Harpreet Kaur
(H)
Hilal Kazan
(H)
Devishi Kesar
(D)
Juha Kesseli
(J)
Daehan Kim
(D)
Keonwoo Kim
(K)
Sang-Yoon Kim
(SY)
Sunkyu Kim
(S)
Sajal Kumar
(S)
Bora Lee
(B)
Heewon Lee
(H)
Yunpeng Liu
(Y)
Roland Luethy
(R)
Swapnil Mahajan
(S)
Mehrad Mahmoudian
(M)
Arnaud Muller
(A)
Petr V Nazarov
(PV)
Hien Nguyen
(H)
Matti Nykter
(M)
Shujiro Okuda
(S)
Sungsoo Park
(S)
Gajendra Pal Singh Raghava
(G)
Jagath C Rajapakse
(JC)
Tommi Rantapero
(T)
Hobin Ryu
(H)
Francisco Salavert
(F)
Sohrab Saraei
(S)
Ruby Sharma
(R)
Ari Siitonen
(A)
Artem Sokolov
(A)
Kartik Subramanian
(K)
Veronika Suni
(V)
Tomi Suomi
(T)
Léon-Charles Tranchevent
(LC)
Salman Sadullah Usmani
(SS)
Tommi Välikangas
(T)
Roberto Vega
(R)
Hua Zhong
(H)
Copyright information
Copyright © 2020 The Authors. Published by Elsevier Inc. All rights reserved.
Conflict of interest statement
The authors declare no competing interests.