PCprophet: a framework for protein complex prediction and differential analysis using proteomic data.


Journal

Nature Methods
ISSN: 1548-7105
Abbreviated title: Nat Methods
Country: United States
NLM ID: 101215604

Publication information

Publication date:
May 2021
History:
received: 06 05 2020
accepted: 03 03 2021
pubmed: 17 4 2021
medline: 28 7 2021
entrez: 16 4 2021
Status: ppublish

Abstract

Despite the availability of methods for analyzing protein complexes, systematic analysis of complexes under multiple conditions remains challenging. Approaches based on biochemical fractionation of intact, native complexes and correlation of protein profiles have shown promise. However, most approaches for interpreting cofractionation datasets to yield complex composition and rearrangements between samples depend considerably on protein-protein interaction inference. We introduce PCprophet, a toolkit built on size exclusion chromatography-sequential window acquisition of all theoretical mass spectrometry (SEC-SWATH-MS) data to predict protein complexes and characterize their changes across experimental conditions. We demonstrate improved performance of PCprophet over state-of-the-art approaches and introduce a Bayesian approach to analyze altered protein-protein interactions across conditions. We provide both command-line and graphical interfaces to support the application of PCprophet to any cofractionation MS dataset, independent of separation or quantitative liquid chromatography-MS workflow, for the detection and quantitative tracking of protein complexes and their physiological dynamics.
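
The abstract refers to approaches based on correlating protein profiles across fractions. As a purely illustrative sketch of that general idea (not PCprophet's algorithm or API), the snippet below computes a Pearson correlation between elution profiles. The protein names, the intensity values, and the `pearson` helper are hypothetical and introduced only for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length elution profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2 fractions)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat profile carries no co-elution signal; report 0 by convention.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical intensities across 8 fractions (made-up values, not real data).
profiles = {
    "protA": [0, 1, 5, 9, 5, 1, 0, 0],
    "protB": [0, 2, 10, 18, 10, 2, 0, 0],  # same peak shape as protA
    "protC": [0, 0, 0, 0, 1, 5, 9, 5],     # peak in later fractions
}

r_ab = pearson(profiles["protA"], profiles["protB"])
r_ac = pearson(profiles["protA"], profiles["protC"])
print(f"protA vs protB: {r_ab:.3f}")
print(f"protA vs protC: {r_ac:.3f}")
```

Under these assumed values, proteins whose peaks fall in the same fractions correlate more strongly than proteins that elute apart, which is the signal that cofractionation analyses build on.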

Identifiers

pubmed: 33859439
doi: 10.1038/s41592-021-01107-5
pii: 10.1038/s41592-021-01107-5

Chemical substances

Proteins 0

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

520-527

Authors

Andrea Fossati (A)

Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland.
Department of Biology, Institute of Molecular Health Sciences, ETH Zürich, Zürich, Switzerland.
Quantitative Biosciences Institute (QBI) and Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA, USA.
J. David Gladstone Institutes, San Francisco, CA, USA.

Chen Li (C)

Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland. Chen.Li@monash.edu.
Department of Biochemistry and Molecular Biology and Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Clayton, Victoria, Australia. Chen.Li@monash.edu.

Federico Uliana (F)

Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland.

Fabian Wendt (F)

Department of Health Sciences and Technology, Institute of Translational Medicine, ETH Zürich, Zürich, Switzerland.

Fabian Frommelt (F)

Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland.

Peter Sykacek (P)

Department of Biotechnology, University of Natural Resources and Life Sciences, Vienna, Austria.

Moritz Heusel (M)

Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland.
Division of Infection Medicine (BMC), Department of Clinical Sciences, Lund University, Lund, Sweden.

Mahmoud Hallal (M)

Department for BioMedical Research, University of Bern, Bern, Switzerland.

Isabell Bludau (I)

Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland.
Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany.

Tümay Capraz (T)

European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany.

Peng Xue (P)

Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland.
Institute of Biophysics, Chinese Academy of Sciences, Beijing, China.

Jiangning Song (J)

Department of Biochemistry and Molecular Biology and Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Clayton, Victoria, Australia.

Bernd Wollscheid (B)

Department of Health Sciences and Technology, Institute of Translational Medicine, ETH Zürich, Zürich, Switzerland.
Swiss Institute of Bioinformatics, Lausanne, Switzerland.

Anthony W Purcell (AW)

Department of Biochemistry and Molecular Biology and Infection and Immunity Program, Biomedicine Discovery Institute, Monash University, Clayton, Victoria, Australia.

Matthias Gstaiger (M)

Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland. matthias.gstaiger@imsb.biol.ethz.ch.

Ruedi Aebersold (R)

Department of Biology, Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland. aebersold@imsb.biol.ethz.ch.
Faculty of Science, University of Zürich, Zürich, Switzerland. aebersold@imsb.biol.ethz.ch.
