Discriminating physiological from non-physiological interfaces in structures of protein complexes: A community-wide study.

Keywords: crystal contacts; homodimers; potential energy; protein interactions; protein structure

Journal

Proteomics
ISSN: 1615-9861
Abbreviated title: Proteomics
Country: Germany
NLM ID: 101092707

Publication information

Publication date:
09 2023
History:
received: 04 02 2023
revised: 11 05 2023
accepted: 11 05 2023
medline: 6 9 2023
pubmed: 27 6 2023
entrez: 27 6 2023
Status: ppublish

Abstract

Reliably scoring and ranking candidate models of protein complexes, and assigning their oligomeric state from the structure of the crystal lattice, remain outstanding challenges. A community-wide effort was launched to tackle them. The latest resources on protein complexes and interfaces were used to derive a benchmark dataset of 1677 homodimer protein crystal structures, comprising a balanced mix of physiological and non-physiological complexes. The non-physiological complexes in the benchmark were selected to bury a similar or larger interface area than their physiological counterparts, making it more difficult for scoring functions to differentiate between them. Next, 252 functions for scoring protein-protein interfaces, previously developed by 13 groups, were collected and evaluated for their ability to discriminate between physiological and non-physiological complexes. Two combined predictors were then built: a simple consensus score generated from the best-performing score of each of the 13 groups, and a cross-validated Random Forest (RF) classifier. Both approaches showed excellent performance, with areas under the Receiver Operating Characteristic (ROC) curve of 0.93 and 0.94, respectively, outperforming the individual scores developed by the different groups. Additionally, AlphaFold2 engines recalled the physiological dimers with significantly higher accuracy than the non-physiological set, lending support to the reliability of our benchmark dataset annotations. Optimizing the combined power of interface scoring functions and evaluating it on challenging benchmark datasets appears to be a promising strategy.
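
The abstract does not say how the consensus score is computed, so the following is only a minimal sketch of one plausible scheme, not the authors' method. It assumes that each scoring function is oriented so that a higher value means "more likely physiological" (an energy-like score would be negated first), converts each function's scores to ranks so that different scales become comparable, averages the ranks, and evaluates the result with the area under the ROC curve. The function names (`roc_auc`, `average_ranks`, `consensus_score`) and the six-complex toy data are hypothetical and are not taken from the study.

```python
from statistics import mean


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive scores higher.
    A tied pair counts as half a win (Mann-Whitney U formulation).

    labels: 1 = physiological, 0 = non-physiological.
    scores: higher is assumed to mean "more likely physiological".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_ranks(scores):
    """Return 0-based ranks (lowest score gets rank 0); tied scores
    share the average of the ranks they span."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and scores[order[end + 1]] == scores[order[start]]):
            end += 1
        shared = (start + end) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def consensus_score(score_table):
    """Average the per-function ranks of each complex.
    score_table: one list of scores per scoring function."""
    ranked = [average_ranks(column) for column in score_table]
    return [mean(per_complex) for per_complex in zip(*ranked)]


# Made-up toy data: three physiological and three non-physiological
# homodimers, scored by three hypothetical functions on different scales.
labels = [1, 1, 1, 0, 0, 0]
score_table = [
    [0.9, 0.4, 0.8, 0.5, 0.2, 0.1],
    [12.0, 15.0, 8.5, 6.0, 9.0, 7.0],
    [3.0, 2.5, 2.0, 1.0, 0.5, 2.2],
]

consensus = consensus_score(score_table)
for index, column in enumerate(score_table):
    print(f"function {index}: AUC = {roc_auc(labels, column):.3f}")
print(f"consensus: AUC = {roc_auc(labels, consensus):.3f}")
```

In this toy example each function misranks a different pair of complexes and reaches an AUC of 8/9, while the rank average reaches 1.0 because the individual errors do not coincide. That only illustrates why combining scores can help; it says nothing about the values reported in the abstract.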

Identifiers

pubmed: 37365936
doi: 10.1002/pmic.202200323
pmc: PMC10937251
mid: NIHMS1966567

Chemical substances

Proteins 0

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

e2200323

Grants

Agency: NIGMS NIH HHS
ID: R01 GM133840
Country: United States
Agency: NIGMS NIH HHS
ID: R35 GM122517
Country: United States
Agency: NIGMS NIH HHS
ID: R35 GM136409
Country: United States

Copyright information

© 2023 Wiley-VCH GmbH.

Authors

Hugo Schweke (H)

Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, Israel.

Qifang Xu (Q)

Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania, USA.

Gerardo Tauriello (G)

Biozentrum, University of Basel & SIB Swiss Institute of Bioinformatics, Basel, Switzerland.

Lorenzo Pantolini (L)

Biozentrum, University of Basel & SIB Swiss Institute of Bioinformatics, Basel, Switzerland.

Torsten Schwede (T)

Biozentrum, University of Basel & SIB Swiss Institute of Bioinformatics, Basel, Switzerland.

Frédéric Cazals (F)

Centre Inria d'Université Côte d'Azur, Sophia-Antipolis, France.

Alix Lhéritier (A)

Amadeus SAS, Sophia-Antipolis, France.

Juan Fernandez-Recio (J)

Instituto de Ciencias de la Vid y del Vino (ICVV), CSIC-UR-Gobierno de La Rioja, Logroño, Spain.

Luis Angel Rodríguez-Lumbreras (LA)

Instituto de Ciencias de la Vid y del Vino (ICVV), CSIC-UR-Gobierno de La Rioja, Logroño, Spain.

Ora Schueler-Furman (O)

Department of Microbiology and Molecular Genetics, The Institute for Medical Research Israel-Canada, Hebrew University-Hadassah Medical School, Jerusalem, Israel.

Julia K Varga (JK)

Department of Microbiology and Molecular Genetics, The Institute for Medical Research Israel-Canada, Hebrew University-Hadassah Medical School, Jerusalem, Israel.

Brian Jiménez-García (B)

Computational Structural Biology Group, Department of Chemistry, Bijvoet Centre, Faculty of Science, Utrecht University, Utrecht, The Netherlands.
Zymvol Biomodeling SL, Barcelona, Spain.

Manon F Réau (MF)

Computational Structural Biology Group, Department of Chemistry, Bijvoet Centre, Faculty of Science, Utrecht University, Utrecht, The Netherlands.

Alexandre M J J Bonvin (AMJJ)

Computational Structural Biology Group, Department of Chemistry, Bijvoet Centre, Faculty of Science, Utrecht University, Utrecht, The Netherlands.

Castrense Savojardo (C)

Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy.

Pier-Luigi Martelli (PL)

Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy.

Rita Casadio (R)

Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy.

Jérôme Tubiana (J)

Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel.

Haim J Wolfson (HJ)

Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel.

Romina Oliva (R)

Department of Sciences and Technologies, University of Naples "Parthenope", Naples, Italy.

Didier Barradas-Bautista (D)

Kaust Visualization Lab, Core lab Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia.

Tiziana Ricciardelli (T)

Physical Sciences and Engineering Division, Kaust Catalysis Center, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia.

Luigi Cavallo (L)

Physical Sciences and Engineering Division, Kaust Catalysis Center, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia.

Česlovas Venclovas (Č)

Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania.

Kliment Olechnovič (K)

Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania.

Raphael Guerois (R)

Institute for Integrative Biology of the Cell (I2BC), Commissariat à l'Energie Atomique, CNRS, Université Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, France.

Jessica Andreani (J)

Institute for Integrative Biology of the Cell (I2BC), Commissariat à l'Energie Atomique, CNRS, Université Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette, France.

Juliette Martin (J)

Univ Lyon, Université Claude Bernard Lyon 1, CNRS, UMR 5086 MMSB, Lyon, France.

Xiao Wang (X)

Department of Computer Science, Purdue University, West Lafayette, Indiana, USA.

Genki Terashi (G)

Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA.

Daipayan Sarkar (D)

Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA.

Charles Christoffer (C)

Department of Computer Science, Purdue University, West Lafayette, Indiana, USA.

Tunde Aderinwale (T)

Department of Computer Science, Purdue University, West Lafayette, Indiana, USA.

Jacob Verburgt (J)

Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA.

Daisuke Kihara (D)

Department of Computer Science, Purdue University, West Lafayette, Indiana, USA.
Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA.

Anthony Marchand (A)

Laboratory of Protein Design and Immunoengineering, Ecole polytechnique fédérale de Lausanne (EPFL), Lausanne, Switzerland.

Bruno E Correia (BE)

Laboratory of Protein Design and Immunoengineering, Ecole polytechnique fédérale de Lausanne (EPFL), Lausanne, Switzerland.

Rui Duan (R)

Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA.

Liming Qiu (L)

Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA.

Xianjin Xu (X)

Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA.

Shuang Zhang (S)

Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA.

Xiaoqin Zou (X)

Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA.

Sucharita Dey (S)

Department of Bioscience and Bioengineering, Indian Institute of Technology Jodhpur, Karwar, Rajasthan, India.

Roland L Dunbrack (RL)

Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania, USA.

Emmanuel D Levy (ED)

Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, Israel.

Shoshana J Wodak (SJ)

VIB-VUB Center for Structural Biology, Brussels, Belgium.
