A multi-scale coevolutionary approach to predict interactions between protein domains.


Journal

PLoS computational biology
ISSN: 1553-7358
Abbreviated title: PLoS Comput Biol
Country: United States
NLM ID: 101238922

Publication information

Publication date:
10 2019
History:
received: 19 02 2019
accepted: 27 09 2019
revised: 31 10 2019
pubmed: 22 10 2019
medline: 6 2 2020
entrez: 22 10 2019
Status: epublish

Abstract

Interacting proteins and protein domains coevolve on multiple scales, from their correlated presence across species, to correlations in amino-acid usage. Genomic databases provide rapidly growing data for variability in genomic protein content and in protein sequences, calling for computational predictions of unknown interactions. We first introduce the concept of direct phyletic couplings, based on global statistical models of phylogenetic profiles. They strongly increase the accuracy of predicting pairs of related protein domains beyond simpler correlation-based approaches like phylogenetic profiling (80% vs. 30-50% positives out of the 1000 highest-scoring pairs). Combined with the direct coupling analysis of inter-protein residue-residue coevolution, we provide multi-scale evidence for direct but unknown interaction between protein families. An in-depth discussion shows these to be biologically sensible and directly experimentally testable. Negative phyletic couplings highlight alternative solutions for the same functionality, including documented cases of convergent evolution. Thereby our work proves the strong potential of global statistical modeling approaches to genome-wide coevolutionary analysis, far beyond the established use for individual protein complexes and domain-domain interactions.
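The contrast the abstract draws between correlation-based phylogenetic profiling and direct couplings can be illustrated with a toy sketch. This is not the authors' method: the abstract only says that direct phyletic couplings come from global statistical models of phylogenetic profiles, and the example below swaps in first-order partial correlation as a much simpler stand-in for separating direct from indirect signal. The presence/absence profiles, the domain names A, B, C, and the helper functions `pearson` and `partial_corr` are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def partial_corr(x, y, z):
    """Correlation of x and y after controlling for z (first-order partial correlation)."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical phylogenetic profiles: 1 = domain present in a species, 0 = absent.
# A and C each track B, but share no signal beyond what B explains.
A = [1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
B = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
C = [1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# Plain profile correlation (the "phylogenetic profiling" style score).
r_ab = pearson(A, B)
r_bc = pearson(B, C)
r_ac = pearson(A, C)

# Scores after removing the effect of the third domain.
d_ab = partial_corr(A, B, C)
d_ac = partial_corr(A, C, B)

print("correlation A-B:", round(r_ab, 3), " A-C:", round(r_ac, 3))
print("partial     A-B:", round(d_ab, 3), " A-C:", round(d_ac, 3))
```

In this toy data the A-C correlation is positive (0.25) even though the two profiles are linked only through B; once B is controlled for, the A-C score drops to zero while the A-B score stays clearly positive. A global model over all domain families generalizes this idea to many variables at once.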

Identifiers

pubmed: 31634362
doi: 10.1371/journal.pcbi.1006891
pii: PCOMPBIOL-D-19-00283
pmc: PMC6822775

Chemical substances

Amino Acids 0
Proteins 0

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

e1006891

Conflict of interest statement

The authors have declared that no competing interests exist.

Authors

Giancarlo Croce (G)

Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie computationnelle et quantitative-LCQB, Paris, France.

Thomas Gueudré (T)

Italian Institute for Genomic Medicine, Torino, Italy.

Maria Virginia Ruiz Cuevas (MV)

Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie computationnelle et quantitative-LCQB, Paris, France.

Victoria Keidel (V)

Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona CA, United States of America.

Matteo Figliuzzi (M)

Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie computationnelle et quantitative-LCQB, Paris, France.

Hendrik Szurmant (H)

Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona CA, United States of America.

Martin Weigt (M)

Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie computationnelle et quantitative-LCQB, Paris, France.
