Identification of hidden associations among eukaryotic genes through statistical analysis of coevolutionary transitions.

coevolution; gene association; glyoxylate cycle; purine catabolism; statistical significance

Journal

Proceedings of the National Academy of Sciences of the United States of America
ISSN: 1091-6490
Abbreviated title: Proc Natl Acad Sci U S A
Country: United States
NLM ID: 7505876

Publication information

Publication date:
18 04 2023
History:
medline: 14 4 2023
entrez: 12 4 2023
pubmed: 13 4 2023
Status: ppublish

Abstract

Coevolution at the gene level, as reflected by correlated events of gene loss or gain, can be revealed by phylogenetic profile analysis. The optimal method and metric for comparing phylogenetic profiles, especially in eukaryotic genomes, are not yet established. Here, we describe a procedure suitable for large-scale analysis, which can reveal coevolution based on the assessment of the statistical significance of correlated presence/absence transitions between gene pairs. This metric can identify coevolution in profiles with low overall similarities and is not affected by similarities lacking coevolutionary information. We applied the procedure to a large collection of 60,912 orthologous gene groups (orthogroups) in 1,264 eukaryotic genomes extracted from OrthoDB. We found significant cotransition scores for 7,825 orthogroups associated in 2,401 coevolving modules linking known and unknown genes in protein complexes and biological pathways. To demonstrate the ability of the method to predict hidden gene associations, we validated through experiments the involvement of vertebrate malate synthase-like genes in the conversion of (
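The idea of scoring correlated presence/absence transitions between two phylogenetic profiles can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the score normalization, and the use of a one-sided hypergeometric tail as the significance test are all assumptions made for illustration, and the profiles are assumed to be 0/1 lists already arranged in a fixed, phylogenetically meaningful species order.

```python
from math import comb


def transitions(profile):
    """Return +1 (gain), -1 (loss) or 0 between consecutive species."""
    return [b - a for a, b in zip(profile, profile[1:])]


def cotransition_stats(p1, p2):
    """Illustrative cotransition score and p-value for two 0/1 profiles.

    Hypothetical scoring, not taken from the paper:
    score = (concordant - discordant) / (positions with any transition).
    The p-value is the hypergeometric probability of observing at least
    as many shared transition positions by chance.
    """
    if len(p1) != len(p2):
        raise ValueError("profiles must cover the same species")
    t1, t2 = transitions(p1), transitions(p2)
    n = len(t1)                        # number of adjacent-species steps
    n1 = sum(1 for x in t1 if x)       # transitions in profile 1
    n2 = sum(1 for x in t2 if x)       # transitions in profile 2
    concordant = sum(1 for x, y in zip(t1, t2) if x and x == y)
    discordant = sum(1 for x, y in zip(t1, t2) if x and y and x != y)
    shared = concordant + discordant   # positions where both change
    union = n1 + n2 - shared
    score = (concordant - discordant) / union if union else 0.0
    tail = sum(comb(n1, i) * comb(n - n1, n2 - i)
               for i in range(shared, min(n1, n2) + 1))
    return score, tail / comb(n, n2)


if __name__ == "__main__":
    a = [1, 1, 0, 0, 1, 1, 0, 0]
    b = [1, 1, 0, 0, 1, 1, 0, 0]
    print(cotransition_stats(a, b))
```

In this sketch, two profiles that change state at the same positions receive a high score and a small p-value even if the genes are present in few species overall, while profiles that are similar only because both genes are present almost everywhere contribute no transitions and therefore no signal.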

Identifiers

pubmed: 37043529
doi: 10.1073/pnas.2218329120
pmc: PMC10120013

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

e2218329120

Grants

Agency: Ministero dell'Istruzione, dell'Università e della Ricerca (MIUR)
ID: 2017483NH8

Comments and corrections

Type: CommentIn

Authors

Elena Dembech (E)

Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma 43124, Italy.

Marco Malatesta (M)

Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma 43124, Italy.

Carlo De Rito (C)

Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma 43124, Italy.

Giulia Mori (G)

Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma 43124, Italy.

Davide Cavazzini (D)

Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma 43124, Italy.

Andrea Secchi (A)

Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma 43124, Italy.

Francesco Morandin (F)

Department of Mathematical, Physical and Computer Sciences, University of Parma, Parma 43124, Italy.

Riccardo Percudani (R)

Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma 43124, Italy.
