CAGECAT: The CompArative GEne Cluster Analysis Toolbox for rapid search and visualisation of homologous gene clusters.
Biosynthetic
Colocalized
Comparative analysis
Gene cluster
Homology search
Secondary metabolite
Journal
BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194
Publication information
Publication date:
03 May 2023
History:
received: 10 Feb 2023
accepted: 27 Apr 2023
medline: 04 May 2023
pubmed: 03 May 2023
entrez: 02 May 2023
Status:
epublish
Abstract
BACKGROUND
Co-localized sets of genes that encode specialized functions are common across microbial genomes and also occur in the genomes of larger eukaryotes. Important examples include Biosynthetic Gene Clusters (BGCs), which produce specialized metabolites with medicinal, agricultural, and industrial value (e.g. antimicrobials). Comparative analysis of BGCs can aid in the discovery of novel metabolites by highlighting their distribution and identifying variants in public genomes. Unfortunately, gene-cluster-level homology detection remains inaccessible, time-consuming, and difficult to interpret.
RESULTS
The comparative gene cluster analysis toolbox (CAGECAT) is a rapid and user-friendly platform that mitigates difficulties in the comparative analysis of whole gene clusters. The software provides homology searches and downstream analyses without the need for command-line or programming expertise. By leveraging remote BLAST databases, which always provide up-to-date results, CAGECAT can yield relevant matches that aid in studying the comparison, taxonomic distribution, or evolution of an unknown query. The service is extensible and interoperable, and it implements the cblaster and clinker pipelines to perform homology search, filtering, gene neighbourhood estimation, and dynamic visualisation of the resulting variant BGCs. With the visualisation module, publication-quality figures can be customized directly in a web browser, and informative overlays that identify conserved genes in a BGC query greatly accelerate their interpretation.
CONCLUSIONS
Overall, CAGECAT is extensible software that can be accessed through a standard web browser for whole-region homology searches and comparisons against continually updated genomes from NCBI. The public web server and installable Docker image are open source and freely available without registration at: https://cagecat.bioinformatics.nl.
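The abstract mentions homology search followed by filtering and gene neighbourhood estimation. As a rough illustration of what grouping co-localized hits can look like, the sketch below clusters homology hits that lie close together on the same contig and keeps only groups matching several distinct query genes. It is not the CAGECAT or cblaster implementation: the `Hit` record, the function name `colocalized_clusters`, and the thresholds `max_gap` and `min_unique_queries` are all assumptions chosen for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One homology hit (hypothetical record, not a cblaster data type)."""
    query: str   # query gene that produced the hit
    contig: str  # contig or scaffold carrying the subject gene
    start: int   # subject gene start coordinate (bp)
    end: int     # subject gene end coordinate (bp)


def colocalized_clusters(hits, max_gap=20000, min_unique_queries=2):
    """Group hits on one contig when consecutive genes are at most max_gap bp
    apart, then keep groups hit by at least min_unique_queries query genes.
    Both thresholds are illustrative assumptions."""
    by_contig = defaultdict(list)
    for hit in hits:
        by_contig[hit.contig].append(hit)

    groups = []
    for contig_hits in by_contig.values():
        contig_hits.sort(key=lambda h: h.start)
        group = [contig_hits[0]]
        group_end = contig_hits[0].end  # furthest end coordinate seen so far
        for hit in contig_hits[1:]:
            if hit.start - group_end <= max_gap:
                group.append(hit)
            else:
                groups.append(group)
                group = [hit]
            group_end = max(group_end, hit.end) if group[0] is not hit else hit.end
        groups.append(group)

    # Filter: a candidate cluster must match several different query genes.
    return [g for g in groups if len({h.query for h in g}) >= min_unique_queries]


# Toy input: geneA and geneB are neighbours on contig1; the other hits are isolated.
hits = [
    Hit("geneA", "contig1", 1000, 2000),
    Hit("geneB", "contig1", 2500, 3500),
    Hit("geneC", "contig1", 90000, 91000),
    Hit("geneA", "contig2", 500, 1500),
]
clusters = colocalized_clusters(hits)
for cluster in clusters:
    print(cluster[0].contig, [h.query for h in cluster])
```

With the toy input, only the geneA/geneB pair on contig1 survives the filter, because the remaining hits are either too far from their neighbours or match a single query gene.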
Identifiers
pubmed: 37131131
doi: 10.1186/s12859-023-05311-2
pii: 10.1186/s12859-023-05311-2
pmc: PMC10155394
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
181
Grants
Agency: European Research Council
ID: 948770-DECIPHER
Country: International
Copyright information
© 2023. The Author(s).