CLOCI: unveiling cryptic fungal gene clusters with generalized detection.


Journal

Nucleic acids research
ISSN: 1362-4962
Abbreviated title: Nucleic Acids Res
Country: England
NLM ID: 0411011

Publication information

Publication date:
17 Jul 2024
History:
accepted: 10 07 2024
revised: 01 07 2024
received: 13 11 2023
medline: 17 7 2024
pubmed: 17 7 2024
entrez: 17 7 2024
Status: aheadofprint

Abstract

Gene clusters are genomic loci that contain multiple genes that are functionally and genetically linked. Gene clusters collectively encode diverse functions, including small molecule biosynthesis, nutrient assimilation, metabolite degradation, and production of proteins essential for growth and development. Identifying gene clusters is a powerful tool for small molecule discovery and provides insight into the ecology and evolution of organisms. Current detection algorithms focus on canonical 'core' biosynthetic functions many gene clusters encode, while overlooking uncommon or unknown cluster classes. These overlooked clusters are a potential source of novel natural products and comprise an untold portion of overall gene cluster repertoires. Unbiased, function-agnostic detection algorithms therefore provide an opportunity to reveal novel classes of gene clusters and more precisely define genome organization. We present CLOCI (Co-occurrence Locus and Orthologous Cluster Identifier), an algorithm that identifies gene clusters using multiple proxies of selection for coordinated gene evolution. Our approach generalizes gene cluster detection and gene cluster family circumscription, improves detection of multiple known functional classes, and unveils non-canonical gene clusters. CLOCI is suitable for genome-enabled small molecule mining, and presents an easily tunable approach for delineating gene cluster families and homologous loci.

Identifiers

pubmed: 39016185
pii: 7715716
doi: 10.1093/nar/gkae625

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Grants

Agency: National Science Foundation
ID: DEB-1638999
Agency: Ohio State University

Copyright information

© The Author(s) 2024. Published by Oxford University Press on behalf of Nucleic Acids Research.

Authors

Zachary Konkel (Z)

Department of Plant Pathology, The Ohio State University, Columbus, OH 43210, USA.
Center for Applied Plant Sciences, The Ohio State University, Columbus, OH 43210, USA.

Laura Kubatko (L)

Department of Ecology and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA.
Department of Statistics, The Ohio State University, Columbus, OH 43210, USA.

Jason C Slot (JC)

Department of Plant Pathology, The Ohio State University, Columbus, OH 43210, USA.
Center for Applied Plant Sciences, The Ohio State University, Columbus, OH 43210, USA.

MeSH classifications