Reference-free structural variant detection in microbiomes via long-read co-assembly graphs.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
28 Jun 2024
History:
medline: 28 6 2024
pubmed: 28 6 2024
entrez: 28 6 2024
Status: ppublish

Abstract

The study of bacterial genome dynamics is vital for understanding the mechanisms underlying microbial adaptation and growth, and their impact on host phenotype. Structural variants (SVs), genomic alterations of 50 base pairs or more, play a pivotal role in driving evolutionary processes and maintaining genomic heterogeneity within bacterial populations. While SV detection in isolate genomes is relatively straightforward, metagenomes present broader challenges due to the absence of clear reference genomes and the presence of mixed strains. In response, our proposed method, rhea, forgoes reference genomes and metagenome-assembled genomes (MAGs) by combining all metagenomic samples in a series (ordered by time or another metric) into a single co-assembly graph. The log fold change in graph coverage between successive samples is then calculated to call SVs that are thriving or declining. We show that rhea outperforms existing methods for SV and horizontal gene transfer (HGT) detection in two simulated mock metagenomes, particularly as the simulated reads diverge from reference genomes and strain diversity increases. We additionally demonstrate use cases for rhea on series metagenomic data from environmental and fermented food microbiomes, detecting specific sequence alterations between successive time and temperature samples that suggest host advantage. Our approach builds on previous work on structural and coverage patterns in assembly graphs to provide versatility in studying SVs across diverse and poorly characterized microbial communities, for more comprehensive insights into microbial gene flux. rhea is open source and available at: https://github.com/treangenlab/rhea.
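
The coverage comparison described in the abstract can be illustrated with a minimal sketch. This is not rhea's implementation: the function names, the pseudocount of 1.0, the threshold of 1.0, and the toy coverage values are all assumptions chosen for illustration, and the real tool also uses graph structure, which is omitted here. The sketch only shows how a log2 fold change between successive samples could flag graph nodes whose coverage is rising or falling.

```python
import math


def log_fold_changes(coverage, pseudocount=1.0):
    """Compute log2 fold changes in coverage between successive samples.

    coverage: dict mapping a node id to its per-sample coverage values,
              listed in series order (hypothetical input layout).
    pseudocount: assumed smoothing constant that avoids division by zero.
    """
    changes = {}
    for node, series in coverage.items():
        changes[node] = [
            math.log2((later + pseudocount) / (earlier + pseudocount))
            for earlier, later in zip(series, series[1:])
        ]
    return changes


def flag_nodes(changes, threshold=1.0):
    """Label each (node, step) whose absolute log2 fold change meets the threshold.

    threshold: assumed cutoff; 1.0 corresponds to a two-fold change.
    """
    flagged = []
    for node, lfcs in changes.items():
        for step, lfc in enumerate(lfcs):
            if lfc >= threshold:
                flagged.append((node, step, "increasing"))
            elif lfc <= -threshold:
                flagged.append((node, step, "decreasing"))
    return flagged


# Toy coverage values for three graph nodes across three successive samples.
coverage = {
    "node_a": [10.0, 11.0, 9.0],   # roughly stable
    "node_b": [3.0, 15.0, 31.0],   # rising
    "node_c": [31.0, 7.0, 7.0],    # falling, then stable
}

changes = log_fold_changes(coverage)
flagged = flag_nodes(changes)
for node, step, direction in flagged:
    print(f"{node}: sample {step} -> {step + 1} is {direction}")
```

In this toy input, only node_b and node_c cross the assumed two-fold cutoff; node_a stays within it and is not reported. A real analysis would need to choose the pseudocount and threshold with sequencing depth in mind.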

Identifiers

pubmed: 38940156
pii: 7700881
doi: 10.1093/bioinformatics/btae224

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

i58-i67

Grants

Agency: Ken Kennedy Institute Recruiting
Agency: Rice University Wagoner Foreign Study Scholarship
Agency: NIH HHS
ID: P01-AI152999
Country: United States
Agency: National Institute of Allergy and Infectious Diseases
Agency: NSF
ID: IIS-2239114
Agency: NSF
Agency: MIM Universal Rules of Live
ID: EF-2126387
Agency: European Union's Horizon 2020
Agency: Marie Skłodowska-Curie
ID: 872539
Agency: Carnegie Institution for Science
Agency: Department of Energy Joint Genome Institute
Agency: Office of Science
Agency: Department of Energy
Agency: NSF
ID: 2023333162

Copyright information

© The Author(s) 2024. Published by Oxford University Press.

Authors

Kristen D Curry (KD)

Department of Computer Science, Rice University, 6100 Main St., Houston, TX 77005, United States.
Department of Genomes and Genetics, Microbial Evolutionary Genomics, Institut Pasteur, Université Paris Cité, CNRS, UMR3525, Paris 75015, France.

Feiqiao Brian Yu (FB)

Arc Institute, Palo Alto, CA 94304, United States.

Summer E Vance (SE)

Department of Environmental Science, Policy, and Management, University of California, Berkeley, CA 94720, United States.

Santiago Segarra (S)

Department of Electrical and Computer Engineering, Rice University, Houston, TX 77005, United States.

Devaki Bhaya (D)

Carnegie Institution for Science, Department of Plant Biology, Stanford, CA 94305, United States.

Rayan Chikhi (R)

Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris 75015, France.

Eduardo P C Rocha (EPC)

Department of Genomes and Genetics, Microbial Evolutionary Genomics, Institut Pasteur, Université Paris Cité, CNRS, UMR3525, Paris 75015, France.

Todd J Treangen (TJ)

Department of Computer Science, Rice University, 6100 Main St., Houston, TX 77005, United States.
