A Guide to Gene-Centric Analysis Using TreeSAPP.

metagenomics; methanogenesis; microbial ecology; phylogenetic placement

Journal

Current protocols
ISSN: 2691-1299
Abbreviated title: Curr Protoc
Country: United States
NLM ID: 101773894

Publication information

Publication date:
Feb 2023
History:
entrez: 2023-02-21
pubmed: 2023-02-22
medline: 2023-02-25
Status: ppublish

Abstract

Gene-centric analysis is commonly used to chart the structure, function, and activity of microbial communities in natural and engineered environments. A common approach is to create custom ad hoc reference marker gene sets, but these come with the typical disadvantages of inaccuracy and limited utility beyond assigning query sequences taxonomic labels. The Tree-based Sensitive and Accurate Phylogenetic Profiler (TreeSAPP) software package standardizes analysis of phylogenetic and functional marker genes and improves predictive performance using a classification algorithm that leverages information-rich reference packages consisting of a multiple sequence alignment, a profile hidden Markov model, taxonomic lineage information, and a phylogenetic tree. Here, we provide a set of protocols that link the various analysis modules in TreeSAPP into a coherent process that both informs and directs the user experience. This workflow, initiated from a collection of candidate reference sequences, progresses through construction and refinement of a reference package to marker identification and normalized relative abundance calculations for homologous sequences in metagenomic and metatranscriptomic datasets. The alpha subunit of methyl-coenzyme M reductase (McrA) involved in biological methane cycling is presented as a use case given its dual role as a phylogenetic and functional marker gene driving an ecologically relevant process. These protocols fill several gaps in prior TreeSAPP documentation and provide best practices for reference package construction and refinement, including manual curation steps from trusted sources in support of reproducible gene-centric analysis. © 2023 The Authors. Current Protocols published by Wiley Periodicals LLC. 
Basic Protocol 1: Creating reference packages
Support Protocol 1: Installing TreeSAPP
Support Protocol 2: Annotating traits within a phylogenetic context
Basic Protocol 2: Updating reference packages
Basic Protocol 3: Calculating relative abundance of genes in metagenomic and metatranscriptomic datasets
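
The abstract mentions normalized relative abundance calculations for marker gene homologs but does not say which normalization is applied. As a rough illustration only, the sketch below assumes a TPM-style (transcripts per million) length normalization, a common choice for this kind of analysis. The function name, sequence identifiers, read counts, and lengths are all hypothetical and are not taken from TreeSAPP or from the protocols.

```python
from typing import Dict


def tpm(read_counts: Dict[str, int], lengths_bp: Dict[str, int]) -> Dict[str, float]:
    """Return TPM-style normalized abundances (assumed metric, not from the source).

    read_counts: reads mapped to each sequence (hypothetical input).
    lengths_bp:  length of each sequence in base pairs.
    """
    # Step 1: reads per kilobase, which corrects for sequence length.
    rpk: Dict[str, float] = {}
    for seq_id, count in read_counts.items():
        length = lengths_bp[seq_id]
        if length <= 0:
            raise ValueError(f"non-positive length for {seq_id}")
        rpk[seq_id] = count / (length / 1000.0)

    # Step 2: scale so that all values sum to one million.
    total = sum(rpk.values())
    if total == 0:
        return {seq_id: 0.0 for seq_id in rpk}
    return {seq_id: value / total * 1_000_000 for seq_id, value in rpk.items()}


# Hypothetical example: two McrA homologs plus one unrelated sequence.
counts = {"mcrA_contig_1": 300, "mcrA_contig_2": 100, "other": 600}
lengths = {"mcrA_contig_1": 1500, "mcrA_contig_2": 1000, "other": 3000}
abundances = tpm(counts, lengths)
```

Because each value is divided by sequence length before scaling, a long sequence with many reads does not automatically dominate a short one, which is what makes the values comparable across genes within a sample.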

Identifiers

pubmed: 36801973
doi: 10.1002/cpz1.671

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

e671

Grants

Agency: Natural Sciences and Engineering Research Council
Agency: Genome Canada
Agency: Genome British Columbia
Agency: Digital Research Alliance of Canada
Agency: G. Unger Vetlesen and Ambrose Monell Foundations
Agency: U.S. Department of Energy (DOE) Joint Genome Institute (JGI)
Agency: Facilities Integrating Collaborations for User Science (FICUS)

Copyright information

© 2023 The Authors. Current Protocols published by Wiley Periodicals LLC.

Authors

Connor Morgan-Lang (C)

Graduate Program in Bioinformatics, University of British Columbia, Genome Sciences Centre, Vancouver, British Columbia, Canada.

Steven J Hallam (SJ)

Graduate Program in Bioinformatics, University of British Columbia, Genome Sciences Centre, Vancouver, British Columbia, Canada.
Department of Microbiology and Immunology, University of British Columbia, Vancouver, British Columbia, Canada.
Genome Science and Technology Program, University of British Columbia, Vancouver, British Columbia, Canada.
Life Sciences Institute, University of British Columbia, Vancouver, British Columbia, Canada.
ECOSCOPE Training Program, University of British Columbia, Vancouver, British Columbia, Canada.

MeSH classifications