NCycDB: a curated integrative database for fast and accurate metagenomic profiling of nitrogen cycling genes.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
15 03 2019
History:
received: 25 04 2018
revised: 06 08 2018
accepted: 23 08 2018
pubmed: 31 8 2018
medline: 1 1 2020
entrez: 31 8 2018
Status: ppublish

Abstract

The nitrogen (N) cycle is a collection of important biogeochemical pathways in the Earth's ecosystem and has received extensive attention in ecological and environmental studies. Shotgun metagenome sequencing is now widely applied to explore the gene families responsible for N cycle processes. However, applying publicly available orthology databases to profile N cycle gene families in shotgun metagenomes is problematic because of inefficient database searching, nonspecific orthology groups and low coverage of N cycle genes and/or gene (sub)families. To solve these issues, this study built a manually curated integrative database (NCycDB) for fast and accurate profiling of N cycle gene (sub)families from shotgun metagenome sequencing data. NCycDB contains a total of 68 gene (sub)families and covers eight N cycle processes, with 84 759 and 219 146 representative sequences at 95% and 100% identity cutoffs, respectively. We also identified 1958 homologous orthology groups and included the corresponding sequences in the database to avoid false positive assignments due to 'small database' issues. We applied NCycDB to characterize N cycle gene (sub)families in 52 shotgun metagenomes from the Global Ocean Sampling expedition. Further analysis showed that the structure and composition of N cycle gene families were most strongly correlated with latitude and temperature. NCycDB is expected to facilitate N cycle studies via shotgun metagenome sequencing approaches in various environments. The framework developed in this study can serve as a good reference for building similar knowledge-based functional gene databases for various processes and pathways. NCycDB database files are available at https://github.com/qichao1984/NCyc. Supplementary data are available at Bioinformatics online.

Identifiers

pubmed: 30165481
pii: 5085377
doi: 10.1093/bioinformatics/bty741

Chemical substances

Nitrogen N762921K75

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

1040-1048

Copyright information

© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Authors

Qichao Tu (Q)

Institute of Marine Science and Technology, Shandong University, Qingdao, China.

Lu Lin (L)

Institute of Marine Science and Technology, Shandong University, Qingdao, China.

Lei Cheng (L)

Department of Ecology, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang, China.

Ye Deng (Y)

Research Center for Eco-Environmental Science, Chinese Academy of Sciences, Beijing, China.
College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, China.

Zhili He (Z)

Department of Environmental Science, School of Environmental Science and Engineering, Sun Yat-Sen University, Guangzhou, Guangdong, China.
Department of Agriculture, College of Agriculture, Hunan Agricultural University, Changsha, China.

MeSH classifications