STRONG: metagenomics strain resolution on assembly graphs.


Journal

Genome biology
ISSN: 1474-760X
Abbreviated title: Genome Biol
Country: England
NLM ID: 100960660

Publication information

Publication date:
26 07 2021
History:
received: 22 08 2020
accepted: 29 06 2021
entrez: 27 7 2021
pubmed: 28 7 2021
medline: 22 1 2022
Status: epublish

Abstract

We introduce STrain Resolution ON assembly Graphs (STRONG), which identifies strains de novo from multiple metagenome samples. STRONG performs coassembly and binning into metagenome-assembled genomes (MAGs), and stores the coassembly graph prior to variant simplification. This enables the subgraphs, and their per-sample unitig coverages, to be extracted for individual single-copy core genes (SCGs) in each MAG. A Bayesian algorithm, BayesPaths, determines the number of strains present, their haplotypes (sequences on the SCGs), and their abundances. STRONG is validated using synthetic communities and, for a real anaerobic digester time series, generates haplotypes that match those observed from long Nanopore reads.
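
The idea that strain haplotypes and abundances can be read off per-sample unitig coverages may be easier to follow with a toy example. The sketch below is not BayesPaths and is not taken from the STRONG code base; it assumes, for illustration only, that the coverage of a unitig in a sample is the sum of the abundances of the strains whose path through the SCG subgraph includes that unitig. The unitig names, strain names, abundances, and both function names are hypothetical. With the two paths given, it recovers the abundances by ordinary least squares, whereas the abstract describes a Bayesian algorithm that also infers the paths and the number of strains.

```python
from typing import Dict, List

# Hypothetical toy data: the unitigs each strain traverses in one SCG subgraph.
STRAIN_PATHS: Dict[str, List[str]] = {
    "strain_A": ["u1", "u2", "u4"],  # takes the u2 branch of a variant bubble
    "strain_B": ["u1", "u3", "u4"],  # takes the u3 branch
}

# Hypothetical per-sample abundances (two samples) used to simulate coverage.
TRUE_ABUNDANCES: Dict[str, List[float]] = {
    "strain_A": [10.0, 2.0],
    "strain_B": [5.0, 8.0],
}


def expected_coverage(
    paths: Dict[str, List[str]], abundances: Dict[str, List[float]]
) -> Dict[str, List[float]]:
    """Coverage of each unitig per sample, assuming it equals the summed
    abundance of every strain whose path contains that unitig."""
    unitigs = sorted({u for path in paths.values() for u in path})
    n_samples = len(next(iter(abundances.values())))
    coverage = {u: [0.0] * n_samples for u in unitigs}
    for strain, path in paths.items():
        for u in path:
            for s in range(n_samples):
                coverage[u][s] += abundances[strain][s]
    return coverage


def fit_two_strain_abundances(
    paths: Dict[str, List[str]], coverage: Dict[str, List[float]], sample: int
) -> Dict[str, float]:
    """Least-squares abundances of exactly two strains in one sample,
    solved with the 2x2 normal equations (paths are assumed known)."""
    (name_a, path_a), (name_b, path_b) = paths.items()
    s_aa = s_bb = s_ab = s_ay = s_by = 0.0
    for u, cov in coverage.items():
        a = 1.0 if u in path_a else 0.0  # does strain A traverse unitig u?
        b = 1.0 if u in path_b else 0.0  # does strain B traverse unitig u?
        y = cov[sample]
        s_aa += a * a
        s_bb += b * b
        s_ab += a * b
        s_ay += a * y
        s_by += b * y
    det = s_aa * s_bb - s_ab * s_ab
    if det == 0.0:
        raise ValueError("paths are indistinguishable; abundances not identifiable")
    return {
        name_a: (s_bb * s_ay - s_ab * s_by) / det,
        name_b: (s_aa * s_by - s_ab * s_ay) / det,
    }


if __name__ == "__main__":
    cov = expected_coverage(STRAIN_PATHS, TRUE_ABUNDANCES)
    for sample_index in range(2):
        print(sample_index, fit_two_strain_abundances(STRAIN_PATHS, cov, sample_index))
```

Only the unitigs where the paths differ (u2 and u3 here) separate the two strains; shared unitigs carry the summed coverage. This is why the abstract stresses keeping the coassembly graph from before variant simplification, since simplification would collapse such branches.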

Identifiers

pubmed: 34311761
doi: 10.1186/s13059-021-02419-7
pii: 10.1186/s13059-021-02419-7
pmc: PMC8311964

Publication types

Journal Article
Research Support, N.I.H., Intramural
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

214

Grants

Agency: Biotechnology and Biological Sciences Research Council
ID: BB/R015171/1
Country: United Kingdom
Agency: Medical Research Council
ID: MR/M50161X/1
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/N023285/1
Country: United Kingdom
Agency: Medical Research Council
ID: MR/S037195/1
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/K003240/2
Country: United Kingdom
Agency: Medical Research Council
ID: MR/L015080/1
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/L502029/1
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BBS/E/T/000PR9817
Country: United Kingdom

Copyright information

© 2021. The Author(s).

Authors

Christopher Quince (C)

Organisms and Ecosystems, Earlham Institute, Norwich, NR4 7UZ, UK. christopher.quince@earlham.ac.uk.
Gut Microbes and Health, Quadram Institute, Norwich, NR4 7UQ, UK. christopher.quince@earlham.ac.uk.
Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK. christopher.quince@earlham.ac.uk.

Sergey Nurk (S)

Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, 20892, MD, USA. sergey.nurk@nih.gov.

Sebastien Raguideau (S)

Organisms and Ecosystems, Earlham Institute, Norwich, NR4 7UZ, UK.
Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK.

Robert James (R)

Gut Microbes and Health, Quadram Institute, Norwich, NR4 7UQ, UK.

Orkun S Soyer (OS)

School of Life Sciences, University of Warwick, Coventry, CV4 7AL, UK.

J Kimberly Summers (JK)

Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK.

Antoine Limasset (A)

Univ. Lille, CNRS, Inria, UMR 9189 - CRIStAL, Lille, France.

A Murat Eren (AM)

Department of Medicine, University of Chicago, Chicago, Illinois, USA.
Josephine Bay Paul Center, Marine Biological Laboratory, Woods Hole, Massachusetts, USA.

Rayan Chikhi (R)

Department of Computational Biology, Institut Pasteur, C3BI USR 3756 IP CNRS, Paris, France.

Aaron E Darling (AE)

The iThree institute, University of Technology Sydney, 15 Broadway, Ultimo, 2007, NSW, Australia.
