Mash Screen: high-throughput sequence containment estimation for genome discovery.


Journal

Genome biology
ISSN: 1474-760X
Abbreviated title: Genome Biol
Country: England
NLM ID: 100960660

Publication information

Publication date:
5 Nov 2019
History:
received: 27 Feb 2019
accepted: 27 Sep 2019
entrez: 7 Nov 2019
pubmed: 7 Nov 2019
medline: 6 Feb 2020
Status: epublish

Abstract

The MinHash algorithm has proven effective for rapidly estimating the resemblance of two genomes or metagenomes. However, this method cannot reliably estimate the containment of a genome within a metagenome. Here, we describe an online algorithm capable of measuring the containment of genomes and proteomes within either assembled or unassembled sequencing read sets. We describe several use cases, including contamination screening and retrospective analysis of metagenomes for novel genome discovery. Using this tool, we provide containment estimates for every NCBI RefSeq genome within every SRA metagenome and demonstrate the identification of a novel polyomavirus species from a public metagenome.
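To make the idea of containment concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the published Mash Screen implementation: the function names (`kmer_hashes`, `bottom_sketch`, `screen`), the BLAKE2b hash, and the parameter values are all hypothetical choices, and reverse-complement (canonical) k-mers are ignored for brevity. The reference genome is reduced to its smallest k-mer hashes, the reads are streamed once, and the containment estimate is the fraction of sketch hashes observed in the reads.

```python
import hashlib


def kmer_hashes(seq, k):
    """Yield a 64-bit hash for every k-mer in seq (assumed hash: BLAKE2b)."""
    for i in range(len(seq) - k + 1):
        digest = hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest()
        yield int.from_bytes(digest, "big")


def bottom_sketch(seq, k, s):
    """Return the s smallest distinct k-mer hashes of seq (a bottom-s sketch)."""
    return set(sorted(set(kmer_hashes(seq, k)))[:s])


def screen(sketch, reads, k):
    """Stream the reads once; return the fraction of sketch hashes observed."""
    seen = set()
    for read in reads:
        for h in kmer_hashes(read, k):
            if h in sketch:
                seen.add(h)
    return len(seen) / len(sketch) if sketch else 0.0


# Toy data (hypothetical): a short "genome" and reads tiling it completely.
genome = "ACGTTGCAAGGCTTACGATCGGATCCATGCAAGTCTAGGCTA"
reads = [genome[i:i + 20] for i in range(len(genome) - 19)]
sketch = bottom_sketch(genome, k=8, s=10)
containment = screen(sketch, reads, k=8)
```

Because each read is processed once and only membership in the fixed sketch is tracked, the estimate can be updated as reads arrive, which is the sense in which such a scheme is "online". Unlike a resemblance (Jaccard) estimate, the denominator here is the genome's sketch alone, so a large metagenome does not dilute the score.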

Identifiers

pubmed: 31690338
doi: 10.1186/s13059-019-1841-x
pii: 10.1186/s13059-019-1841-x
pmc: PMC6833257

Chemical substances

Proteome 0

Publication types

Journal Article
Research Support, N.I.H., Intramural

Languages

eng

Citation subsets

IM

Pagination

232

Authors

Brian D Ondov (BD)

Genome Informatics section, National Human Genome Research Institute, Bethesda, MD, USA. brian.ondov@nih.gov.
Department of Computer Science, University of Maryland College Park, College Park, MD, USA. brian.ondov@nih.gov.

Gabriel J Starrett (GJ)

Tumor Virus Molecular Biology section, National Cancer Institute, Bethesda, MD, USA.

Anna Sappington (A)

Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA.

Aleksandra Kostic (A)

Department of Computer Science, Princeton University, Princeton, NJ, USA.

Sergey Koren (S)

Genome Informatics section, National Human Genome Research Institute, Bethesda, MD, USA.

Christopher B Buck (CB)

Tumor Virus Molecular Biology section, National Cancer Institute, Bethesda, MD, USA.

Adam M Phillippy (AM)

Genome Informatics section, National Human Genome Research Institute, Bethesda, MD, USA.
