GenMap: ultra-fast computation of genome mappability.
Journal
Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944
Publication information
Publication date:
01 06 2020
History:
received: 14 12 2019
revised: 23 03 2020
accepted: 31 03 2020
pubmed: 5 4 2020
medline: 29 12 2020
entrez: 5 4 2020
Status:
ppublish
Abstract
Computing the uniqueness of k-mers for each position of a genome while allowing for up to e mismatches is computationally challenging. However, it is crucial for many biological applications such as the design of guide RNA for CRISPR experiments. More formally, the uniqueness or (k, e)-mappability of a position is the reciprocal of the number of approximate occurrences in the genome of the k-mer starting at that position, i.e. occurrences with up to e mismatches. We present GenMap, a fast method to compute the (k, e)-mappability. We extend the mappability algorithm so that mappability can also be computed across multiple genomes, where a k-mer occurrence is counted only once per genome. This allows for the computation of marker sequences or finding candidates for probe design by identifying approximate k-mers that are unique to a genome or that are present in all genomes. GenMap supports several output formats: binary output, wig and bed files, and csv files that export the locations of all approximate k-mers for each genomic position. GenMap can be installed via bioconda. Binaries and C++ source code are available on https://github.com/cpockrandt/genmap.
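To make the definition of (k, e)-mappability concrete, the following is a minimal brute-force sketch in Python. It is not GenMap's algorithm and makes no attempt to be fast: it compares every k-mer against every other k-mer in quadratic time. The function names (`hamming_within`, `naive_mappability`, `genomes_with_kmer`) are hypothetical and chosen for illustration only, and the sketch assumes forward-strand matching under Hamming distance.

```python
def hamming_within(a: str, b: str, e: int) -> bool:
    """Return True if equal-length strings a and b differ in at most e positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > e:
                return False
    return True


def naive_mappability(genome: str, k: int, e: int) -> list[float]:
    """Brute-force (k, e)-mappability (assumption: forward strand only).

    For each start position, count the k-mers in the genome that match the
    k-mer at that position with up to e mismatches (the k-mer itself is
    included), and return the reciprocal of that count.
    """
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    result = []
    for query in kmers:
        count = sum(hamming_within(query, other, e) for other in kmers)
        result.append(1.0 / count)
    return result


def genomes_with_kmer(kmer: str, genomes: list[str], e: int) -> int:
    """Count genomes containing at least one approximate occurrence of kmer.

    Each genome contributes at most 1, mirroring the idea that an occurrence
    is counted only once per genome (illustrative interpretation).
    """
    k = len(kmer)
    hits = 0
    for genome in genomes:
        if any(hamming_within(kmer, genome[i:i + k], e)
               for i in range(len(genome) - k + 1)):
            hits += 1
    return hits


if __name__ == "__main__":
    print(naive_mappability("ACGTACGA", k=4, e=0))  # [1.0, 1.0, 1.0, 1.0, 1.0]
    print(naive_mappability("ACGTACGA", k=4, e=1))  # [0.5, 1.0, 1.0, 1.0, 0.5]
    print(genomes_with_kmer("ACGT", ["ACGTACGA", "TTTT", "AACGA"], e=1))  # 2
```

In the toy genome `ACGTACGA`, all five 4-mers are distinct, so every position has mappability 1 with e = 0. Allowing one mismatch makes `ACGT` and `ACGA` approximate occurrences of each other, which halves the mappability at positions 0 and 4. A k-mer for which `genomes_with_kmer` returns 1 would be a candidate marker for a single genome, while a return value equal to the number of genomes indicates a k-mer present in all of them.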
Identifiers
pubmed: 32246826
pii: 5815974
doi: 10.1093/bioinformatics/btaa222
pmc: PMC7320602
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
3687-3692
Grants
Agency: NIGMS NIH HHS
ID: R35 GM130151
Country: United States
Copyright information
© The Author(s) 2020. Published by Oxford University Press.