The K-mer File Format: a standardized and compact disk representation of sets of k-mers.
Journal
Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944
Publication information
Publication date: 15 09 2022
History:
received: 18 03 2022
revised: 27 06 2022
accepted: 26 07 2022
pubmed: 30 07 2022
medline: 15 11 2022
entrez: 29 07 2022
Status:
ppublish
Abstract
Bioinformatics applications increasingly rely on ad hoc disk storage of k-mer sets, e.g. for de Bruijn graphs or alignment indexes. Here, we introduce the K-mer File Format as a general lossless framework for storing and manipulating k-mer sets, realizing space savings of 3-5× compared to other formats, and bringing interoperability across tools. Format specification, C++/Rust API, tools: https://github.com/Kmer-File-Format/. Supplementary data are available at Bioinformatics online.
Identifiers
pubmed: 35904548
pii: 6651834
doi: 10.1093/bioinformatics/btac528
pmc: PMC9477520
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
4423-4425
Grants
Agency: ANR Inception
ID: ANR-16-CONV-0005
Agency: PRAIRIE
ID: ANR-19-P3IA-0001
Agency: National Science Centre
ID: DEC-2019/33/B/ST6/02040
Agency: National Science Foundation
ID: 1453527
Agency: European Union's Horizon 2020 Research and Innovation Programme
Agency: Marie Skłodowska-Curie
ID: 956229
Copyright information
© The Author(s) 2022. Published by Oxford University Press.