Efficient dynamic variation graphs.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
29 01 2021
History:
received: 21 04 2020
revised: 20 06 2020
accepted: 09 07 2020
pubmed: 12 10 2020
medline: 10 8 2021
entrez: 11 10 2020
Status: ppublish

Abstract

Pangenomics is a growing field within computational genomics. Many pangenomic analyses use bidirected sequence graphs as their core data model. However, implementing and correctly using this data model can be difficult, and the scale of pangenomic datasets can be challenging to work at. These challenges have impeded progress in this field. Here, we present a stack of two C++ libraries, libbdsg and libhandlegraph, which use a simple, field-proven interface, designed to expose elementary features of these graphs while preventing common graph manipulation mistakes. The libraries also provide a Python binding. Using a diverse collection of pangenome graphs, we demonstrate that these tools allow for efficient construction and manipulation of large genome graphs with dense variation. For instance, the speed and memory usage are up to an order of magnitude better than the prior graph implementation in the VG toolkit, which has now transitioned to using libbdsg's implementations. libhandlegraph and libbdsg are available under an MIT License from https://github.com/vgteam/libhandlegraph and https://github.com/vgteam/libbdsg.
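
To make the data model named in the abstract concrete, here is a minimal sketch of a bidirected sequence graph in which each node is accessed through an oriented "handle". It is a toy written for illustration only: the class `ToyBidirectedGraph` and all of its method names are hypothetical, it is not the libbdsg or libhandlegraph API, and it makes no attempt at the memory efficiency the paper reports. It assumes a handle can be modeled as a `(node_id, is_reverse)` pair and that adding an edge also implies its reverse-complement traversal.

```python
"""Toy bidirected sequence graph (illustrative; NOT the libbdsg API)."""
from collections import defaultdict

# Complement table for DNA bases; N stays N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


class ToyBidirectedGraph:
    """Hypothetical minimal model: a handle is (node_id, is_reverse)."""

    def __init__(self):
        self._seqs = {}                  # node id -> forward-strand sequence
        self._edges = defaultdict(set)   # handle -> handles reachable on its right
        self._next_id = 1

    def create_node(self, sequence):
        """Store a sequence and return a forward-orientation handle to it."""
        node_id = self._next_id
        self._next_id += 1
        self._seqs[node_id] = sequence
        return (node_id, False)

    @staticmethod
    def flip(handle):
        """Return the same node in the opposite orientation."""
        node_id, is_reverse = handle
        return (node_id, not is_reverse)

    def get_sequence(self, handle):
        """Sequence as read along the handle's orientation."""
        node_id, is_reverse = handle
        seq = self._seqs[node_id]
        if is_reverse:
            return seq.translate(_COMPLEMENT)[::-1]  # reverse complement
        return seq

    def create_edge(self, left, right):
        """Connect left -> right; the reverse traversal is implied."""
        self._edges[left].add(right)
        self._edges[self.flip(right)].add(self.flip(left))

    def follow_edges(self, handle, go_left=False):
        """Handles adjacent to `handle` on its right (or left) side."""
        if go_left:
            # Left neighbors of h are the flips of right neighbors of flip(h).
            flipped = self._edges.get(self.flip(handle), ())
            return sorted(self.flip(h) for h in flipped)
        return sorted(self._edges.get(handle, ()))


# Usage: join forward node a to the reverse strand of node b.
graph = ToyBidirectedGraph()
a = graph.create_node("GATT")
b = graph.create_node("ACA")
graph.create_edge(a, graph.flip(b))
walk = graph.get_sequence(a) + graph.get_sequence(graph.flip(b))
```

Routing every access through an oriented handle means the caller never handles strand bookkeeping directly, which is one plausible way an interface can prevent the orientation mistakes the abstract alludes to.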

Identifiers

pubmed: 33040146
pii: 5872523
doi: 10.1093/bioinformatics/btaa640
pmc: PMC7850124

Publication types

Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

5139-5144

Grants

Agency: NHLBI NIH HHS
ID: U01 HL137183
Country: United States
Agency: Federal Ministry for Economic Affairs and Energy of Germany
Agency: W. M. Keck Foundation
ID: DT06172015
Agency: NHGRI NIH HHS
ID: T32 HG008345
Country: United States
Agency: Central Innovation Programme
Agency: NHGRI NIH HHS
ID: R01 HG010485
Country: United States

Copyright information

© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Authors

Jordan M Eizenga (JM)

Genomics Institute, Santa Cruz, CA 95064, USA.
Biomolecular Engineering and Bioinformatics, University of California Santa Cruz, Santa Cruz, CA 95064, USA.

Adam M Novak (AM)

Genomics Institute, Santa Cruz, CA 95064, USA.
Biomolecular Engineering and Bioinformatics, University of California Santa Cruz, Santa Cruz, CA 95064, USA.

Emily Kobayashi (E)

Genomics Institute, Santa Cruz, CA 95064, USA.
Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA 92093, USA.

Flavia Villani (F)

Institute of Genetics and Biophysics, Consiglio Nazionale di Ricerche, Naples 80131, Italy.
Biotecnologie Mediche, Università degli Studi di Napoli Federico II, Naples 80138, Italy.

Cecilia Cisar (C)

Genomics Institute, Santa Cruz, CA 95064, USA.
Biomolecular Engineering and Bioinformatics, University of California Santa Cruz, Santa Cruz, CA 95064, USA.

Simon Heumos (S)

Quantitative Biology Center (QBiC), University of Tübingen, Tübingen 72076, Germany.

Glenn Hickey (G)

Genomics Institute, Santa Cruz, CA 95064, USA.

Vincenza Colonna (V)

Institute of Genetics and Biophysics, Consiglio Nazionale di Ricerche, Naples 80131, Italy.

Benedict Paten (B)

Genomics Institute, Santa Cruz, CA 95064, USA.
Biomolecular Engineering and Bioinformatics, University of California Santa Cruz, Santa Cruz, CA 95064, USA.

Erik Garrison (E)

Genomics Institute, Santa Cruz, CA 95064, USA.
Biomolecular Engineering and Bioinformatics, University of California Santa Cruz, Santa Cruz, CA 95064, USA.
