PxBLAT: an efficient python binding library for BLAT.
BLAT
Sequence analysis
Software libraries
Journal
BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194
Publication information
Publication date: 19 Jun 2024
History:
received: 05 Mar 2024
accepted: 13 Jun 2024
medline: 20 Jun 2024
pubmed: 20 Jun 2024
entrez: 19 Jun 2024
Status:
epublish
Abstract
BACKGROUND
With the surge in genomic data driven by advancements in sequencing technologies, the demand for efficient bioinformatics tools for sequence analysis has become paramount. BLAST-like alignment tool (BLAT), a sequence alignment tool, faces limitations in performance efficiency and integration with modern programming environments, particularly Python. This study introduces PxBLAT, a Python-based framework designed to enhance the capabilities of BLAT, focusing on usability, computational efficiency, and seamless integration within the Python ecosystem.
RESULTS
PxBLAT demonstrates significant improvements over BLAT in execution speed and data handling, as evidenced by comprehensive benchmarks conducted across various sample groups ranging from 50 to 600 samples. These experiments highlight a notable speedup, reducing execution time compared to BLAT. The framework also introduces user-friendly features such as improved server management, data conversion utilities, and shell completion, enhancing the overall user experience. Additionally, the provision of extensive documentation and comprehensive testing supports community engagement and facilitates the adoption of PxBLAT.
CONCLUSIONS
PxBLAT stands out as a robust alternative to BLAT, offering performance and user interaction enhancements. Its development underscores the potential for modern programming languages to improve bioinformatics tools, aligning with the needs of contemporary genomic research. By providing a more efficient, user-friendly tool, PxBLAT has the potential to impact genomic data analysis workflows, supporting faster and more accurate sequence analysis in a Python environment.
Identifiers
pubmed: 38898394
doi: 10.1186/s12859-024-05844-0
pii: 10.1186/s12859-024-05844-0
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
219
Grants
Agency: US National Institute of General Medical Sciences
ID: R35GM142441
Copyright information
© 2024. The Author(s).