VCF observer: a user-friendly software tool for preliminary VCF file analysis and comparison.
Benchmarking
Comparison
Graphical
User-friendly
VCF
Visualization
Journal
BMC Bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194
Publication information
Publication date: 03 Sep 2024
History:
received: 24 Apr 2024
accepted: 10 Jul 2024
medline: 04 Sep 2024
pubmed: 04 Sep 2024
entrez: 03 Sep 2024
Status: epublish
Abstract
BACKGROUND
Advancements over the past decade in DNA sequencing technology and computing power have created the potential to revolutionize medicine. There has been a marked increase in available genetic data, allowing for the advancement of areas such as personalized medicine. A crucial type of data in this context is genetic variant data, which is stored in variant call format (VCF) files. However, the rapid growth in genomics has presented challenges in analyzing and comparing VCF files.
RESULTS
In response to the limitations of existing tools, this paper introduces a novel web application that provides a user-friendly solution for VCF file analyses and comparisons. The software tool enables researchers and clinicians to perform high-level analysis with ease and enhances productivity. The application's interface allows users to conveniently upload, analyze, and visualize their VCF files using simple drag-and-drop and point-and-click operations. Essential visualizations such as Venn diagrams, clustergrams, and precision-recall plots are provided to users. A key feature of the application is its support for metadata-based file grouping, accomplished through flexible data matrix uploads, streamlining organization and analysis of user-defined categories. Additionally, the application facilitates standardized benchmarking of VCF files by integrating user-provided ground truth regions and variant lists.
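The comparison and benchmarking operations described above can be illustrated with a minimal sketch. This is not the application's actual implementation, which the abstract does not describe; it assumes variants are matched exactly on (CHROM, POS, REF, ALT), that truth regions are given as BED-style 0-based half-open intervals, and that precision and recall are computed only inside those regions. All function names and the sample records are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Variant = Tuple[str, int, str, str]  # (CHROM, POS, REF, ALT)
Region = Tuple[str, int, int]        # (chrom, start, end), BED-style 0-based half-open


def parse_vcf(lines: Iterable[str]) -> Set[Variant]:
    """Collect one key per ALT allele from the data lines of a VCF file."""
    variants: Set[Variant] = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):  # skip meta-information and header lines
            continue
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        for allele in alt.split(","):  # multi-allelic sites list ALT alleles comma-separated
            variants.add((chrom, int(pos), ref, allele))
    return variants


def venn_counts(a: Set[Variant], b: Set[Variant]) -> Dict[str, int]:
    """Sizes of the three areas of a two-set Venn diagram."""
    return {"only_a": len(a - b), "shared": len(a & b), "only_b": len(b - a)}


def in_regions(variant: Variant, regions: List[Region]) -> bool:
    chrom, pos, _ref, _alt = variant
    # VCF POS is 1-based, so it falls inside [start, end) when start < pos <= end.
    return any(c == chrom and start < pos <= end for c, start, end in regions)


def benchmark(calls: Set[Variant], truth: Set[Variant],
              regions: List[Region]) -> Dict[str, float]:
    """Precision and recall of a call set against a truth set, within truth regions."""
    calls = {v for v in calls if in_regions(v, regions)}
    truth = {v for v in truth if in_regions(v, regions)}
    tp, fp, fn = len(calls & truth), len(calls - truth), len(truth - calls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}


def record(chrom: str, pos: int, ref: str, alt: str) -> str:
    """Build a minimal tab-separated VCF data line (hypothetical sample data)."""
    return "\t".join([chrom, str(pos), ".", ref, alt, ".", "PASS", "."])


truth_set = parse_vcf([
    "##fileformat=VCFv4.2",
    record("chr1", 100, "A", "G"),
    record("chr1", 200, "C", "T"),
    record("chr1", 300, "G", "A"),
    record("chr2", 50, "T", "C"),
])
call_set = parse_vcf([
    "##fileformat=VCFv4.2",
    record("chr1", 100, "A", "G"),
    record("chr1", 200, "C", "G"),  # same site as the truth record, different ALT
    record("chr1", 300, "G", "A"),
    record("chr2", 50, "T", "C"),
])

print(venn_counts(call_set, truth_set))
print(benchmark(call_set, truth_set, regions=[("chr1", 0, 1000)]))
```

Exact key matching is a simplification: the same variant can be written in more than one way in a VCF file, so a practical comparison would typically normalize variant representations before building the sets.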
CONCLUSIONS
By providing a user-friendly interface and supporting essential visualizations, this software enhances the accessibility of VCF file analysis and assists researchers and clinicians in their scientific inquiries.
Identifiers
pubmed: 39227760
doi: 10.1186/s12859-024-05860-0
pii: 10.1186/s12859-024-05860-0
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
290
Grants
Agency: Türkiye Sağlık Enstitüleri Başkanlığı
ID: 24295
Copyright information
© 2024. The Author(s).