vSNP: a SNP pipeline for the generation of transparent SNP matrices and phylogenetic trees from whole genome sequencing data sets.
Bioinformatics
SNPs
Sequencing
Journal
BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258
Publication information
Publication date:
01 Jun 2024
History:
received: 04 Jan 2024
accepted: 21 May 2024
medline: 01 Jun 2024
pubmed: 01 Jun 2024
entrez: 31 May 2024
Status:
epublish
Abstract
BACKGROUND: Several single nucleotide polymorphism (SNP) pipelines exist, each offering its own advantages. Among them is vSNP, described here, which has been developed over the past decade and is specifically tailored to the needs of diagnostic laboratories. Laboratories that aim to provide rapid whole genome sequencing results during outbreak investigations face unique challenges. vSNP addresses these challenges by enabling users to verify and validate sequence accuracy with ease, by having utility across various pathogens, by being fully auditable, and by presenting results that are easy to interpret for individuals with diverse backgrounds.
RESULTS: vSNP has proven effective for real-time phylogenetic analysis of disease outbreaks and eradication efforts, including bovine tuberculosis, brucellosis, virulent Newcastle disease, SARS-CoV-2, African swine fever, and highly pathogenic avian influenza. The pipeline produces easy-to-read, sorted SNP matrices and corresponding phylogenetic trees. Essential data for verifying SNPs are included in the output, and the process is divided into two steps for ease of use and faster processing. vSNP requires minimal computational resources and runs in a wide range of environments. Several utilities have been developed to make analysis more accessible to subject matter experts who may not have computational expertise.
CONCLUSIONS: The vSNP pipeline integrates seamlessly into a diagnostic workflow and meets the criteria of quality control accreditation programs such as International Organization for Standardization (ISO) 17025. Its versatility and robustness make it suitable for a diverse range of organisms, and its detailed, reproducible, and transparent results make it a valuable tool in various applications, including real-time phylogenetic analysis.
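As an illustration of what a sorted SNP matrix can look like, the following is a minimal Python sketch and not vSNP's own code. The abstract does not state how vSNP orders its matrices, so the sort rules here (columns by how many samples share a SNP, rows by each sample's SNP pattern) are assumptions. The function name `build_snp_matrix`, the input layout, and the choice to treat a missing call as the reference base are likewise hypothetical.

```python
from typing import Dict, List, Tuple


def build_snp_matrix(
    reference: Dict[int, str],
    calls: Dict[str, Dict[int, str]],
) -> Tuple[List[int], List[Tuple[str, List[str]]]]:
    """Return (ordered positions, ordered rows of (sample, bases)).

    reference: genome position -> reference base (hypothetical input layout)
    calls:     sample name -> {position -> called base}; a position missing
               for a sample is ASSUMED to match the reference.
    """

    def base_at(sample: str, pos: int) -> str:
        return calls[sample].get(pos, reference[pos])

    # Keep only positions where at least one sample differs from the reference.
    positions = sorted(
        {
            pos
            for sample_calls in calls.values()
            for pos, base in sample_calls.items()
            if base != reference[pos]
        }
    )

    def alt_count(pos: int) -> int:
        return sum(1 for s in calls if base_at(s, pos) != reference[pos])

    # Assumed column order: most widely shared SNPs first, then by position.
    positions.sort(key=lambda p: (-alt_count(p), p))

    rows = [(s, [base_at(s, p) for p in positions]) for s in calls]

    # Assumed row order: samples carrying the leading SNPs come first, so
    # samples with similar SNP patterns end up next to each other.
    def row_key(row: Tuple[str, List[str]]):
        sample, bases = row
        pattern = tuple(
            0 if b != reference[p] else 1 for p, b in zip(positions, bases)
        )
        return (pattern, sample)

    rows.sort(key=row_key)
    return positions, rows


# Toy data, invented for illustration only.
reference = {10: "A", 20: "C", 30: "G"}
calls = {
    "s1": {10: "T"},
    "s2": {10: "T", 20: "G"},
    "s3": {30: "A"},
}
positions, rows = build_snp_matrix(reference, calls)
for sample, bases in rows:
    print(sample, " ".join(bases))
```

Grouping shared SNPs toward one side of the matrix is one way to let a reader see at a glance which samples cluster together; other orderings would serve equally well under different assumptions.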
Identifiers
pubmed: 38822271
doi: 10.1186/s12864-024-10437-5
pii: 10.1186/s12864-024-10437-5
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
545
Copyright information
© 2024. This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply.