Backward compatibility of whole genome sequencing data with MLVA typing using a new MLVAtype shiny application for Vibrio cholerae.
Journal
PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081
Publication information
Publication date:
2019
History:
received: 2019-07-24
accepted: 2019-11-13
entrez: 2019-12-12
pubmed: 2019-12-12
medline: 2020-03-26
Status:
epublish
Abstract
Abstract sections
BACKGROUND
Multiple-Locus Variable Number of Tandem Repeats (VNTR) Analysis (MLVA) is widely used by laboratory-based surveillance networks for subtyping pathogens that cause foodborne and waterborne disease outbreaks. However, Whole Genome Sequencing (WGS) has recently emerged as a new, more powerful reference for pathogen subtyping, so a data conversion method is needed that lets users compare MLVA profiles obtained by either method. The MLVAType Shiny application was designed to extract MLVA profiles of Vibrio cholerae isolates from WGS data while ensuring backward compatibility with traditional MLVA typing methods.
METHODS
To test and validate the MLVAType algorithm, WGS-derived MLVA profiles of nineteen Vibrio cholerae isolates from the Democratic Republic of the Congo (n = 9) and Uganda (n = 10) were compared to MLVA profiles generated by an in silico PCR approach and by Sanger sequencing, the latter being used as the reference method.
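As an illustration of the in silico PCR comparison mentioned above, the following is a minimal sketch and not the authors' pipeline. It assumes exact primer matches on the forward strand only, and the primer sequences, motif length and function names are hypothetical. The repeat count is derived from the amplicon length in the usual MLVA way: subtract the non-repeat flanking length, then divide by the motif length.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def in_silico_amplicon(template: str, fwd_primer: str, rev_primer: str) -> Optional[str]:
    """Return the amplicon (primers included), or None if a primer site is missing.

    Simplification: exact matches, forward strand only, first hit wins.
    """
    start = template.find(fwd_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so search for its
    # reverse complement downstream of the forward primer.
    rev_site = reverse_complement(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


def repeats_from_amplicon(amplicon_len: int, flank_len: int, motif_len: int) -> int:
    """Convert an amplicon length into a whole number of repeat copies."""
    return (amplicon_len - flank_len) // motif_len


# Hypothetical primers and a toy template carrying six copies of a 6 bp motif.
fwd, rev = "ACGTACGGTC", "TTGCAACCGT"
template = "TTT" + fwd + "AACAGA" * 6 + reverse_complement(rev) + "GGG"
amplicon = in_silico_amplicon(template, fwd, rev)
if amplicon is not None:
    print(repeats_from_amplicon(len(amplicon), len(fwd) + len(rev), 6))
```

In this toy case the flanking length is just the two primers. For a real locus it would also include any non-repeat sequence between the primers and the repeat array.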
RESULTS
Results obtained by Sanger sequencing and MLVAType were fully concordant. However, the MLVAType results were affected by censored estimations, whose percentage was inversely proportional to the k-mer size used during genome assembly. With a k-mer size of 127, fewer than 15% of the V. cholerae VNTR estimates were censored. Censored estimation could only be prevented by using a longer k-mer size (i.e. 175), which is not available in SPAdes v.3.13.0.
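The idea of a censored estimate can be sketched in code. This is a minimal illustration and not the MLVAType algorithm. It assumes that "censored" means the repeat array runs into a contig end, so the observed copy number is only a lower bound. This tends to happen when the array is longer than the assembler can resolve at the chosen k-mer size. The function name, the `min_flank` threshold and the toy sequences are hypothetical.

```python
from typing import NamedTuple


class RepeatEstimate(NamedTuple):
    copies: int     # consecutive motif copies observed in the contig
    censored: bool  # True when the count is only a lower bound


def estimate_repeats(contig: str, motif: str, min_flank: int = 10) -> RepeatEstimate:
    """Find the longest run of back-to-back `motif` copies in `contig`.

    The estimate is flagged as censored when fewer than `min_flank` bases of
    non-repeat sequence remain on either side of the run (assumed definition).
    """
    if not motif:
        raise ValueError("motif must be a non-empty string")
    m = len(motif)
    best_copies, best_start = 0, 0
    i = 0
    while i + m <= len(contig):
        if not contig.startswith(motif, i):
            i += 1
            continue
        # Extend the run one whole motif at a time.
        start, copies = i, 0
        while contig.startswith(motif, i):
            copies += 1
            i += m
        if copies > best_copies:
            best_copies, best_start = copies, start
    if best_copies == 0:
        return RepeatEstimate(0, False)
    end = best_start + best_copies * m
    left_flank = best_start
    right_flank = len(contig) - end
    censored = left_flank < min_flank or right_flank < min_flank
    return RepeatEstimate(best_copies, censored)


# Hypothetical locus: 10 bp flanks around five copies of a 6 bp motif.
full = "GGCCTTAGCA" + "AACAGA" * 5 + "TTGACCGGTA"
print(estimate_repeats(full, "AACAGA"))       # array fully contained in the contig
print(estimate_repeats(full[:34], "AACAGA"))  # contig ends inside the array
```

The second call mimics a contig that breaks inside the repeat array: one copy is lost and the estimate is flagged rather than reported as an exact value.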
CONCLUSION
As NGS read lengths and qualities tend to increase over time, larger k-mer sizes can be expected to become available in the near future. Using the MLVAType application with a longer k-mer size will then efficiently retrieve MLVA profiles from WGS data while avoiding censored estimation.
Identifiers
pubmed: 31825986
doi: 10.1371/journal.pone.0225848
pii: PONE-D-19-20834
pmc: PMC6905556
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
e0225848
Conflict of interest statement
The authors have declared that no competing interests exist.