Detection of single nucleotide polymorphisms in virus genomes assembled from high-throughput sequencing data: large-scale performance testing of sequence analysis strategies.


Journal

PeerJ
ISSN: 2167-8359
Abbreviated title: PeerJ
Country: United States
NLM ID: 101603425

Publication information

Publication date:
2023
History:
received: 20 02 2023
accepted: 10 07 2023
medline: 22 8 2023
pubmed: 21 8 2023
entrez: 21 8 2023
Status: epublish

Abstract

Recent developments in high-throughput sequencing (HTS) technologies and bioinformatics have drastically changed research in virology, especially for virus discovery. Indeed, proper monitoring of the viral population requires information on the different isolates circulating in the studied area. For this purpose, HTS has greatly facilitated the sequencing of new genomes of detected viruses and their comparison. However, the bioinformatics analyses used to reconstruct genome sequences and detect single nucleotide polymorphisms (SNPs) can introduce bias, and this issue has not been widely addressed so far. Therefore, more knowledge is required on the limitations of predicting SNPs based on HTS-generated sequence samples. To address this issue, we compared the ability of 14 plant virology laboratories, each employing a different bioinformatics pipeline, to detect 21 variants of pepino mosaic virus (PepMV) in three samples through large-scale performance testing (PT) using three artificially designed datasets. To evaluate their impact, the bioinformatics analyses were divided into three key steps: read pre-processing, virus-isolate identification, and variant calling. Each step was evaluated independently through an original PT design that included discussion and validation between participants at each step. Overall, this work underlines key parameters influencing SNP detection and proposes recommendations for reliable variant calling for plant viruses. The identification of the closest reference, the mapping parameters, and manual validation of the detections were recognized as the analysis steps with the greatest impact on successful SNP detection. Strategies to improve the prediction of SNPs are also discussed.
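As an illustration of how a performance test of this kind can be scored, the following minimal sketch compares the SNPs reported by one pipeline against a known set of designed variants. It is not the authors' evaluation code: the `Snp` type, the `score_calls` function, and all positions and alleles are hypothetical and chosen only for the example.

```python
from typing import NamedTuple


class Snp(NamedTuple):
    """A single nucleotide polymorphism relative to a reference genome."""
    position: int  # 1-based position on the reference (hypothetical values below)
    ref: str       # reference allele
    alt: str       # alternative allele


def score_calls(expected: set[Snp], called: set[Snp]) -> dict[str, float]:
    """Compare called SNPs with the expected (designed) SNPs."""
    true_positives = len(expected & called)   # expected and reported
    false_positives = len(called - expected)  # reported but not designed
    false_negatives = len(expected - called)  # designed but missed
    sensitivity = true_positives / len(expected) if expected else 0.0
    precision = true_positives / len(called) if called else 0.0
    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        "sensitivity": sensitivity,
        "precision": precision,
    }


# Hypothetical designed variants and the calls of one hypothetical pipeline.
expected = {Snp(120, "A", "G"), Snp(845, "C", "T"), Snp(2301, "G", "A")}
called = {Snp(120, "A", "G"), Snp(845, "C", "T"), Snp(4000, "T", "C")}

result = score_calls(expected, called)
print(result)
```

A SNP counts as detected here only if position, reference allele, and alternative allele all match, so a call made against a different reference isolate would be scored as a miss. This is one simple way to see why the choice of the closest reference matters for the outcome.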

Identifiers

pubmed: 37601254
doi: 10.7717/peerj.15816
pii: 15816
pmc: PMC10439718

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

e15816

Copyright information

© 2023 Rollin et al.

Conflict of interest statement

The authors declare that they have no competing interests.

Authors

Johan Rollin (J)

Laboratory of Plant Pathology-TERRA-Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium.

Rachelle Bester (R)

Citrus Research International, Matieland, South Africa.
Department of Genetics, Stellenbosch University, Matieland, South Africa.

Yves Brostaux (Y)

Laboratory of Statistics, Computer Science and Modelling Applied to Bioengineering, TERRA, Gembloux Agro-Bio Tech, Teaching and Research Centre, University of Liège, Gembloux, Belgium.

Kadriye Caglayan (K)

Plant Protection Department, Agricultural Faculty, Hatay Mustafa Kemal University, Hatay, Turkey.

Kris De Jonghe (K)

Plant Sciences Unit, Flanders Research Institute for Agriculture, Fisheries and Food (ILVO), Merelbeke, Belgium.

Ales Eichmeier (A)

Mendeleum-Institute of Genetics, Faculty of Horticulture, Mendel University in Brno, Lednice, Czech Republic.

Yoika Foucart (Y)

Plant Sciences Unit, Flanders Research Institute for Agriculture, Fisheries and Food (ILVO), Merelbeke, Belgium.

Annelies Haegeman (A)

Plant Sciences Unit, Flanders Research Institute for Agriculture, Fisheries and Food (ILVO), Merelbeke, Belgium.

Igor Koloniuk (I)

Biology Centre CAS, Ceske Budejovice, Czech Republic.

Petr Kominek (P)

Crop Research Institute, Praha, Czech Republic.

Hans Maree (H)

Citrus Research International, Matieland, South Africa.
Department of Genetics, Stellenbosch University, Matieland, South Africa.

Serkan Onder (S)

Department of Plant Protection, Faculty of Agriculture, Eskişehir Osmangazi University, Eskişehir, Turkey.

Susana Posada Céspedes (S)

Department of Biosystems Science and Engineering, ETH Zurich, Basel, 4058, Switzerland.
Swiss Institute of Bioinformatics (SIB), Basel, Switzerland.

Vahid Roumi (V)

Plant Protection Department, Faculty of Agriculture, University of Maragheh, Maragheh, Iran.

Dana Šafářová (D)

Department of Cell Biology and Genetics, Faculty of Science, Palacký University Olomouc, Olomouc, Czech Republic.

Olivier Schumpp (O)

Plant Protection Department, Agroscope, Nyon, Switzerland.

Cigdem Ulubas Serce (C)

Plant Production and Technologies Department, Ayhan Şahenk Faculty of Agricultural Science and Technologies, Niğde Ömer Halisdemir University, Niğde, Turkey.

Merike Sõmera (M)

Department of Chemistry and Biotechnology, Tallinn University of Technology, Tallinn, Estonia.

Lucie Tamisier (L)

Pathologie Végétale, Institut National de la Recherche pour l'Agriculture, l'Alimentation et l'Environnement (INRAE), Montfavet, France.
GAFL, Institut National de la Recherche pour l'Agriculture, l'Alimentation et l'Environnement (INRAE), Montfavet, France.

Eeva Vainio (E)

Natural Resources Institute Finland, Helsinki, Finland.

Rene Aa van der Vlugt (RA)

Wageningen University & Research, Wageningen, The Netherlands.

Sebastien Massart (S)

Laboratory of Plant Pathology-TERRA-Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium.
