Multiple Occurrences of a 168-Nucleotide Deletion in SARS-CoV-2 ORF8, Unnoticed by Standard Amplicon Sequencing and Variant Calling Pipelines.
ORF8 deletion
SARS-CoV-2
genomic surveillance
nanopore sequencing
viral genomics
Journal
Viruses
ISSN: 1999-4915
Abbreviated title: Viruses
Country: Switzerland
NLM ID: 101509722
Publication information
Publication date:
18 09 2021
History:
received: 13 07 2021
revised: 14 09 2021
accepted: 15 09 2021
entrez: 28 9 2021
pubmed: 29 9 2021
medline: 14 10 2021
Status:
epublish
Abstract
Genomic surveillance of the SARS-CoV-2 pandemic is crucial and is mainly achieved by amplicon sequencing protocols. Overlapping tiled amplicons are generated to establish contiguous SARS-CoV-2 genome sequences, which enable the precise resolution of infection chains and outbreaks. We investigated a SARS-CoV-2 outbreak in a local hospital and used nanopore sequencing with a modified ARTIC protocol employing 1200 bp long amplicons. We detected a long deletion of 168 nucleotides in the ORF8 gene in 76 samples from the hospital outbreak. This deletion is difficult to identify with classical amplicon sequencing procedures because it removes two amplicon primer-binding sites. We analyzed public SARS-CoV-2 sequences and sequencing read data from ENA and identified the same deletion in over 100 genomes belonging to different lineages of SARS-CoV-2, pointing to a mutation hotspot or to positive selection. In almost all cases, the deletion was not represented in the virus genome sequence after consensus building. Additionally, further database searches point to other deletions in the ORF8 coding region that have never been reported by the standard data analysis pipelines. These findings, together with the fact that ORF8 is especially prone to deletions, make a clear case that raw data must urgently be made publicly available for this and other large deletions that might change the physiology of the virus towards endemism.
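The point about lost primer-binding sites can be illustrated with a minimal sketch. It is not the authors' pipeline: the primer names, all coordinates, and the helper `primers_hit_by_deletion` are hypothetical and chosen only for illustration; the 168-nucleotide length is the one value taken from the abstract. The sketch checks which primer-binding intervals of a tiled scheme overlap a deleted region. Primers that overlap the deletion cannot bind, so the affected amplicons drop out, whereas a longer amplicon whose primers flank the deletion can still span it.

```python
from typing import List, NamedTuple


class Primer(NamedTuple):
    name: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def primers_hit_by_deletion(primers: List[Primer], del_start: int, del_len: int) -> List[str]:
    """Return names of primers whose binding site overlaps the deleted interval."""
    del_end = del_start + del_len
    # Two half-open intervals overlap when each starts before the other ends.
    return [p.name for p in primers if p.start < del_end and del_start < p.end]


# Hypothetical primer scheme; coordinates are NOT real SARS-CoV-2 positions.
primers = [
    Primer("short_A_LEFT", 1000, 1022),
    Primer("short_A_RIGHT", 1380, 1402),
    Primer("short_B_LEFT", 1320, 1342),
    Primer("short_B_RIGHT", 1700, 1722),
    Primer("long_LEFT", 900, 922),
    Primer("long_RIGHT", 2100, 2122),
]

DELETION_START = 1300   # hypothetical position
DELETION_LENGTH = 168   # length reported in the abstract

lost = primers_hit_by_deletion(primers, DELETION_START, DELETION_LENGTH)
print(lost)
```

In this toy layout the deletion removes one primer from each of the two overlapping short amplicons, while both primers of the long amplicon remain intact.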
Identifiers
pubmed: 34578452
pii: v13091870
doi: 10.3390/v13091870
pmc: PMC8518987
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Grants
Agency: Bundesministerium für Bildung und Forschung
ID: 031A532B
Agency: Bundesministerium für Bildung und Forschung
ID: 031A533A