Large scale genomic analysis of 3067 SARS-CoV-2 genomes reveals a clonal geo-distribution and a rich genetic variations of hotspots mutations.
Betacoronavirus / classification
COVID-19
China
Coronavirus Infections / pathology
Evolution, Molecular
Genetic Variation
Genome, Viral
Humans
Pandemics
Phylogeny
Pneumonia, Viral / pathology
Polyproteins
Protein Structure, Tertiary
SARS-CoV-2
Spike Glycoprotein, Coronavirus / chemistry
Viral Proteins / chemistry
Journal
PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081
Publication information
Publication date:
2020
History:
received: 2020-05-26
accepted: 2020-09-24
entrez: 2020-11-10
pubmed: 2020-11-11
medline: 2020-11-20
Status:
epublish
Abstract
In late December 2019, an emerging viral infection, COVID-19, was identified in Wuhan, China, and became a global pandemic. Characterizing the genetic variants of SARS-CoV-2 is crucial for tracking and evaluating its spread across countries. In this study, we collected and analyzed 3,067 SARS-CoV-2 genomes isolated from 55 countries during the first three months after the emergence of the virus. Using comparative genomics analysis, we traced the whole-genome mutation profiles and compared the frequency of each mutation in the studied population. We also monitored the accumulation of mutations over the epidemic period, together with their geographic locations. The results showed 782 variant sites, of which 512 (65.47%) had a non-synonymous effect. Frequencies of mutated alleles revealed 68 recurrent mutations, including ten hotspot non-synonymous mutations with a prevalence higher than 0.10 in this population, distributed across six SARS-CoV-2 genes. The distribution of these recurrent mutations on the world map revealed that certain genotypes are specific to geographic locations. We also identified co-occurring mutations that give rise to several haplotypes. Moreover, evolution over time showed a mechanism of mutation co-accumulation that might affect the severity and spread of SARS-CoV-2. The phylogenetic analysis identified two major clades, C1 and C2, harboring mutations L3606F and G614D, respectively, both of which emerged for the first time in China. In addition, analysis of the selective pressure revealed negatively selected residues that could be considered as therapeutic targets. We have also created an inclusive unified database (http://covid-19.medbiotech.ma) that lists all of the genetic variants of the SARS-CoV-2 genomes found in this study, with phylogeographic analysis around the world.
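The abstract's prevalence criterion (a mutation counts as a hotspot when more than 0.10 of the genomes carry it) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the `toy_genomes` data, and the label `toy_rare` are hypothetical and chosen only for illustration. The two mutation names are borrowed from the abstract purely as labels, and the 512 of 782 figure is the only value taken from the study.

```python
from collections import Counter


def mutation_prevalence(genomes):
    """Return {mutation: fraction of genomes carrying it}."""
    counts = Counter()
    for mutations in genomes:
        # A set guarantees each mutation is counted once per genome.
        counts.update(set(mutations))
    total = len(genomes)
    return {mutation: n / total for mutation, n in counts.items()}


def hotspots(prevalence, threshold=0.10):
    """Return mutations whose prevalence is strictly above the threshold."""
    return sorted(m for m, freq in prevalence.items() if freq > threshold)


# Invented toy data (assumption): 10 genomes, each given as a set of
# mutation labels. These counts do not come from the study.
toy_genomes = (
    [{"G614D"}] * 4
    + [{"G614D", "L3606F"}]
    + [{"L3606F"}]
    + [{"toy_rare"}]
    + [set()] * 3
)

prevalence = mutation_prevalence(toy_genomes)
print(hotspots(prevalence))

# Share of non-synonymous sites reported in the abstract: 512 of 782.
print(round(100 * 512 / 782, 2))
```

In this toy set, `toy_rare` occurs in exactly 1 of 10 genomes, so it sits on the threshold and is excluded by the strict comparison, which mirrors the "higher than 0.10" wording.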
Identifiers
pubmed: 33170902
doi: 10.1371/journal.pone.0240345
pii: PONE-D-20-15804
pmc: PMC7654798
Chemical substances
ORF1ab polyprotein, SARS-CoV-2
0
Polyproteins
0
Spike Glycoprotein, Coronavirus
0
Viral Proteins
0
spike protein, SARS-CoV-2
0
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
e0240345
Conflict of interest statement
The authors have declared that no competing interests exist.