A comprehensive whole genome database of ethnic minority populations.
Ethnic minority populations
GMGD database
Guizhou Province in southwest China
Human whole genome sequencing
Journal
Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date:
17 06 2024
History:
received: 28 11 2023
accepted: 03 06 2024
medline: 18 6 2024
pubmed: 18 6 2024
entrez: 17 6 2024
Status:
epublish
Abstract
China is characterized by remarkable ethnic diversity, which makes whole genome variation data from multiple populations a crucial tool for advancing population genetics and precision medicine research. However, research on the whole genomes of ethnic minority groups has been scarce. To fill this gap, we developed the Guizhou Multi-ethnic Genome Database (GMGD). It comprises whole genome sequencing data from 476 healthy, unrelated individuals spanning 11 ethnic minority groups in Guizhou Province, Southwest China: Bouyei, Dong, Miao, Yi, Bai, Gelo, Zhuang, Tujia, Yao, Hui, and Sui. The GMGD database contains more than 16.33 million variants in GRCh38 and 16.20 million variants in GRCh37. Among these, approximately 11.9% (1,956,322) of the variants in GRCh38 and 18.5% (3,009,431) of the variants in GRCh37 are novel and absent from the dbSNP database. These novel variants shed light on the genetic diversity landscape across these populations, providing valuable insights with an average coverage of 5.5×. This makes GMGD the largest genome-wide database encompassing the most diverse ethnic groups to date. The GMGD interactive interface provides researchers with multi-dimensional mutation search methods and displays population frequency differences among global populations. Furthermore, GMGD is equipped with a genotype-imputation function, which supports low-depth genomic research or targeted region capture studies. GMGD offers unique insights into the genomic variation landscape of different ethnic groups and is freely accessible at https://db.cngb.org/pop/gmgd/.
Identifiers
pubmed: 38886537
doi: 10.1038/s41598-024-63892-1
pii: 10.1038/s41598-024-63892-1
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
13954
Grants
Agency: National Natural Science Foundation of China
ID: 32360154
Agency: Major Scientific and Technological Special Project of Guizhou Province
ID: [(2019) 2807]
Agency: the project of Key Laboratory of Endemic and Ethnic Diseases, Ministry of Education, Guizhou Medical University
ID: FZSW-2022-001
Copyright information
© 2024. The Author(s).