Whole-genome sequencing of 1029 Indian individuals reveals unique and rare structural variants.
Journal
Journal of human genetics
ISSN: 1435-232X
Abbreviated title: J Hum Genet
Country: England
NLM ID: 9808008
Publication information
Publication date:
Jun 2023
History:
received: 2022-09-14
accepted: 2023-02-06
revised: 2023-01-31
medline: 2023-05-26
pubmed: 2023-02-23
entrez: 2023-02-22
Status:
ppublish
Abstract
Structural variants contribute to genetic variability in human genomes and can show population-specific patterns. We aimed to understand the landscape of structural variants in the genomes of healthy Indian individuals and to explore their potential implications for genetic diseases. To identify structural variants, we analysed a whole-genome sequencing dataset of 1029 self-declared healthy Indian individuals from the IndiGen project. The variants were then evaluated for potential pathogenicity and for associations with genetic diseases, and compared with existing global datasets. We generated a compendium of 38,560 high-confidence structural variants, comprising 28,393 deletions, 5030 duplications, 5038 insertions, and 99 inversions. Notably, around 55% of these variants were unique to the studied population. Further analysis revealed 134 deletions with predicted pathogenic or likely pathogenic effects; the affected genes were mainly enriched for neurological disease conditions, such as intellectual disability and neurodegenerative diseases. The IndiGenomes dataset helped us to understand the unique spectrum of structural variants in the Indian population. More than half of the identified variants were absent from the publicly available global dataset on structural variants. Clinically important deletions identified in IndiGenomes might aid in improving the diagnosis of unsolved genetic diseases, particularly neurological conditions. Along with basal allele frequency data and clinically important deletions, IndiGenomes data might serve as a baseline resource for future studies of genomic structural variants in the Indian population.
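The variant counts reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the study's pipeline: the variable names are hypothetical, and the only inputs are the four per-type counts and the total quoted above. It confirms that the per-type counts sum to the reported total and derives each type's share.

```python
# Counts of high-confidence structural variants as quoted in the abstract.
# The dictionary and variable names are illustrative, not from the paper.
sv_counts = {
    "deletions": 28_393,
    "duplications": 5_030,
    "insertions": 5_038,
    "inversions": 99,
}
reported_total = 38_560

# The per-type counts should add up to the reported compendium size.
total = sum(sv_counts.values())
assert total == reported_total

# Share of each variant type, as a percentage rounded to one decimal place.
shares = {kind: round(100 * n / total, 1) for kind, n in sv_counts.items()}
for kind, pct in shares.items():
    print(f"{kind}: {pct}%")
```

Deletions account for roughly three quarters of the compendium, which fits the abstract's focus on deletions in its pathogenicity analysis.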
Identifiers
pubmed: 36813834
doi: 10.1038/s10038-023-01131-7
pii: 10.1038/s10038-023-01131-7
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
409-417
Grants
Agency: Council of Scientific and Industrial Research (CSIR)
ID: MLP1809, MLP1801, MLP2001
Copyright information
© 2023. The Author(s), under exclusive licence to The Japan Society of Human Genetics.