Tutorial: a statistical genetics guide to identifying HLA alleles driving complex disease.
Journal
Nature protocols
ISSN: 1750-2799
Abbreviated title: Nat Protoc
Country: England
NLM ID: 101284307
Publication information
Publication date: 09 2023
History:
received: 2022-08-10
accepted: 2023-04-27
medline: 2023-09-08
pubmed: 2023-07-27
entrez: 2023-07-26
Status: ppublish
Abstract
The human leukocyte antigen (HLA) locus is associated with more complex diseases than any other locus in the human genome. In many diseases, HLA explains more heritability than all other known loci combined. In silico HLA imputation methods enable rapid and accurate estimation of HLA alleles in the millions of individuals that are already genotyped on microarrays. HLA imputation has been used to define causal variation in autoimmune diseases, such as type 1 diabetes, and in human immunodeficiency virus infection control. However, there are few guidelines on performing HLA imputation, association testing, and fine mapping. Here, we present a comprehensive tutorial to impute HLA alleles from genotype data. We provide detailed guidance on performing standard quality control measures for input genotyping data and describe options to impute HLA alleles and amino acids either locally or using the web-based Michigan Imputation Server, which hosts a multi-ancestry HLA imputation reference panel. We also offer best practice recommendations to conduct association tests to define the alleles, amino acids, and haplotypes that affect human traits. Along with the pipeline, we provide a step-by-step online guide with scripts and available software (https://github.com/immunogenomics/HLA_analyses_tutorial). This tutorial will be broadly applicable to large-scale genotyping data and will contribute to defining the role of HLA in human diseases across global populations.
Identifiers
pubmed: 37495751
doi: 10.1038/s41596-023-00853-4
pii: 10.1038/s41596-023-00853-4
Chemical substances
HLA Antigens
0
Histocompatibility Antigens Class I
0
Amino Acids
0
Publication types
Journal Article
Review
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
2625-2641
Grants
Agency: NIAID NIH HHS
ID: F30 AI172238
Country: United States
Copyright information
© 2023. Springer Nature Limited.