Benchmark of tools for in silico prediction of MHC class I and class II genotypes from NGS data.


Journal

BMC Genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258

Publication information

Publication date:
09 May 2023
History:
received: 19 04 2023
accepted: 30 04 2023
medline: 11 5 2023
pubmed: 10 5 2023
entrez: 10 5 2023
Status: epublish

Abstract

The Human Leukocyte Antigen (HLA) genes are a group of highly polymorphic genes that are located in the Major Histocompatibility Complex (MHC) region on chromosome 6. The HLA genotype affects the presentability of tumour antigens to the immune system. While knowledge of these genotypes is of utmost importance to study differences in immune responses between cancer patients, gold standard, PCR-derived genotypes are rarely available in large Next Generation Sequencing (NGS) datasets. Therefore, a variety of methods for in silico NGS-based HLA genotyping have been developed, bypassing the need to determine these genotypes with separate experiments. However, there is currently no consensus on the best performing tool. We evaluated 13 MHC class I and/or class II HLA callers that are currently available for free academic use and run on either Whole Exome Sequencing (WES) or RNA sequencing data. Computational resource requirements were highly variable between these tools. Three orthogonal approaches were used to evaluate the accuracy on several large publicly available datasets: a direct benchmark using PCR-derived gold standard HLA calls, a correlation analysis with population-based allele frequencies and an analysis of the concordance between the different tools. The highest MHC-I calling accuracies were found for Optitype (98.0%) and arcasHLA (99.4%) on WES and RNA sequencing data respectively, while for MHC-II HLA-HD was the most accurate tool for both data types (96.2% and 99.4% on WES and RNA data respectively). The optimal strategy for HLA genotyping from NGS data depends on the availability of either WES or RNA data, the size of the dataset and the available computational resources. If sufficient resources are available, we recommend Optitype and HLA-HD for MHC-I and MHC-II genotype calling respectively.
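The direct benchmark mentioned in the abstract compares predicted alleles against PCR-derived gold standard calls. A minimal sketch of how such an allele-level accuracy could be computed is given here. The trimming to two-field resolution, the function names and the example genotypes are illustrative assumptions and are not taken from the authors' implementation.

```python
from collections import Counter


def two_field(allele: str) -> str:
    """Trim an allele name such as 'A*02:01:01:01' to two fields ('A*02:01')."""
    gene, _, fields = allele.partition("*")
    return gene + "*" + ":".join(fields.split(":")[:2])


def matched_alleles(predicted, truth):
    """Count predicted alleles that match the gold standard, ignoring allele order."""
    pred = Counter(two_field(a) for a in predicted)
    gold = Counter(two_field(a) for a in truth)
    # Multiset intersection: a homozygous truth needs two matching predictions.
    return sum((pred & gold).values())


def accuracy(calls, gold_standard):
    """Fraction of gold standard alleles recovered across all samples."""
    total = sum(len(alleles) for alleles in gold_standard.values())
    hits = sum(
        matched_alleles(calls.get(sample, []), alleles)
        for sample, alleles in gold_standard.items()
    )
    return hits / total if total else float("nan")


# Hypothetical genotypes for one gene (HLA-A) in two samples.
gold = {"s1": ["A*02:01", "A*24:02"], "s2": ["A*01:01", "A*01:01"]}
calls = {"s1": ["A*24:02:01", "A*02:01:01:01"], "s2": ["A*01:01", "A*03:01"]}

print(accuracy(calls, gold))  # 3 of 4 alleles match -> 0.75
```

Counting matches as a multiset intersection keeps the comparison independent of allele order and penalises a heterozygous call for a homozygous sample, which is one reasonable convention among several.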


Identifiers

pubmed: 37161318
doi: 10.1186/s12864-023-09351-z
pii: 10.1186/s12864-023-09351-z
pmc: PMC10170851

Chemical substances

HLA Antigens 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

247

Grants

Agency: Kom op tegen Kanker
ID: STI.VLK.2022.0013.01
Agency: Bijzonder Onderzoeksfonds UGent
ID: BOF.STG.2019.0073.01

Copyright information

© 2023. The Author(s).


Authors

Arne Claeys (A)

Department of Human Structure and Repair, Ghent University, Ghent, Belgium.
Cancer Research Institute Ghent, Ghent, Belgium.

Peter Merseburger (P)

Department of Human Structure and Repair, Ghent University, Ghent, Belgium.
Cancer Research Institute Ghent, Ghent, Belgium.

Jasper Staut (J)

Department of Human Structure and Repair, Ghent University, Ghent, Belgium.

Kathleen Marchal (K)

Cancer Research Institute Ghent, Ghent, Belgium.
Department of Information Technology, Ghent University, IDLab, Ghent, Belgium.
Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium.

Jimmy Van den Eynden (J)

Department of Human Structure and Repair, Ghent University, Ghent, Belgium. jimmy.vandeneynden@ugent.be.
Cancer Research Institute Ghent, Ghent, Belgium. jimmy.vandeneynden@ugent.be.
