USMPep: universal sequence models for major histocompatibility complex binding affinity prediction.


Journal

BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194

Publication information

Publication date:
02 Jul 2020
History:
received: 14 Nov 2019
accepted: 23 Jun 2020
entrez: 4 Jul 2020
pubmed: 4 Jul 2020
medline: 15 Aug 2020
Status: epublish

Abstract

Immunotherapy is a promising route towards personalized cancer treatment. A key algorithmic challenge in this process is to decide whether a given peptide (neoepitope) binds to the major histocompatibility complex (MHC). This is an active area of research, and many MHC binding prediction algorithms can predict the MHC binding affinity of a given peptide to a high degree of accuracy. However, most state-of-the-art approaches use complicated training and model selection procedures, are restricted to peptides of a certain length, and/or rely on heuristics. We put forward USMPep, a simple recurrent neural network that matches state-of-the-art approaches on MHC class I binding prediction with a single, generic architecture and even a single set of hyperparameters, both on IEDB benchmark datasets and on the very recent HPV dataset. Moreover, the algorithm is competitive even as a single model trained from scratch, while ensembling multiple regressors and language model pretraining can still slightly improve the performance. The direct application of the approach to MHC class II binding prediction shows solid performance despite limited training data. We demonstrate that competitive performance in MHC binding affinity prediction can be reached with a standard architecture and training procedure without relying on any heuristics.


Identifiers

pubmed: 32615972
doi: 10.1186/s12859-020-03631-1
pii: 10.1186/s12859-020-03631-1
pmc: PMC7330990

Chemical substances

Histocompatibility Antigens Class I 0
Histocompatibility Antigens Class II 0
Peptides 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

279


Authors

Johanna Vielhaben (J)

Fraunhofer Heinrich Hertz Institute, Einsteinufer 37, Berlin, 10587, Germany.

Markus Wenzel (M)

Fraunhofer Heinrich Hertz Institute, Einsteinufer 37, Berlin, 10587, Germany.

Wojciech Samek (W)

Fraunhofer Heinrich Hertz Institute, Einsteinufer 37, Berlin, 10587, Germany.

Nils Strodthoff (N)

Fraunhofer Heinrich Hertz Institute, Einsteinufer 37, Berlin, 10587, Germany. nils.strodthoff@hhi.fraunhofer.de.
