Semi-supervised learning for somatic variant calling and peptide identification in personalized cancer immunotherapy.


Journal

BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194

Publication information

Publication date:
30 Dec 2020
History:
received: 05 10 2020
accepted: 13 10 2020
entrez: 30 12 2020
pubmed: 31 12 2020
medline: 13 1 2021
Status: epublish

Abstract

Personalized cancer vaccines are emerging as one of the most promising approaches to immunotherapy of advanced cancers. However, only a small proportion of the neoepitopes generated by somatic DNA mutations in cancer cells lead to tumor rejection. Since it is impractical to experimentally assess all candidate neoepitopes prior to vaccination, developing accurate methods for predicting tumor-rejection mediating neoepitopes (TRMNs) is critical for enabling routine clinical use of cancer vaccines. In this paper we introduce Positive-unlabeled Learning using AuTOml (PLATO), a general semi-supervised approach to improving accuracy of model-based classifiers. PLATO generates a set of high confidence positive calls by applying a stringent filter to model-based predictions, then rescores remaining candidates by using positive-unlabeled learning. To achieve robust performance on clinical samples with large patient-to-patient variation, PLATO further integrates AutoML hyper-parameter tuning, classification threshold selection based on spies, and support for bootstrapping. Experimental results on real datasets demonstrate that PLATO has improved performance compared to model-based approaches for two key steps in TRMN prediction, namely somatic variant calling from exome sequencing data and peptide identification from MS/MS data.
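The workflow summarized in the abstract (train on high-confidence positives against unlabeled candidates, then pick the classification threshold from "spies") can be illustrated with a minimal sketch. This is not the authors' PLATO implementation: the function names, the toy one-feature data, the plain logistic regression standing in for an AutoML-tuned model, and the `spy_fraction` and `spy_recall` parameters are all assumptions made for illustration, and bootstrapping is omitted.

```python
import math
import random

def sigmoid(z):
    # Clamp the logit so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, epochs=300, lr=0.5):
    """Full-batch gradient descent; a stand-in for an AutoML-tuned classifier."""
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(model, x):
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def pu_rescore(positives, unlabeled, spy_fraction=0.2, spy_recall=0.9, seed=0):
    """Rescore unlabeled candidates and choose a threshold from held-out spies."""
    rng = random.Random(seed)
    shuffled = list(positives)
    rng.shuffle(shuffled)
    n_spies = max(1, int(len(shuffled) * spy_fraction))
    spies, kept = shuffled[:n_spies], shuffled[n_spies:]
    # Spies are known positives hidden among the unlabeled examples in training.
    X = kept + unlabeled + spies
    y = [1] * len(kept) + [0] * (len(unlabeled) + len(spies))
    model = fit_logistic(X, y)
    # Highest threshold at which at least spy_recall of the spies are still called.
    spy_scores = sorted((predict(model, s) for s in spies), reverse=True)
    k = max(1, math.ceil(spy_recall * len(spy_scores)))
    threshold = spy_scores[k - 1]
    scores = [predict(model, u) for u in unlabeled]
    return threshold, scores, [s >= threshold for s in scores]

# Toy one-feature data (hypothetical): larger values resemble true positives.
positives = [[1.5 + 0.05 * i] for i in range(20)]  # passed the stringent filter
hidden_pos = [[1.8], [2.0], [2.2], [3.0]]          # true positives left unlabeled
negatives = [[-2.0 + 0.05 * i] for i in range(10)] # true negatives
unlabeled = hidden_pos + negatives
threshold, scores, calls = pu_rescore(positives, unlabeled)
print(threshold, calls)
```

Because unlabeled true positives are trained as label 0, the raw scores of a positive-unlabeled classifier are not calibrated probabilities. Taking the threshold from the scores of the hidden spies, rather than using a fixed cutoff such as 0.5, is one way to compensate for that.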


Identifiers

pubmed: 33375939
doi: 10.1186/s12859-020-03813-x
pii: 10.1186/s12859-020-03813-x
pmc: PMC7772914

Chemical substances

Epitopes 0
Peptides 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

498

Grants

Agency: National Science Foundation
ID: 1564936
Agency: Office of Postsecondary Education
ID: P200A180092

Authors

Elham Sherafat (E)

Computer Science and Engineering Department, University of Connecticut, Storrs, CT, 06269, USA.

Jordan Force (J)

Computer Science and Engineering Department, University of Connecticut, Storrs, CT, 06269, USA.

Ion I Măndoiu (II)

Computer Science and Engineering Department, University of Connecticut, Storrs, CT, 06269, USA. ion@engr.uconn.edu.
