sumrep: A Summary Statistic Framework for Immune Receptor Repertoire Comparison and Model Validation.
B cell receptor
T cell receptor
model validation
rep-seq
repertoire comparison
summary statistics
Journal
Frontiers in Immunology
ISSN: 1664-3224
Abbreviated title: Front Immunol
Country: Switzerland
NLM ID: 101560960
Publication information
Publication date:
2019
History:
received: 2019-07-19
accepted: 2019-10-11
entrez: 2019-11-19
pubmed: 2019-11-19
medline: 2020-11-18
Status: epublish
Abstract
The adaptive immune system generates an enormous diversity of antigen receptors for B and T cells to keep dangerous pathogens at bay. The DNA sequences coding for these receptors arise by a complex recombination process followed by a series of productivity-based filters, as well as affinity maturation for B cells, giving considerable diversity to the circulating pool of receptor sequences. Although adaptive immune receptor repertoire sequencing (AIRR-seq) datasets hold considerable promise for medical and public health applications, their complex structure makes analysis difficult. In this paper, we introduce sumrep, an R package that efficiently performs a wide variety of repertoire summaries and comparisons, and show how sumrep can be used to perform model validation. We find that summaries vary in their ability to differentiate between datasets, although many are able to distinguish between covariates such as donor, timepoint, and cell type for BCR and TCR repertoires. We show that deletion and insertion lengths resulting from V(D)J recombination tend to be more discriminative characterizations of a repertoire than summaries that describe the amino acid composition of the CDR3 region. We also find that state-of-the-art generative models excel at recapitulating gene usage and recombination statistics in a given experimental repertoire, but struggle to capture many physicochemical properties of real repertoires.
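The idea of comparing repertoires through a summary statistic can be illustrated with a minimal sketch. It is not sumrep's API (sumrep is an R package, and its functions are not reproduced here). The sketch assumes CDR3 length as the summary and Jensen-Shannon divergence as the comparison measure. The function names `length_distribution` and `js_divergence` and the toy sequences are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def length_distribution(cdr3_seqs):
    """Return the empirical distribution of CDR3 lengths as {length: frequency}."""
    counts = Counter(len(seq) for seq in cdr3_seqs)
    total = sum(counts.values())
    return {length: n / total for length, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    p and q are dicts mapping outcomes to probabilities. The result lies in
    [0, 1]: 0 for identical distributions, 1 for disjoint supports.
    """
    divergence = 0.0
    for x in set(p) | set(q):
        px, qx = p.get(x, 0.0), q.get(x, 0.0)
        mx = 0.5 * (px + qx)  # mixture distribution at outcome x
        if px > 0:
            divergence += 0.5 * px * math.log2(px / mx)
        if qx > 0:
            divergence += 0.5 * qx * math.log2(qx / mx)
    return divergence


# Toy CDR3 amino acid sequences, made up for illustration only.
repertoire_a = ["CASSLGTDTQYF", "CASSPGQGNTEAFF", "CASSLAGGYEQYF", "CASSQDRGNTEAFF"]
repertoire_b = ["CARDYYGSSYFDYW", "CARGGYSSGWYFDYW", "CARDRGYFDYW", "CARDYYGSSYFDYW"]

dist_a = length_distribution(repertoire_a)
dist_b = length_distribution(repertoire_b)
print(js_divergence(dist_a, dist_b))
```

A divergence near 0 would suggest that the two repertoires are similar with respect to this one summary. Repeating the comparison across many summaries, for example between an observed repertoire and one simulated from a generative model, is the kind of model validation the abstract describes.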
Identifiers
pubmed: 31736960
doi: 10.3389/fimmu.2019.02533
pmc: PMC6838214
Chemical substances
Receptors, Immunologic
0
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
2533
Grants
Agency: NIAID NIH HHS
ID: U19 AI128914
Country: United States
Agency: NIH HHS
ID: S10 OD020069
Country: United States
Agency: NIAID NIH HHS
ID: R01 AI146028
Country: United States
Agency: Howard Hughes Medical Institute
Country: United States
Agency: NIGMS NIH HHS
ID: R01 GM113246
Country: United States
Copyright information
Copyright © 2019 Olson, Moghimi, Schramm, Obraztsova, Ralph, Vander Heiden, Shugay, Shepherd, Lees and Matsen.