IKAP: Identifying K mAjor cell Population groups in single-cell RNA-sequencing analysis.
Seurat
cell ontology
clustering
single-cell RNA-sequencing
Journal
GigaScience
ISSN: 2047-217X
Abbreviated title: Gigascience
Country: United States
NLM ID: 101596872
Publication information
Publication date:
2019-10-01
History:
received: 2019-04-23
revised: 2019-08-05
accepted: 2019-09-16
entrez: 2019-10-02
pubmed: 2019-10-02
medline: 2020-04-03
Status:
ppublish
Abstract
In single-cell RNA-sequencing analysis, clustering cells into groups and differentiating cell groups by differentially expressed (DE) genes are 2 separate steps for investigating cell identity. However, how well cell groups can be differentiated can be affected by the clustering. This interdependency often creates a bottleneck in the analysis pipeline: researchers must repeat these 2 steps multiple times with different clustering parameters to identify a set of cell groups that are more differentiated and biologically relevant. To accelerate this process, we have developed IKAP, an algorithm that identifies major cell groups and improves the differentiation of cell groups by systematically tuning the clustering parameters. We demonstrate that, with default parameters, IKAP successfully identifies major cell types such as T cells, B cells, natural killer cells, and monocytes in 2 peripheral blood mononuclear cell datasets and recovers major cell types in a previously published mouse cortex dataset. The major cell groups identified by IKAP present more distinguishing DE genes than cell groups generated by other combinations of clustering parameters. We further show that cell subtypes can be identified by recursively applying IKAP within identified major cell types, thereby delineating cell identities in a multi-layered ontology. By tuning the clustering parameters to identify major cell groups, IKAP greatly improves the automation of single-cell RNA-sequencing analysis, producing distinguishing DE genes and refining cell ontology.
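The idea of systematically tuning clustering parameters can be illustrated with a small, self-contained sketch. It is not IKAP's implementation, and the abstract does not specify the clustering method or the selection criterion. The toy threshold-linkage clustering, the silhouette-style score, and the names `cluster_by_threshold`, `separation_score`, and `pick_best_grouping` are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def cluster_by_threshold(points, threshold):
    """Toy clustering (assumption): link points closer than `threshold`
    and return one component label per point."""
    parent = list(range(len(points)))  # each point starts as its own group

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) < threshold:
            parent[find(i)] = find(j)  # merge the two groups
    return [find(i) for i in range(len(points))]


def separation_score(points, labels):
    """Mean silhouette width (assumption): higher means tighter,
    better-separated groups."""
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    if len(groups) < 2 or len(groups) == len(points):
        return -1.0  # one big group or all singletons is uninformative
    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in groups[lab] if j != i]
        if not own:
            continue  # a singleton contributes 0 to the sum
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in groups.items()
            if other != lab
        )
        total += (b - a) / max(a, b)
    return total / len(points)


def pick_best_grouping(points, thresholds):
    """Sweep the clustering parameter and keep the best-scoring grouping."""
    scored = [
        (separation_score(points, cluster_by_threshold(points, t)), t)
        for t in thresholds
    ]
    _, best_t = max(scored)
    return best_t, cluster_by_threshold(points, best_t)


# Three well-separated toy "cell groups" in a 2D embedding (made-up data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
thresholds = [0.5, 2.0, 15.0, 30.0]
best_t, labels = pick_best_grouping(points, thresholds)
print(best_t, len(set(labels)))
```

Under these assumptions, a threshold that is too small leaves every point in its own group and one that is too large merges everything, so the sweep settles on the intermediate value that separates the three groups. A real analysis would replace the toy clustering and score with the pipeline's own clustering parameters and a criterion tied to how distinguishing the resulting DE genes are.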
Identifiers
pubmed: 31574155
pii: 5579995
doi: 10.1093/gigascience/giz121
pmc: PMC6771546
Publication types
Journal Article
Research Support, N.I.H., Intramural
Languages
eng
Citation subsets
IM
Grants
Agency: NHLBI NIH HHS
ID: ZIC HL006228
Country: United States
Copyright information
© The Author(s) 2019. Published by Oxford University Press.