CellO: comprehensive and hierarchical cell type classification of human cells with the Cell Ontology.

Classification of Bioinformatical Subject; Genomic Analysis; Genomics

Journal

iScience
ISSN: 2589-0042
Abbreviated title: iScience
Country: United States
NLM ID: 101724038

Publication information

Publication date:
22 Jan 2021
History:
received: 30 Jul 2020
revised: 28 Oct 2020
accepted: 02 Dec 2020
entrez: 28 Dec 2020
pubmed: 29 Dec 2020
medline: 29 Dec 2020
Status: epublish

Abstract

Cell type annotation is a fundamental task in the analysis of single-cell RNA-sequencing data. In this work, we present CellO, a machine learning-based tool for annotating human RNA-seq data with the Cell Ontology. CellO enables accurate and standardized cell type classification of cell clusters by considering the rich hierarchical structure of known cell types. Furthermore, CellO comes pre-trained on a comprehensive data set of human, healthy, untreated primary samples in the Sequence Read Archive. CellO's comprehensive training set enables it to run out of the box on diverse cell types and achieves competitive or even superior performance when compared to existing state-of-the-art methods. Lastly, CellO's linear models are easily interpreted, thereby enabling exploration of cell-type-specific expression signatures across the ontology. To this end, we also present the CellO Viewer: a web application for exploring CellO's models across the ontology.

Identifiers

pubmed: 33364592
doi: 10.1016/j.isci.2020.101913
pii: S2589-0042(20)31110-X
pmc: PMC7753962

Publication types

Journal Article

Languages

eng

Pagination

101913

Grants

Agency: NLM NIH HHS
ID: T15 LM007359
Country: United States

Copyright information

© 2020 The Authors.

Conflict of interest statement

The authors declare no competing interests.

Authors

Matthew N Bernstein (MN)

Morgridge Institute for Research, Madison, WI 53715, USA.

Zhongjie Ma (Z)

Department of Computer Sciences, University of Wisconsin - Madison, Madison, WI 53706, USA.

Michael Gleicher (M)

Department of Computer Sciences, University of Wisconsin - Madison, Madison, WI 53706, USA.

Colin N Dewey (CN)

Department of Computer Sciences, University of Wisconsin - Madison, Madison, WI 53706, USA.
Department of Biostatistics and Medical Informatics, University of Wisconsin - Madison, Madison, WI 53792, USA.

MeSH classifications