SIMS: A deep-learning label transfer tool for single-cell RNA sequencing analysis.

Keywords: RNA sequencing; TabNet; brain organoids; cell atlas; label transfer; machine learning; neurodevelopment; neuroscience data; reference mapping; single cell analysis

Journal

Cell Genomics
ISSN: 2666-979X
Abbreviated title: Cell Genom
Country: United States
NLM ID: 9918284260106676

Publication information

Publication date:
24 May 2024
History:
received: 17 Nov 2023
revised: 2 Apr 2024
accepted: 9 May 2024
medline: 2 Jun 2024
pubmed: 2 Jun 2024
entrez: 1 Jun 2024
Status: ahead of print

Abstract

Cell atlases serve as vital references for automating cell labeling in new samples, yet existing classification algorithms struggle with accuracy. Here we introduce SIMS (scalable, interpretable machine learning for single cell), a low-code data-efficient pipeline for single-cell RNA classification. We benchmark SIMS against datasets from different tissues and species. We demonstrate SIMS's efficacy in classifying cells in the brain, achieving high accuracy even with small training sets (<3,500 cells) and across different samples. SIMS accurately predicts neuronal subtypes in the developing brain, shedding light on genetic changes during neuronal differentiation and postmitotic fate refinement. Finally, we apply SIMS to single-cell RNA datasets of cortical organoids to predict cell identities and uncover genetic variations between cell lines. SIMS identifies cell-line differences and misannotated cell lineages in human cortical organoids derived from different pluripotent stem cell lines. Altogether, we show that SIMS is a versatile and robust tool for cell-type classification from single-cell datasets.

Identifiers

pubmed: 38823397
pii: S2666-979X(24)00165-4
doi: 10.1016/j.xgen.2024.100581

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

100581

Copyright information

Copyright © 2024 The Author(s). Published by Elsevier Inc. All rights reserved.

Conflict of interest statement

Declaration of interests: J.L., V.D.J., and M.A.M.-R. have submitted patent applications related to the work in this paper.

Authors

Jesus Gonzalez-Ferrer (J)

Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Live Cell Biotechnology Discovery Lab, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95060, USA.

Julian Lehrer (J)

Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Live Cell Biotechnology Discovery Lab, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Department of Applied Mathematics, University of California, Santa Cruz, Santa Cruz, CA 95060, USA.

Ash O'Farrell (A)

Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA.

Benedict Paten (B)

Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95060, USA.

Mircea Teodorescu (M)

Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Department of Electrical and Computer Engineering, University of California, Santa Cruz, Santa Cruz, CA 95060, USA.

David Haussler (D)

Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95060, USA.

Vanessa D Jonsson (VD)

Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Department of Applied Mathematics, University of California, Santa Cruz, Santa Cruz, CA 95060, USA. Electronic address: vjonsson@ucsc.edu.

Mohammed A Mostajo-Radji (MA)

Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95060, USA; Live Cell Biotechnology Discovery Lab, University of California, Santa Cruz, Santa Cruz, CA 95060, USA. Electronic address: mmostajo@ucsc.edu.

MeSH classifications