Automated single-cell omics end-to-end framework with data-driven batch inference.

batch identification; cell-type mapping; information theory; integration; scATAC-seq; scRNA-seq; single-cell genomics

Journal

Cell Systems
ISSN: 2405-4720
Abbreviated title: Cell Syst
Country: United States
NLM ID: 101656080

Publication information

Publication date:
28 Sep 2024
History:
received: 25 10 2023
revised: 20 06 2024
accepted: 12 09 2024
medline: 5 10 2024
pubmed: 5 10 2024
entrez: 4 10 2024
Status: aheadofprint

Abstract

To facilitate single-cell multi-omics analysis and improve reproducibility, we present single-cell pipeline for end-to-end data integration (SPEEDI), a fully automated end-to-end framework for batch inference, data integration, and cell-type labeling. SPEEDI introduces data-driven batch inference and transforms the often heterogeneous data matrices obtained from different samples into a uniformly annotated and integrated dataset. Without requiring user input, it automatically selects parameters and executes pre-processing, sample integration, and cell-type mapping. It can also perform downstream analyses of differential signals between treatment conditions and gene functional modules. SPEEDI's data-driven batch-inference method works with widely used integration and cell-typing tools. By developing data-driven batch inference, providing full end-to-end automation, and eliminating parameter selection, SPEEDI improves reproducibility and lowers the barrier to obtaining biological insight from these valuable single-cell datasets. The SPEEDI interactive web application can be accessed at https://speedi.princeton.edu/. A record of this paper's transparent peer review process is included in the supplemental information.

Identifiers

pubmed: 39366377
pii: S2405-4712(24)00267-9
doi: 10.1016/j.cels.2024.09.003

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Copyright information

Copyright © 2024 Elsevier Inc. All rights reserved.

Conflict of interest statement

S.C.S. is a consultant, equity owner, and interim chief scientific officer at GNOMX Corp. Patents were filed related to this work. O.G.T. is on the advisory board of Cell Systems.

Authors

Yuan Wang (Y)

Department of Computer Science, Princeton University, Princeton, NJ 08540, USA; Lewis-Sigler Institute of Integrative Genomics, Princeton University, Princeton, NJ 08540, USA.

William Thistlethwaite (W)

Lewis-Sigler Institute of Integrative Genomics, Princeton University, Princeton, NJ 08540, USA.

Alicja Tadych (A)

Lewis-Sigler Institute of Integrative Genomics, Princeton University, Princeton, NJ 08540, USA.

Frederique Ruf-Zamojski (F)

Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

Daniel J Bernard (DJ)

Department of Pharmacology and Therapeutics, McGill University, Montreal, QC H3G 1Y6, Canada.

Antonio Cappuccio (A)

Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

Elena Zaslavsky (E)

Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

Xi Chen (X)

Lewis-Sigler Institute of Integrative Genomics, Princeton University, Princeton, NJ 08540, USA; Center for Computational Biology, Flatiron Institute, New York, NY 10010, USA. Electronic address: xchen@flatironinstitute.org.

Stuart C Sealfon (SC)

Department of Neurology, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA. Electronic address: stuart.sealfon@mssm.edu.

Olga G Troyanskaya (OG)

Department of Computer Science, Princeton University, Princeton, NJ 08540, USA; Lewis-Sigler Institute of Integrative Genomics, Princeton University, Princeton, NJ 08540, USA; Center for Computational Biology, Flatiron Institute, New York, NY 10010, USA. Electronic address: ogt@genomics.princeton.edu.

MeSH classifications