Compound models and Pearson residuals for normalization of single-cell RNA-seq data without UMIs.

Journal

bioRxiv: the preprint server for biology

Abbreviated title: bioRxiv

Country: United States

NLM ID: 101680187

Publication information

Publication date:
05 Aug 2023

History:

pubmed: 14 Aug 2023

medline: 14 Aug 2023

entrez: 14 Aug 2023

Status: epublish

Abstract

Before downstream analysis can reveal biological signals in single-cell RNA sequencing data, normalization and variance stabilization are required to remove technical noise. Recently, Pearson residuals based on negative binomial models have been suggested as an efficient normalization approach. These methods were developed for UMI-based sequencing protocols, where unique molecular identifiers (UMIs) help to remove PCR amplification noise by keeping track of the original molecules. In contrast, full-length protocols such as Smart-seq2 lack UMIs and retain amplification noise, making negative binomial models inapplicable. Here, we extend Pearson residuals to such read count data by modeling them as a compound process: we assume that the captured RNA molecules follow the negative binomial distribution, but are replicated according to an amplification distribution. Based on this model, we introduce
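The compound construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the parameter values, and in particular the amplification distribution (one plus a Poisson draw per captured molecule) are assumptions chosen only to keep the example short. The moments follow from the laws of total expectation and total variance for a sum of K independent amplification draws, where K is negative binomial with mean mu and overdispersion theta.

```python
import numpy as np


def compound_moments(mu, theta, amp_mean, amp_var):
    """Mean and variance of a read count X = Z_1 + ... + Z_K.

    K ~ NB(mean=mu, overdispersion=theta) is the number of captured molecules;
    the Z_i are i.i.d. amplification factors with mean amp_mean and variance amp_var.
    """
    nb_var = mu + mu**2 / theta                      # Var(K) for the negative binomial
    mean = mu * amp_mean                             # E[X] = E[K] * E[Z]
    var = mu * amp_var + nb_var * amp_mean**2        # Var(X) = E[K] Var(Z) + Var(K) E[Z]^2
    return mean, var


def pearson_residuals(counts, mean, var):
    """Standardize counts by the model mean and standard deviation."""
    return (np.asarray(counts, dtype=float) - mean) / np.sqrt(var)


def simulate_reads(rng, mu, theta, amp_lam, size):
    """Draw read counts under the ASSUMED amplification law Z = 1 + Poisson(amp_lam)."""
    # NumPy parameterizes the NB by (n, p); n=theta, p=theta/(theta+mu) gives mean mu.
    molecules = rng.negative_binomial(theta, theta / (theta + mu), size=size)
    # A sum of K copies of (1 + Poisson(lam)) equals K + Poisson(K * lam).
    return molecules + rng.poisson(molecules * amp_lam)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu, theta, amp_lam = 5.0, 10.0, 3.0              # hypothetical parameter values
    reads = simulate_reads(rng, mu, theta, amp_lam, size=200_000)
    # For Z = 1 + Poisson(lam): E[Z] = 1 + lam and Var(Z) = lam.
    mean, var = compound_moments(mu, theta, 1.0 + amp_lam, amp_lam)
    residuals = pearson_residuals(reads, mean, var)
    print(residuals.mean(), residuals.var())
```

If the assumed model matches the data-generating process, the residuals should have mean close to 0 and variance close to 1, which is the sense in which Pearson residuals stabilize variance. Setting the amplification mean to 1 and its variance to 0 reduces the formula to the usual negative binomial Pearson residual for UMI counts.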

Identifiers

DOI: 10.1101/2023.08.02.551637 PMID: 37577688 PMC: PMC10418209

Publication types

Preprint

Languages

eng

Authors

Jan Lause (J)

Christoph Ziegenhain (C)

Leonard Hartmanis (L)

Philipp Berens (P)

Dmitry Kobak (D)

MeSH classifications