Imputing dropouts for single-cell RNA sequencing based on multi-objective optimization.

Sequence Analysis, RNA; Single-Cell Analysis; Gene Expression Profiling; Software; Exome Sequencing

Journal

Bioinformatics (Oxford, England)

ISSN: 1367-4811

Abbreviated title: Bioinformatics

Country: England

NLM ID: 9808944

Publication information

Publication date:
13 06 2022

History:

received: 06 09 2021

revised: 21 04 2022

accepted: 25 04 2022

pubmed: 30 4 2022

medline: 15 11 2022

entrez: 29 4 2022

Status: ppublish

Abstract

Single-cell RNA sequencing (scRNA-seq) technologies have proven revolutionary because they enable transcriptome profiling at single-cell resolution. However, excess zeros caused by various sources of technical noise, called dropouts, can mislead downstream analyses. Accurate imputation methods are therefore crucial for addressing the dropout problem. In this article, we develop a new dropout imputation method for scRNA-seq data based on multi-objective optimization. Existing methods assume that the underlying data have a preconceived structure and impute the dropouts according to the information learned from that structure. In contrast, we assume that the data combine three types of latent structure: the horizontal structure (genes are similar to each other), the vertical structure (cells are similar to each other) and the low-rank structure. The combination weights and latent structures are learned using multi-objective optimization, and the final result is the weighted average of the observed data and the imputation results learned from the three types of structure. Comprehensive downstream experiments show the superiority of our method in terms of recovery of true gene expression profiles, differential expression analysis, cell clustering and cell trajectory inference. The R package is available at https://github.com/Zhangxf-ccnu/scMOO and https://zenodo.org/record/5785195. The code to reproduce the downstream analyses in this article can be found at https://github.com/Zhangxf-ccnu/scMOO_experiments_codes and https://zenodo.org/record/5786211. The detailed list of datasets used in the present study is presented in Supplementary Table S1 in the Supplementary materials. Supplementary data are available at Bioinformatics online.
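The combination step described in the abstract can be illustrated with a small NumPy sketch. This is not the scMOO implementation: the genes x cells orientation, the nearest-neighbor averaging used for the gene-similarity and cell-similarity estimates, the truncated SVD used for the low-rank estimate, the function names, and the hand-picked weights are all assumptions made for illustration. In particular, the article learns the weights and structures by multi-objective optimization, which this sketch does not attempt.

```python
import numpy as np


def low_rank_estimate(X, rank):
    """Best rank-`rank` approximation of X via truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]


def neighbor_average(X, k):
    """Replace each row of X by the mean of its k nearest other rows."""
    # Pairwise squared Euclidean distances between rows.
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)           # a row is not its own neighbor
    idx = np.argsort(d, axis=1)[:, :k]    # indices of the k closest rows
    return X[idx].mean(axis=1)


def combine(X, weights, rank=2, k=3):
    """Weighted average of the observed data and three structure-based estimates.

    Assumes X is genes x cells and weights = (observed, gene, cell, low-rank).
    The weights are fixed by hand here (illustrative), not learned.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                           # normalize so the weights sum to 1
    gene_est = neighbor_average(X, k)         # borrow from similar genes (rows)
    cell_est = neighbor_average(X.T, k).T     # borrow from similar cells (columns)
    lr_est = low_rank_estimate(X, rank)
    out = w[0] * X + w[1] * gene_est + w[2] * cell_est + w[3] * lr_est
    return np.maximum(out, 0.0)               # expression values cannot be negative


# Toy data: Poisson counts with about 30% of entries zeroed to mimic dropouts.
rng = np.random.default_rng(0)
X = rng.poisson(2.0, size=(20, 10)).astype(float)
X[rng.random(X.shape) < 0.3] = 0.0
X_imputed = combine(X, weights=(0.4, 0.2, 0.2, 0.2))
```

Putting all the weight on the observed data returns the input unchanged, which gives a quick sanity check on the weighting. For real analyses, the R package linked in the abstract is the reference implementation.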

Identifiers

DOI: 10.1093/bioinformatics/btac300

PMID: 35485740

PII: 6575885

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Pagination

3222-3230

Grants

Agency: National Natural Science Foundation of China

ID: 11871026

Agency: Hubei Provincial Science and Technology Innovation Base (Platform) Special Project

ID: 2020DFH002

Agency: Hong Kong Innovation and Technology Commission

Agency: Hong Kong Research Grants Council

ID: 11200818

Agency: City University of Hong Kong

ID: 9610460

Authors

Ke Jin (K)

Bo Li (B)

Hong Yan (H)

Xiao-Fei Zhang (XF)
