SC-JNMF: single-cell clustering integrating multiple quantification methods based on joint non-negative matrix factorization.

Keywords: Clustering; Non-negative matrix factorization; RNA-seq; Single-cell

Journal

PeerJ
ISSN: 2167-8359
Abbreviated title: PeerJ
Country: United States
NLM ID: 101603425

Publication information

Publication date:
2021
History:
received: 10 03 2021
accepted: 07 08 2021
entrez: 17 9 2021
pubmed: 18 9 2021
medline: 18 9 2021
Status: epublish

Abstract

Single-cell RNA-sequencing is a rapidly evolving technology that enables us to understand biological processes at unprecedented resolution. Single-cell expression analysis requires a complex data processing pipeline, which is divided into two main parts: the quantification part, which converts the sequence information into gene-cell matrix data, and the analysis part, which analyzes the matrix data using statistics and/or machine learning techniques. In the analysis part, unsupervised cell clustering plays an important role in identifying cell types and discovering cell diversity and subpopulations. Identified cell clusters are also used for subsequent analyses, such as finding differentially expressed genes and inferring cell trajectories. However, clustering results are greatly affected by the quantification method used in the upstream process: even if the original RNA-sequence data are the same, gene expression profiles processed by different quantification methods will produce different clusters. In this article, we propose a robust and highly accurate clustering method based on joint non-negative matrix factorization (joint-NMF) that utilizes the information from multiple gene expression profiles quantified using different methods from the same RNA-sequence data. Our joint-NMF extracts common factors among multiple gene expression profiles by applying an NMF to each profile under the constraint that one of the factorized matrices is shared among all the NMFs. By leveraging multiple quantification methods, the joint-NMF yields more robust and accurate cell clustering results than conventional clustering methods, which use only a single gene expression profile. Additionally, we show that the features extracted by our method are useful for discovering marker genes.
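The shared-factor constraint described in the abstract can be illustrated with a minimal sketch. This is not the authors' SC-JNMF implementation: it assumes a plain Frobenius-norm objective with standard multiplicative updates, no regularization or weighting between profiles, and genes-by-cells input matrices that share the same cell order. The names `joint_nmf`, `reconstruction_error`, and `cluster_labels` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def joint_nmf(matrices, k, n_iter=200, eps=1e-10, seed=0):
    """Factorize each X_i (genes_i x cells) as W_i @ H with one shared H.

    Illustrative assumption: unweighted sum of squared errors, solved with
    multiplicative updates. The published method may differ.
    """
    matrices = [np.asarray(X, dtype=float) for X in matrices]
    n_cells = matrices[0].shape[1]
    if any(X.shape[1] != n_cells for X in matrices):
        raise ValueError("all matrices must have the same number of cells")

    rng = np.random.default_rng(seed)
    # One gene-side factor per quantification method, one shared cell-side factor.
    Ws = [rng.random((X.shape[0], k)) + eps for X in matrices]
    H = rng.random((k, n_cells)) + eps

    for _ in range(n_iter):
        # Update each W_i with H held fixed.
        for i, X in enumerate(matrices):
            Ws[i] *= (X @ H.T) / (Ws[i] @ (H @ H.T) + eps)
        # Update the shared H using contributions from every profile.
        numer = sum(W.T @ X for W, X in zip(Ws, matrices))
        denom = sum(W.T @ W for W in Ws) @ H + eps
        H *= numer / denom
    return Ws, H


def reconstruction_error(matrices, Ws, H):
    """Sum of squared Frobenius errors over all profiles."""
    return sum(np.linalg.norm(np.asarray(X, dtype=float) - W @ H) ** 2
               for X, W in zip(matrices, Ws))


def cluster_labels(H):
    """Assign each cell to the factor with the largest coefficient."""
    return np.argmax(H, axis=0)


# Toy usage: two synthetic "quantifications" of the same 20 cells.
rng = np.random.default_rng(1)
H_true = rng.random((3, 20))
X_a = rng.random((30, 3)) @ H_true  # e.g. profile from method A (synthetic)
X_b = rng.random((25, 3)) @ H_true  # e.g. profile from method B (synthetic)
Ws, H = joint_nmf([X_a, X_b], k=3)
labels = cluster_labels(H)
```

Taking the arg-max over the columns of the shared matrix is only one simple way to turn the factorization into cluster labels; other choices, such as running a separate clustering algorithm on the columns of `H`, are equally plausible under this sketch.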

Identifiers

pubmed: 34532161
doi: 10.7717/peerj.12087
pii: 12087
pmc: PMC8404576

Publication types

Journal Article

Languages

eng

Pagination

e12087

Copyright information

©2021 Shiga et al.

Conflict of interest statement

The authors declare there are no competing interests.

Authors

Mikio Shiga (M)

Graduate School of Information Science and Technology, Osaka University, Osaka, Japan.

Shigeto Seno (S)

Graduate School of Information Science and Technology, Osaka University, Osaka, Japan.

Makoto Onizuka (M)

Graduate School of Information Science and Technology, Osaka University, Osaka, Japan.

Hideo Matsuda (H)

Graduate School of Information Science and Technology, Osaka University, Osaka, Japan.

MeSH classifications