Population-level integration of single-cell datasets enables multi-scale analysis across samples.
Journal
Nature Methods
ISSN: 1548-7105
Abbreviated title: Nat Methods
Country: United States
NLM ID: 101215604
Publication information
Publication date: Nov 2023
History:
received: 8 Dec 2022
accepted: 5 Sep 2023
medline: 9 Nov 2023
pubmed: 10 Oct 2023
entrez: 9 Oct 2023
Status: ppublish
Abstract
The increasing generation of population-level single-cell atlases has the potential to link sample metadata with cellular data. Constructing such references requires integration of heterogeneous cohorts with varying metadata. Here we present single-cell population level integration (scPoli), an open-world learner that incorporates generative models to learn sample and cell representations for data integration, label transfer and reference mapping. We applied scPoli on population-level atlases of lung and peripheral blood mononuclear cells, the latter consisting of 7.8 million cells across 2,375 samples. We demonstrate that scPoli can explain sample-level biological and technical variations using sample embeddings revealing genes associated with batch effects and biological effects. scPoli is further applicable to single-cell sequencing assay for transposase-accessible chromatin and cross-species datasets, offering insights into chromatin accessibility and comparative genomics. We envision scPoli becoming an important tool for population-level single-cell data integration facilitating atlas use but also interpretation by means of multi-scale analyses.
Identifiers
pubmed: 37813989
doi: 10.1038/s41592-023-02035-2
pii: 10.1038/s41592-023-02035-2
pmc: PMC10630133
Chemical substances
Chromatin (registry number: 0)
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
1683-1692
Grants
Agency: Helmholtz Association
ID: ZT-I-PF-5-01
Agency: Deutsche Forschungsgemeinschaft (German Research Foundation)
ID: 458958943
Copyright information
© 2023. The Author(s).