Population-level integration of single-cell datasets enables multi-scale analysis across samples.


Journal

Nature methods
ISSN: 1548-7105
Abbreviated title: Nat Methods
Country: United States
NLM ID: 101215604

Publication information

Publication date:
Nov 2023
History:
received: 08 12 2022
accepted: 05 09 2023
medline: 9 11 2023
pubmed: 10 10 2023
entrez: 9 10 2023
Status: ppublish

Abstract

The increasing generation of population-level single-cell atlases has the potential to link sample metadata with cellular data. Constructing such references requires integration of heterogeneous cohorts with varying metadata. Here we present single-cell population level integration (scPoli), an open-world learner that incorporates generative models to learn sample and cell representations for data integration, label transfer and reference mapping. We applied scPoli to population-level atlases of lung and peripheral blood mononuclear cells, the latter consisting of 7.8 million cells across 2,375 samples. We demonstrate that scPoli can explain sample-level biological and technical variation using sample embeddings, revealing genes associated with batch effects and biological effects. scPoli is further applicable to single-cell sequencing assay for transposase-accessible chromatin and cross-species datasets, offering insights into chromatin accessibility and comparative genomics. We envision scPoli becoming an important tool for population-level single-cell data integration, facilitating not only atlas use but also interpretation by means of multi-scale analyses.
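
The abstract describes label transfer by an open-world learner, that is, one that can decline to label cells it does not recognize. The sketch below illustrates that general idea only and is not scPoli's implementation: it assumes cells have already been embedded in some latent space, summarizes each reference cell type by the mean of its cells (a "prototype"), and labels a query cell by its nearest prototype unless the distance exceeds a threshold. The function names, the toy coordinates and the `max_dist` threshold are all hypothetical.

```python
import numpy as np


def fit_prototypes(latent, labels):
    """Return (class_names, prototypes): one mean latent vector per label."""
    labels = np.asarray(labels)
    classes = sorted(set(labels.tolist()))
    protos = np.stack([latent[labels == c].mean(axis=0) for c in classes])
    return classes, protos


def transfer_labels(query, classes, protos, max_dist):
    """Label each query cell by its nearest prototype, or 'unknown' if too far."""
    # Pairwise Euclidean distances, shape (n_query, n_classes).
    dists = np.linalg.norm(query[:, None, :] - protos[None, :, :], axis=-1)
    nearest = dists.argmin(axis=1)
    best = dists[np.arange(len(query)), nearest]
    return [classes[i] if d <= max_dist else "unknown"
            for i, d in zip(nearest, best)]


# Toy 2-D "latent" coordinates (made up for illustration, not real data).
ref = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.0, 5.2]])
ref_labels = ["B cell", "B cell", "T cell", "T cell"]
classes, protos = fit_prototypes(ref, ref_labels)

query = np.array([[0.1, 0.1], [4.9, 5.0], [20.0, -20.0]])
predicted = transfer_labels(query, classes, protos, max_dist=1.0)
print(predicted)
```

In this toy setup the third query cell lies far from both prototypes and is left unlabeled, which is the behavior an open-world setting is meant to allow; in practice the threshold would have to be chosen from the data.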

Identifiers

pubmed: 37813989
doi: 10.1038/s41592-023-02035-2
pii: 10.1038/s41592-023-02035-2
pmc: PMC10630133

Chemical substances

Chromatin 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

1683-1692

Grants

Agency: Helmholtz Association
ID: ZT-I-PF-5-01
Agency: Deutsche Forschungsgemeinschaft (German Research Foundation)
ID: 458958943

Copyright information

© 2023. The Author(s).

Authors

Carlo De Donno (C)

Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany.
School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany.

Soroor Hediyeh-Zadeh (S)

Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany.

Amir Ali Moinfar (AA)

Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany.
School of Computing, Information and Technology, Technical University of Munich, Munich, Germany.

Marco Wagenstetter (M)

Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany.

Luke Zappia (L)

Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany.
School of Computing, Information and Technology, Technical University of Munich, Munich, Germany.

Mohammad Lotfollahi (M)

Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany. ml19@sanger.ac.uk.
Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK. ml19@sanger.ac.uk.

Fabian J Theis (FJ)

Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany. fabian.theis@helmholtz-munich.de.
School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany. fabian.theis@helmholtz-munich.de.
School of Computing, Information and Technology, Technical University of Munich, Munich, Germany. fabian.theis@helmholtz-munich.de.
Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK. fabian.theis@helmholtz-munich.de.
