Library size confounds biology in spatial transcriptomics data.
Journal
Genome biology
ISSN: 1474-760X
Abbreviated title: Genome Biol
Country: England
NLM ID: 100960660
Publication information
Publication date:
18 Apr 2024
History:
received: 12 Jul 2023
accepted: 9 Apr 2024
medline: 19 Apr 2024
pubmed: 19 Apr 2024
entrez: 18 Apr 2024
Status:
epublish
Abstract
Spatial molecular data has transformed the study of disease microenvironments, though larger datasets pose an analytics challenge, prompting the direct adoption of single-cell RNA-sequencing (scRNA-seq) tools, including normalization methods. Here, we demonstrate that library size is associated with tissue structure and that normalizing these effects out using commonly applied scRNA-seq normalization methods will negatively affect spatial domain identification. Spatial data should not be specifically corrected for library size prior to analysis, and algorithms designed for scRNA-seq data should be adopted with caution.
Identifiers
pubmed: 38637899
doi: 10.1186/s13059-024-03241-7
pii: 10.1186/s13059-024-03241-7
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
99
Grants
Agency: Cure Brain Cancer Foundation
ID: CBCNBCF-19-009
Agency: National Health and Medical Research Council
ID: APP2021286
Agency: National Health and Medical Research Council
ID: GNT1175653
Agency: National Health and Medical Research Council
ID: APP2021041
Agency: National Breast Cancer Foundation
ID: IIRS-23-069
Agency: National Breast Cancer Foundation
ID: IIRS-19-009
Copyright information
© 2024. The Author(s).