Stardust: improving spatial transcriptomics data analysis through space-aware modularity optimization-based clustering.
clustering
spatial transcriptomics analysis
stability scores, parameter tuning, software comparison
Journal
GigaScience
ISSN: 2047-217X
Abbreviated title: Gigascience
Country: United States
NLM ID: 101596872
Publication information
Publication date: 10 August 2022
History:
received: 14 December 2021
revised: 27 April 2022
accepted: 30 June 2022
entrez: 10 August 2022
pubmed: 11 August 2022
medline: 13 August 2022
Status: ppublish
Abstract
BACKGROUND
Spatial transcriptomics (ST) combines stained tissue images with spatially resolved high-throughput RNA sequencing. ST analysis includes challenging tasks such as clustering, in which data points (spots) are partitioned according to a similarity measure. Improving clustering results is important because clustering affects all downstream analysis. State-of-the-art approaches group spots by transcriptional similarity, and some also exploit spatial information. However, it is not yet clear how much combining spatial information with transcriptomics improves the clustering result.
RESULTS
We propose a new clustering method, Stardust, that combines spatial and transcriptomic information in the clustering procedure through manual or fully automatic tuning of the algorithm parameters. A parameter-free version of the method is also provided, in which the spatial contribution depends dynamically on the distribution of expression distances in space. We evaluated the proposed method on ST data sets available on the 10x Genomics website and compared its clustering performance with state-of-the-art approaches by measuring the stability of spots within clusters and the clusters' biological coherence. Stability is defined as the tendency of each spot to remain clustered with the same neighbors when perturbations are applied.
CONCLUSIONS
Stardust is an easy-to-use method that lets users define how much spatial information should influence clustering on different tissues, and it achieves more stable results than state-of-the-art approaches.
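The two ideas described in the abstract, a tunable spatial contribution to the spot-to-spot distance and a per-spot stability score under perturbation, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the Stardust package: the function names `combined_distance` and `stability_scores`, the linear combination with min-max rescaling of the spatial term, the use of subsampling as the perturbation, and the Jaccard overlap as the measure of "same neighbors". The published method may define each of these differently.

```python
import numpy as np


def pairwise_euclidean(x):
    """Dense Euclidean distance matrix between the rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def combined_distance(expr, coords, space_weight):
    """Expression distance plus a weighted, rescaled spatial distance.

    ASSUMPTION: a linear combination in which the spatial distances are
    min-max rescaled to the range of the expression distances. This is
    an illustrative choice, not the formula from the paper.
    """
    d_expr = pairwise_euclidean(expr)
    d_space = pairwise_euclidean(coords)
    span = d_space.max() - d_space.min()
    if span > 0:
        d_space = (d_space - d_space.min()) / span * d_expr.max()
    return d_expr + space_weight * d_space


def stability_scores(reference, perturbed_runs):
    """Per-spot stability as the mean Jaccard overlap of co-clustered spots.

    reference: 1-D integer array with one cluster label per spot.
    perturbed_runs: list of (kept_idx, labels) pairs, one per perturbation.
        kept_idx holds the unique indices of the spots that were kept, and
        labels[i] is the cluster assigned to spot kept_idx[i] in that run.
    Spots that never appear in a perturbed run get NaN.
    """
    n = len(reference)
    total = np.zeros(n)
    counts = np.zeros(n)
    for kept_idx, labels in perturbed_runs:
        ref_sub = reference[kept_idx]
        # Boolean co-membership matrices restricted to the kept spots.
        same_ref = ref_sub[:, None] == ref_sub[None, :]
        same_new = labels[:, None] == labels[None, :]
        # Subtract 1 so a spot is not counted as its own neighbor.
        inter = (same_ref & same_new).sum(axis=1) - 1
        union = (same_ref | same_new).sum(axis=1) - 1
        # A spot that is alone in both partitions counts as fully stable.
        jaccard = np.where(union > 0, inter / np.maximum(union, 1), 1.0)
        total[kept_idx] += jaccard
        counts[kept_idx] += 1
    return np.where(counts > 0, total / np.maximum(counts, 1), np.nan)
```

In this sketch, setting `space_weight` to 0 reduces the matrix to pure expression distance, and larger values let tissue position influence the grouping more strongly. The resulting matrix would be passed to whatever clustering routine is in use, and the labels from repeated perturbed runs would then be fed to `stability_scores` together with the labels from the unperturbed run.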
Identifiers
pubmed: 35946989
pii: 6659721
doi: 10.1093/gigascience/giac075
pmc: PMC9364686
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Copyright information
© The Author(s) 2022. Published by Oxford University Press GigaScience.