QOT: Efficient Computation of Sample Level Distance Matrix from Single-Cell Omics Data through Quantized Optimal Transport.
Journal
bioRxiv : the preprint server for biology
Abbreviated title: bioRxiv
Country: United States
NLM ID: 101680187
Publication information
Publication date:
06 Feb 2024
History:
medline: 19 Feb 2024
pubmed: 19 Feb 2024
entrez: 19 Feb 2024
Status:
epublish
Abstract
Single-cell technologies have emerged as a transformative approach, enabling high-dimensional characterization of cell populations at an unprecedented scale. The inherent complexity and volume of the resulting data pose significant computational and analytical challenges, especially in comparative studies that delineate cellular architectures across biological conditions (i.e., the generation of sample-level distance matrices). Optimal Transport (OT) is a mathematical tool that captures the intrinsic geometric structure of data and has been applied to many bioinformatics tasks. In this paper, we propose QOT (Quantized Optimal Transport), a new method that enables efficient computation of a sample-level distance matrix from large-scale single-cell omics data through a quantization step. We apply our algorithm to real-world single-cell genomics and pathomics datasets, aiming to extrapolate cell-level insights to inform sample-level categorizations. Our empirical study shows that QOT outperforms other OT-based algorithms in terms of accuracy and robustness when obtaining a sample-level distance matrix from high-throughput single-cell measurements. Moreover, the sample-level distance matrix can be used in downstream analysis (e.g., to uncover the trajectory of disease progression), highlighting its utility in biomedical informatics and data science.
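The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general "quantize, then transport" idea, not the authors' implementation. It assumes each sample is a list of cell feature vectors, uses plain k-means as a stand-in quantizer, and uses entropy-regularized (Sinkhorn) OT between the weighted centroids. All function names, parameters (`k`, `reg`, `iters`), and the synthetic data are hypothetical.

```python
import math
import random


def quantize(cells, k, iters=20, seed=0):
    """Lloyd's k-means: summarize one sample as (centroids, weights)."""
    rng = random.Random(seed)
    centroids = [list(c) for c in rng.sample(cells, k)]

    def nearest(cell):
        return min(range(k), key=lambda j: math.dist(cell, centroids[j]))

    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for cell in cells:
            groups[nearest(cell)].append(cell)
        for j, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]

    counts = [0] * k
    for cell in cells:
        counts[nearest(cell)] += 1
    kept = [j for j in range(k) if counts[j] > 0]  # drop zero-mass centroids
    return [centroids[j] for j in kept], [counts[j] / len(cells) for j in kept]


def sinkhorn_cost(xa, wa, xb, wb, reg=0.05, iters=200):
    """Entropy-regularized OT cost between two weighted point sets."""
    cost = [[math.dist(p, q) ** 2 for q in xb] for p in xa]
    scale = max(max(row) for row in cost)
    if scale == 0.0:
        return 0.0
    eps = reg * scale  # regularization chosen relative to the largest cost
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    na, nb = len(wa), len(wb)
    u, v = [1.0] * na, [1.0] * nb
    for _ in range(iters):  # alternate scaling to match both marginals
        u = [wa[i] / sum(kernel[i][j] * v[j] for j in range(nb)) for i in range(na)]
        v = [wb[j] / sum(kernel[i][j] * u[i] for i in range(na)) for j in range(nb)]
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j]
               for i in range(na) for j in range(nb))


def sample_distance_matrix(samples, k=4):
    """Pairwise OT costs between quantized samples (diagonal fixed at 0)."""
    summaries = [quantize(s, k) for s in samples]
    n = len(samples)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sinkhorn_cost(*summaries[i], *summaries[j])
            dist[i][j] = dist[j][i] = d
    return dist


def make_sample(center, n, seed):
    """Synthetic 2-D 'cells' drawn around a center (illustration only)."""
    rng = random.Random(seed)
    return [(rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)) for _ in range(n)]


samples = [make_sample((0, 0), 200, 1),
           make_sample((0, 0), 200, 2),
           make_sample((5, 5), 200, 3)]
D = sample_distance_matrix(samples, k=4)
```

In this toy setup the two samples drawn around the same center should end up closer to each other than either is to the shifted sample. The point of quantizing first is that each pairwise OT problem is solved over `k` centroids rather than over every cell, which is where the computational saving would come from.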
Identifiers
pubmed: 38370767
doi: 10.1101/2024.02.06.578032
pmc: PMC10871252
Publication types
Preprint
Languages
eng