Detection of circulating tumor cells by means of machine learning using Smart-Seq2 sequencing.
Artificial intelligence
CTC
Circulating tumor cells
Machine learning
Metastatic cancer
Single-cell sequencing
scRNA-seq
Journal
Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date:
14 May 2024
History:
received: 15 December 2023
accepted: 6 May 2024
medline: 15 May 2024
pubmed: 15 May 2024
entrez: 14 May 2024
Status:
epublish
Abstract
Circulating tumor cells (CTCs) are tumor cells that detach from a solid tumor and enter the bloodstream, and they can cause metastasis. Detection and enumeration of CTCs show promise as a predictor of prognosis in cancer patients. Single-cell sequencing provides genetic information from individual cells and allows them to be classified precisely and reliably. Sequencing data typically comprise thousands of gene expression reads per cell, which artificial intelligence algorithms can analyze accurately. This work presents machine-learning-based classifiers that differentiate CTCs from peripheral blood mononuclear cells (PBMCs) based on single-cell RNA sequencing data. We developed four tree-based models and trained and tested them on two datasets: one consisting of Smart-Seq2 sequenced data from primary tumor sections of breast cancer patients and PBMCs, and a public dataset with manually annotated CTC expression profiles from 34 metastatic breast cancer patients, including triple-negative breast cancer. Our best models achieved about 95% balanced accuracy on the CTC test set on a per-cell basis, correctly detecting 133 out of 138 CTCs and CTC-PBMC clusters. Given the non-invasive character of liquid biopsy and the accuracy of these results, we conclude that this work has potential application value.
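The balanced accuracy quoted in the abstract is the mean of sensitivity (recall on CTCs) and specificity (recall on PBMCs). The following minimal sketch shows that arithmetic. Only the CTC counts (133 of 138) come from the abstract; the PBMC counts and the helper name `balanced_accuracy` are hypothetical and serve only to make the example runnable, so the printed score is not a result from the paper.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# From the abstract: 133 of 138 CTCs and CTC-PBMC clusters were detected.
ctc_detected, ctc_total = 133, 138
sensitivity = ctc_detected / ctc_total

# HYPOTHETICAL PBMC counts, not taken from the paper; illustration only.
pbmc_correct, pbmc_total = 90, 100

score = balanced_accuracy(
    tp=ctc_detected,
    fn=ctc_total - ctc_detected,
    tn=pbmc_correct,
    fp=pbmc_total - pbmc_correct,
)

print(f"CTC sensitivity: {sensitivity:.3f}")
print(f"Balanced accuracy (with hypothetical PBMC counts): {score:.3f}")
```

Balanced accuracy is a common choice when classes are imbalanced, because each class contributes equally to the score regardless of how many cells it contains.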
Identifiers
pubmed: 38744942
doi: 10.1038/s41598-024-61378-8
pii: 10.1038/s41598-024-61378-8
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
11057
Grants
Agency: Narodowe Centrum Nauki
ID: OPUS/2020/37/B/NZ7/020
Agency: Narodowe Centrum Badań i Rozwoju
ID: No. 0059/L-11/2019
Copyright information
© 2024. The Author(s).