Detection of circulating tumor cells by means of machine learning using Smart-Seq2 sequencing.


Journal

Scientific Reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
14 05 2024
History:
received: 15 12 2023
accepted: 06 05 2024
medline: 15 5 2024
pubmed: 15 5 2024
entrez: 14 5 2024
Status: epublish

Abstract

Circulating tumor cells (CTCs) are tumor cells that separate from the solid tumor and enter the bloodstream, a process that can lead to metastasis. Detection and enumeration of CTCs show promise as a predictor of prognosis in cancer patients. Single-cell sequencing provides genetic information from individual cells and allows them to be classified precisely and reliably. Sequencing data typically comprise thousands of gene expression reads per cell, which artificial intelligence algorithms can analyze accurately. This work presents machine-learning-based classifiers that differentiate CTCs from peripheral blood mononuclear cells (PBMCs) based on single-cell RNA sequencing data. We developed four tree-based models and trained and tested them on two datasets: one consisting of Smart-Seq2 sequenced data from primary tumor sections of breast cancer patients and PBMCs, and a public dataset with manually annotated CTC expression profiles from 34 metastatic breast cancer patients, including triple-negative breast cancer. Our best models achieved about 95% balanced accuracy on the CTC test set on a per-cell basis, correctly detecting 133 out of 138 CTCs and CTC-PBMC clusters. Given the non-invasive character of liquid biopsy and the accuracy of these results, we conclude that this approach has potential application value.
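
The balanced accuracy quoted in the abstract is the mean of sensitivity (the fraction of CTCs correctly detected) and specificity (the fraction of PBMCs correctly rejected). The sketch below shows the arithmetic only and is not the authors' evaluation code. The CTC counts (133 of 138) come from the abstract; the PBMC counts are hypothetical placeholders, because the abstract does not report them.

```python
def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Return the mean of sensitivity and specificity for a binary classifier."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("each class needs at least one sample")
    sensitivity = tp / (tp + fn)  # positives (CTCs) correctly detected
    specificity = tn / (tn + fp)  # negatives (PBMCs) correctly rejected
    return (sensitivity + specificity) / 2


# Reported in the abstract: 133 of 138 CTCs and CTC-PBMC clusters detected.
ctc_detected = 133
ctc_missed = 138 - 133

# ASSUMPTION: the PBMC counts below are invented for illustration only.
pbmc_correct = 94
pbmc_wrong = 6

sensitivity = ctc_detected / (ctc_detected + ctc_missed)
score = balanced_accuracy(ctc_detected, ctc_missed, pbmc_correct, pbmc_wrong)

print(f"sensitivity (from the abstract's counts): {sensitivity:.3f}")
print(f"balanced accuracy (hypothetical PBMC counts): {score:.3f}")
```

Because each class contributes equally to the mean, this metric is not inflated when one cell type greatly outnumbers the other, which is the usual situation when rare CTCs are sought among blood cells.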

Identifiers

pubmed: 38744942
doi: 10.1038/s41598-024-61378-8
pii: 10.1038/s41598-024-61378-8

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

11057

Grants

Agency: Narodowe Centrum Nauki
ID: OPUS/2020/37/B/NZ7/020
Agency: Narodowe Centrum Badań i Rozwoju
ID: No. 0059/L-11/2019

Copyright information

© 2024. The Author(s).

Authors

Krzysztof Pastuszak (K)

Faculty of Electronics, Telecommunication and Informatics, Gdańsk University of Technology, Gabriela Narutowicza 11/12, 80-233, Gdańsk, Poland. krzpastu@pg.edu.pl.
Laboratory of Translational Oncology, Intercollegiate Faculty of Biotechnology, Medical University of Gdańsk, Marii Skłodowskiej-Curie 3a, 80-210, Gdańsk, Poland. krzpastu@pg.edu.pl.
Centre of Biostatistics and Bioinformatics, Medical University of Gdańsk, Marii Skłodowskiej-Curie 3a, 80-210, Gdańsk, Poland. krzpastu@pg.edu.pl.

Michał Sieczczyński (M)

Centre of Biostatistics and Bioinformatics, Medical University of Gdańsk, Marii Skłodowskiej-Curie 3a, 80-210, Gdańsk, Poland.

Marta Dzięgielewska (M)

Faculty of Electronics, Telecommunication and Informatics, Gdańsk University of Technology, Gabriela Narutowicza 11/12, 80-233, Gdańsk, Poland.

Rafał Wolniak (R)

Faculty of Electronics, Telecommunication and Informatics, Gdańsk University of Technology, Gabriela Narutowicza 11/12, 80-233, Gdańsk, Poland.

Agata Drewnowska (A)

Faculty of Electronics, Telecommunication and Informatics, Gdańsk University of Technology, Gabriela Narutowicza 11/12, 80-233, Gdańsk, Poland.

Marcel Korpal (M)

Faculty of Electronics, Telecommunication and Informatics, Gdańsk University of Technology, Gabriela Narutowicza 11/12, 80-233, Gdańsk, Poland.

Laura Zembrzuska (L)

Faculty of Electronics, Telecommunication and Informatics, Gdańsk University of Technology, Gabriela Narutowicza 11/12, 80-233, Gdańsk, Poland.

Anna Supernat (A)

Laboratory of Translational Oncology, Intercollegiate Faculty of Biotechnology, Medical University of Gdańsk, Marii Skłodowskiej-Curie 3a, 80-210, Gdańsk, Poland.
Centre of Biostatistics and Bioinformatics, Medical University of Gdańsk, Marii Skłodowskiej-Curie 3a, 80-210, Gdańsk, Poland.

Anna J Żaczek (AJ)

Laboratory of Translational Oncology, Intercollegiate Faculty of Biotechnology, Medical University of Gdańsk, Marii Skłodowskiej-Curie 3a, 80-210, Gdańsk, Poland. azaczek@gumed.edu.pl.
