Single-cell gene fusion detection by scFusion.
Journal
Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555
Publication information
Publication date:
28 02 2022
History:
received: 23 03 2021
accepted: 03 02 2022
entrez: 1 3 2022
pubmed: 2 3 2022
medline: 13 4 2022
Status:
epublish
Abstract
Gene fusions can play important roles in tumor initiation and progression. While fusion detection so far has been from bulk samples, full-length single-cell RNA sequencing (scRNA-seq) offers the possibility of detecting gene fusions at the single-cell level. However, scRNA-seq data have a high noise level and contain various technical artifacts that can lead to spurious fusion discoveries. Here, we present a computational tool, scFusion, for gene fusion detection based on scRNA-seq. We evaluate the performance of scFusion using simulated and five real scRNA-seq datasets and find that scFusion can efficiently and sensitively detect fusions with a low false discovery rate. In a T cell dataset, scFusion detects the invariant TCR gene recombinations in mucosal-associated invariant T cells that many methods developed for bulk data fail to detect; in a multiple myeloma dataset, scFusion detects the known recurrent fusion IgH-WHSC1, which is associated with overexpression of the WHSC1 oncogene. Our results demonstrate that scFusion can be used to investigate cellular heterogeneity of gene fusions and their transcriptional impact at the single-cell level.
Identifiers
pubmed: 35228538
doi: 10.1038/s41467-022-28661-6
pii: 10.1038/s41467-022-28661-6
pmc: PMC8885711
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
1084
Grants
Agency: National Natural Science Foundation of China (National Science Foundation of China)
ID: 11971039
Copyright information
© 2022. The Author(s).
References
Rowley, J. D. Identificaton of a translocation with quinacrine fluorescence in a patient with acute leukemia. Annal. Genetique 16, 109–112 (1973).
Nowell, P. C. & Hungerford, D. A. Chromosome studies on normal and leukemic human leukocytes. J. Natl Cancer Inst. 25, 85–109 (1960).
pubmed: 14427847
Demichelis, F. et al. TMPRSS2:ERG gene fusion associated with lethal prostate cancer in a watchful waiting cohort. Oncogene 26, 4596–4599 (2007).
pubmed: 17237811
doi: 10.1038/sj.onc.1210237
Choi, Y. L. et al. EML4-ALK mutations in lung cancer that confer resistance to ALK inhibitors. N. Engl. J. Med. 363, 1734–1739 (2010).
pubmed: 20979473
doi: 10.1056/NEJMoa1007478
O’Hare, T. et al. In vitro activity of Bcr-Abl inhibitors AMN107 and BMS-354825 against clinically relevant imatinib-resistant Abl kinase domain mutants. Cancer Res. 65, 4500–4505 (2005).
pubmed: 15930265
doi: 10.1158/0008-5472.CAN-05-0259
Shaw, A. T. et al. Crizotinib versus chemotherapy in advanced ALK-positive lung cancer. N. Engl. J. Med. 368, 2385–2394 (2013).
pubmed: 23724913
doi: 10.1056/NEJMoa1214886
Laetsch, T. W. et al. Larotrectinib for paediatric solid tumours harbouring NTRK gene fusions: phase 1 results from a multicentre, open-label, phase 1/2 study. Lancet Oncol. 19, 705–714 (2018).
pubmed: 29606586
pmcid: 5949072
doi: 10.1016/S1470-2045(18)30119-0
Picelli, S. et al. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat. Methods 10, 1096–1098 (2013).
pubmed: 24056875
doi: 10.1038/nmeth.2639
Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181 (2014).
pubmed: 24385147
doi: 10.1038/nprot.2014.006
Ramskold, D. et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat. Biotechnol. 30, 777–782 (2012).
pubmed: 22820318
pmcid: 3467340
doi: 10.1038/nbt.2282
Kharchenko, P. V. The triumphs and limitations of computational methods for scRNA-seq. Nat. Methods 18, 723–732 (2021).
pubmed: 34155396
doi: 10.1038/s41592-021-01171-x
Chen, K. et al. BreakFusion: targeted assembly-based identification of gene fusions in whole transcriptome paired-end sequencing data. Bioinformatics 28, 1923–1924 (2012).
pubmed: 22563071
pmcid: 3389765
doi: 10.1093/bioinformatics/bts272
Nicorici, D. et al. FusionCatcher—a tool for finding somatic fusion genes in paired-end RNA-sequencing data. BioRxiv https://doi.org/10.1101/011650 (2014).
Davidson, N. M., Majewski, I. J. & Oshlack, A. JAFFA: High sensitivity transcriptome-focused fusion gene detection. Genome Med. 7, 43 (2015).
pubmed: 26019724
pmcid: 4445815
doi: 10.1186/s13073-015-0167-x
Francis, R. W. et al. FusionFinder: a software tool to identify expressed gene fusion candidates from RNA-Seq data. PLoS ONE 7, e39987 (2012).
pubmed: 22761941
pmcid: 3384600
doi: 10.1371/journal.pone.0039987
Li, Y., Chien, J., Smith, D. I. & Ma, J. FusionHunter: identifying fusion transcripts in cancer using paired-end RNA-seq. Bioinformatics 27, 1708–1710 (2011).
pubmed: 21546395
doi: 10.1093/bioinformatics/btr265
McPherson, A. et al. deFuse: an algorithm for gene fusion discovery in tumor RNA-Seq data. PLoS Comput. Biol. 7, e1001138 (2011).
pubmed: 21625565
pmcid: 3098195
doi: 10.1371/journal.pcbi.1001138
Benelli, M. et al. Discovering chimeric transcripts in paired-end RNA-seq data by using EricScript. Bioinformatics 28, 3232–3239 (2012).
pubmed: 23093608
doi: 10.1093/bioinformatics/bts617
Uhrig, S. et al. Accurate and efficient detection of gene fusions from RNA sequencing data. Genome Res. 31, 448–460 (2021).
pubmed: 33441414
pmcid: 7919457
doi: 10.1101/gr.257246.119
Haas, B. J. et al. Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods. Genome Biol. 20, 213 (2019).
pubmed: 31639029
pmcid: 6802306
doi: 10.1186/s13059-019-1842-9
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
pubmed: 23104886
doi: 10.1093/bioinformatics/bts635
Ashurst, J. L. et al. The vertebrate genome annotation (Vega) database. Nucleic Acids Res. 33, D459–D465 (2005).
pubmed: 15608237
doi: 10.1093/nar/gki135
Zhang, Q. et al. Landscape and dynamics of single immune cells in hepatocellular carcinoma. Cell 179, 829–845.e820 (2019).
Sun, T., Song, D., Li, W. V. & Li, J. J. scDesign2: a transparent simulator that generates high-fidelity single-cell gene expression count data with gene correlations captured. Genome Biol. 22, 1–37 (2021).
Risso, D., Perraudeau, F., Gribkova, S., Dudoit, S. & Vert, J. P. A general and flexible method for signal extraction from single-cell RNA-seq data. Nat. Commun. 9, 284 (2018).
pubmed: 29348443
pmcid: 5773593
doi: 10.1038/s41467-017-02554-5
Townes, F. W., Hicks, S. C., Aryee, M. J. & Irizarry, R. A. Feature selection and dimension reduction for single-cell RNA-Seq based on a multinomial model. Genome Biol. 20, 1–16 (2019).
doi: 10.1186/s13059-019-1861-6
Sarkar, A. & Stephens, M. Separating measurement and expression models clarifies confusion in single-cell RNA sequencing analysis. Nat. Genet. 53, 770–777 (2021).
pubmed: 34031584
pmcid: 8370014
doi: 10.1038/s41588-021-00873-4
Schuster, M. & Paliwal, K. K. Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 45, 2673–2681 (1997).
doi: 10.1109/78.650093
Quang, D. & Xie, X. DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences. Nucleic Acids Res. 44, e107–e107 (2016).
pubmed: 27084946
pmcid: 4914104
doi: 10.1093/nar/gkw226
Yang, L. et al. Single-cell RNA-seq of esophageal squamous cell carcinoma cell line with fractionated irradiation reveals radioresistant gene expression patterns. BMC Genomics 20, 611 (2019).
pubmed: 31345182
pmcid: 6659267
doi: 10.1186/s12864-019-5970-0
Horning, A. M. et al. Single-cell RNA-seq reveals a subpopulation of prostate cancer cells with enhanced cell-cycle-related transcription and attenuated androgen response. Cancer Res. 78, 853–864 (2018).
pubmed: 29233929
doi: 10.1158/0008-5472.CAN-17-1924
Fan, J. et al. Linking transcriptional and genetic tumor heterogeneity through allele analysis of single-cell RNA-seq data. Genome Res. 28, 1217–1227 (2018).
pubmed: 29898899
pmcid: 6071640
doi: 10.1101/gr.228080.117
Jang, J. S. et al. Molecular signatures of multiple myeloma progression through single cell RNA-Seq. Blood Cancer J. 9, 2 (2019).
pubmed: 30607001
pmcid: 6318319
doi: 10.1038/s41408-018-0160-x
Krivtsov, A. V. et al. A menin-MLL inhibitor induces specific chromatin changes and eradicates disease in models of MLL-rearranged leukemia. Cancer Cell 36, 660–673.e611 (2019).
pubmed: 31821784
pmcid: 7227117
doi: 10.1016/j.ccell.2019.11.001
Calabrese, C. et al. Genomic basis for RNA alterations in cancer. Nature 578, 129–136 (2020).
pubmed: 32025019
pmcid: 7054216
doi: 10.1038/s41586-020-1970-0
Campbell, P. J. et al. Pan-cancer analysis of whole genomes. Nature 578, 82–93 (2020).
doi: 10.1038/s41586-020-1969-6
Haas, B. J. et al. Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons. Genome Res. 21, 494–504 (2011).
pubmed: 21212162
pmcid: 3044863
doi: 10.1101/gr.112730.110
He, M. X. et al. Transcriptional mediators of treatment resistance in lethal prostate cancer. Nat. Med. 27, 426–433 (2021).
pubmed: 33664492
pmcid: 7960507
doi: 10.1038/s41591-021-01244-6
Zheng, C. et al. Landscape of infiltrating T cells in liver cancer revealed by single-cell sequencing. Cell 169, 1342–1356.e1316 (2017).
pubmed: 28622514
doi: 10.1016/j.cell.2017.05.035
Rudak, P. T., Yao, T., Richardson, C. D. & Haeryfar, S. Measles virus infects and programs MAIT cells for apoptosis. J. Infect. Dis. 223, 667–672 (2020).
Godfrey, D. I., Koay, H.-F., McCluskey, J. & Gherardin, N. A. The biology and functional importance of MAIT cells. Nat. Immunol. 20, 1110–1128 (2019).
pubmed: 31406380
doi: 10.1038/s41590-019-0444-8
Barwick, B. G. et al. Multiple myeloma immunoglobulin lambda translocations portend poor prognosis. Nat. Commun. 10, 1911 (2019).
pubmed: 31015454
pmcid: 6478743
doi: 10.1038/s41467-019-09555-6
Bergsagel, P. L. et al. Promiscuous translocations into immunoglobulin heavy chain switch regions in multiple myeloma. Proc. Natl Acad. Sci. USA 93, 13931–13936 (1996).
pubmed: 8943038
pmcid: 19472
doi: 10.1073/pnas.93.24.13931
Forbes, S. A. et al. COSMIC: exploring the world’s knowledge of somatic mutations in human cancer. Nucleic Acids Res. 43, D805–D811 (2015).
pubmed: 25355519
doi: 10.1093/nar/gku1075
Stec, I. et al. WHSC1, a 90 kb SET domain-containing gene, expressed in early development and homologous to a Drosophila dysmorphy gene maps in the Wolf-Hirschhorn syndrome critical region and is fused to IgH in t(4;14) multiple myeloma. Hum. Mol. Genet. 7, 1071–1082 (1998).
pubmed: 9618163
doi: 10.1093/hmg/7.7.1071
Santra, M., Zhan, F., Tian, E., Barlogie, B. & Shaughnessy, J. Jr. A subset of multiple myeloma harboring the t(4;14)(p16;q32) translocation lacks FGFR3 expression but maintains an IGH/MMSET fusion transcript. Blood J. Am. Soc. Hematol. 101, 2374–2376 (2003).
Malgeri, U. et al. Detection of t(4;14)(p16.3;q32) chromosomal translocation in multiple myeloma by reverse transcription-polymerase chain reaction analysis of IGH-MMSET fusion transcripts. Cancer Res. 60, 4058–4061 (2000).
pubmed: 10945609
Kuo, A. J. et al. NSD2 links dimethylation of histone H3 at lysine 36 to oncogenic programming. Mol. Cell 44, 609–620 (2011).
pubmed: 22099308
pmcid: 3222870
doi: 10.1016/j.molcel.2011.08.042
Keats, J. J., Reiman, T., Belch, A. R. & Pilarski, L. M. Ten years and counting: so what do we know about t(4;14)(p16;q32) multiple myeloma. Leuk. Lymphoma 47, 2289–2300 (2006).
pubmed: 17107900
doi: 10.1080/10428190600822128
Mahajan, N., Weber, J. D., Maggi, L. B. & Tomasson, M. H. ACA11, a small nucleolar RNA activated in multiple myeloma, stimulates proliferation by inactivating NRF2 and increasing redox signaling. FASEB J. 30, 1054.1057–1054.1057 (2016).
Mani, R.-S. et al. TMPRSS2–ERG-mediated feed-forward regulation of wild-type ERG in human prostate cancers. Cancer Res. 71, 5387–5392 (2011).
pubmed: 21676887
pmcid: 3156376
doi: 10.1158/0008-5472.CAN-11-0876
Adamo, P. & Ladomery, M. R. The oncogene ERG: a key factor in prostate cancer. Oncogene 35, 403–414 (2016).
pubmed: 25915839
doi: 10.1038/onc.2015.109
Semaan, L., Mander, N., Cher, M. L. & Chinni, S. R. TMPRSS2-ERG fusions confer efficacy of enzalutamide in an in vivo bone tumor growth model. BMC Cancer 19, 972 (2019).
pubmed: 31638934
pmcid: 6802314
doi: 10.1186/s12885-019-6185-0
Zimmermann, S. et al. ALPK1- and TIFA-dependent innate immune response triggered by the Helicobacter pylori type IV secretion system. Cell Rep. 20, 2384–2395 (2017).
pubmed: 28877472
doi: 10.1016/j.celrep.2017.08.039
Keats, J. J. et al. Overexpression of transcripts originating from the MMSET locus characterizes all t(4;14)(p16;q32)-positive multiple myeloma patients. Blood 105, 4060–4069 (2005).
pubmed: 15677557
pmcid: 1895072
doi: 10.1182/blood-2004-09-3704
Sims, D., Sudbery, I., Ilott, N. E., Heger, A. & Ponting, C. P. Sequencing depth and coverage: key considerations in genomic analyses. Nat. Rev. Genet. 15, 121–132 (2014).
pubmed: 24434847
doi: 10.1038/nrg3642
Kingma, D. P. & Ba, J. L. Adam: a method for stochastic optimization. 3rd Int. Conf. Learn. Represent. ICLR 2015 - Conf. Track Proc. 1–15 (2015).
Alipanahi, B., Delong, A., Weirauch, M. T. & Frey, B. J. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat. Biotechnol. 33, 831–838 (2015).
pubmed: 26213851
doi: 10.1038/nbt.3300
Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).
pubmed: 29608179
pmcid: 6700744
doi: 10.1038/nbt.4096
Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902.e1821 (2019).
pubmed: 31178118
Jin, Z. et al. Single cell gene fusion detection by scFusion. GitHub https://doi.org/10.5281/zenodo.5879110 (2022).