De novo detection of somatic mutations in high-throughput single-cell profiling data sets.
Journal
Nature biotechnology
ISSN: 1546-1696
Abbreviated title: Nat Biotechnol
Country: United States
NLM ID: 9604648
Publication information
Publication date:
06 Jul 2023
History:
received: 23 Nov 2022
accepted: 07 Jun 2023
medline: 7 Jul 2023
pubmed: 7 Jul 2023
entrez: 6 Jul 2023
Status:
aheadofprint
Abstract
Characterization of somatic mutations at single-cell resolution is essential to study cancer evolution, clonal mosaicism and cell plasticity. Here, we describe SComatic, an algorithm designed for the detection of somatic mutations in single-cell transcriptomic and ATAC-seq (assay for transposase-accessible chromatin sequence) data sets directly without requiring matched bulk or single-cell DNA sequencing data. SComatic distinguishes somatic mutations from polymorphisms, RNA-editing events and artefacts using filters and statistical tests parameterized on non-neoplastic samples. Using >2.6 million single cells from 688 single-cell RNA-seq (scRNA-seq) and single-cell ATAC-seq (scATAC-seq) data sets spanning cancer and non-neoplastic samples, we show that SComatic detects mutations in single cells accurately, even in differentiated cells from polyclonal tissues that are not amenable to mutation detection using existing methods. Validated against matched genome sequencing and scRNA-seq data, SComatic achieves F1 scores between 0.6 and 0.7 across diverse data sets, in comparison to 0.2-0.4 for the second-best performing method. In summary, SComatic permits de novo mutational signature analysis, and the study of clonal heterogeneity and mutational burdens at single-cell resolution.
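The F1 scores quoted in the abstract combine precision and recall of the mutation calls against a validation set. The following is a minimal sketch of that metric only, not the authors' benchmarking code; the function name `f1_score` and the toy variant tuples are hypothetical and chosen purely for illustration.

```python
def f1_score(called, truth):
    """Return (precision, recall, F1) for a set of calls against a truth set."""
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data (not from the paper): (chromosome, position, ref, alt).
truth = {
    ("chr1", 100, "C", "T"),
    ("chr1", 250, "G", "A"),
    ("chr2", 40, "A", "G"),
    ("chr3", 7, "T", "C"),
}
called = {
    ("chr1", 100, "C", "T"),
    ("chr2", 40, "A", "G"),
    ("chr5", 900, "G", "T"),
}

precision, recall, f1 = f1_score(called, truth)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

In this toy case two of three calls are validated and two of four true mutations are recovered, so an F1 in the 0.6 to 0.7 range reported for SComatic would correspond to a somewhat better balance of precision and recall than shown here.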
Identifiers
pubmed: 37414936
doi: 10.1038/s41587-023-01863-z
pii: 10.1038/s41587-023-01863-z
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Agency: NHLBI NIH HHS
ID: R01 HL158269
Country: United States
Copyright information
© 2023. The Author(s).
References
Neftel, C. et al. An integrative model of cellular states, plasticity, and genetics for glioblastoma. Cell 178, 835–849.e21 (2019).
pubmed: 31327527
pmcid: 6703186
doi: 10.1016/j.cell.2019.06.024
Kakiuchi, N. & Ogawa, S. Clonal expansion in non-cancer tissues. Nat. Rev. Cancer 21, 239–256 (2021).
pubmed: 33627798
doi: 10.1038/s41568-021-00335-3
Nam, A. S., Chaligne, R. & Landau, D. A. Integrating genetic and non-genetic determinants of cancer evolution by single-cell multi-omics. Nat. Rev. Genet. 22, 3–18 (2021).
pubmed: 32807900
doi: 10.1038/s41576-020-0265-5
Lim, B., Lin, Y. & Navin, N. Advancing cancer research and medicine with single-cell genomics. Cancer Cell 37, 456–470 (2020).
pubmed: 32289270
pmcid: 7899145
doi: 10.1016/j.ccell.2020.03.008
Gawad, C., Koh, W. & Quake, S. R. Single-cell genome sequencing: current state of the science. Nat. Rev. Genet. 17, 175–188 (2016).
pubmed: 26806412
doi: 10.1038/nrg.2015.16
Lee-Six, H. et al. Population dynamics of normal human blood inferred from somatic mutations. Nature 561, 473–478 (2018).
pubmed: 30185910
pmcid: 6163040
doi: 10.1038/s41586-018-0497-0
Moore, L. et al. The mutational landscape of human somatic and germline cells. Nature 597, 381–386 (2021).
pubmed: 34433962
doi: 10.1038/s41586-021-03822-7
Van Egeren, D. et al. Reconstructing the lineage histories and differentiation trajectories of individual cancer cells in myeloproliferative neoplasms. Cell Stem Cell 28, 514–523.e9 (2021).
pubmed: 33621486
pmcid: 7939520
doi: 10.1016/j.stem.2021.02.001
Zhang, C.-Z. et al. Calibrating genomic and allelic coverage bias in single-cell sequencing. Nat. Commun. 6, 6822 (2015).
pubmed: 25879913
doi: 10.1038/ncomms7822
Xing, D., Tan, L., Chang, C.-H., Li, H. & Xie, X. S. Accurate SNV detection in single cells by transposon-based whole-genome amplification of complementary strands. Proc. Natl Acad. Sci. USA 118, e2013106118 (2021).
pubmed: 33593904
pmcid: 7923680
doi: 10.1073/pnas.2013106118
Abascal, F. et al. Somatic mutation landscapes at single-molecule resolution. Nature 593, 405–410 (2021).
pubmed: 33911282
doi: 10.1038/s41586-021-03477-4
van Galen, P. et al. Single-cell RNA-seq reveals AML hierarchies relevant to disease progression and immunity. Cell 176, 1265–1281.e24 (2019).
pubmed: 30827681
pmcid: 6515904
doi: 10.1016/j.cell.2019.01.031
Macaulay, I. C. et al. G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat. Methods 12, 519–522 (2015).
pubmed: 25915121
doi: 10.1038/nmeth.3370
Nam, A. S. et al. Somatic mutations and cell identity linked by Genotyping of Transcriptomes. Nature 571, 355–360 (2019).
pubmed: 31270458
pmcid: 6782071
doi: 10.1038/s41586-019-1367-0
Reuter, J. A., Spacek, D. V., Pai, R. K. & Snyder, M. P. Simul-seq: combined DNA and RNA sequencing for whole-genome and transcriptome profiling. Nat. Methods 13, 953–958 (2016).
pubmed: 27723755
pmcid: 5734913
doi: 10.1038/nmeth.4028
Fan, J. et al. Linking transcriptional and genetic tumor heterogeneity through allele analysis of single-cell RNA-seq data. Genome Res. 28, 1217–1227 (2018).
pubmed: 29898899
pmcid: 6071640
doi: 10.1101/gr.228080.117
Petti, A. A. et al. A general approach for detecting expressed mutations in AML cells using single cell RNA-sequencing. Nat. Commun. 10, 3660 (2019).
pubmed: 31413257
pmcid: 6694122
doi: 10.1038/s41467-019-11591-1
Kharchenko, P. V., Silberstein, L. & Scadden, D. T. Bayesian approach to single-cell differential expression analysis. Nat. Methods 11, 740–742 (2014).
pubmed: 24836921
pmcid: 4112276
doi: 10.1038/nmeth.2967
Huang, A. Y. et al. Parallel RNA and DNA analysis after deep sequencing (PRDD-seq) reveals cell type-specific lineage patterns in human brain. Proc. Natl Acad. Sci. USA 117, 13886–13895 (2020).
pubmed: 32522880
pmcid: 7322034
doi: 10.1073/pnas.2006163117
McCarthy, D. J. et al. Cardelino: computational integration of somatic clonal substructure and single-cell transcriptomes. Nat. Methods 17, 414–421 (2020).
pubmed: 32203388
doi: 10.1038/s41592-020-0766-3
Liu, F. et al. Systematic comparative analysis of single-nucleotide variant detection methods from single-cell RNA sequencing data. Genome Biol. 20, 242 (2019).
pubmed: 31744515
pmcid: 6862814
doi: 10.1186/s13059-019-1863-4
Bizzotto, S. et al. Landmarks of human embryonic development inscribed in somatic mutations. Science 371, 1249–1253 (2021).
pubmed: 33737485
pmcid: 8170505
doi: 10.1126/science.abe1544
Coorens, T. H. H. et al. Extensive phylogenies of human development inferred from somatic mutations. Nature 597, 387–392 (2021).
pubmed: 34433963
doi: 10.1038/s41586-021-03790-y
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
pubmed: 32461654
pmcid: 7334197
doi: 10.1038/s41586-020-2308-7
Ji, A. L. et al. Multimodal analysis of composition and spatial architecture in human squamous cell carcinoma. Cell 182, 497–514.e22 (2020).
pubmed: 32579974
pmcid: 7391009
doi: 10.1016/j.cell.2020.05.039
Martincorena, I. et al. High burden and pervasive positive selection of somatic mutations in normal human skin. Science 348, 880–886 (2015).
pubmed: 25999502
pmcid: 4471149
doi: 10.1126/science.aaa6806
Reble, E., Castellani, C. A., Melka, M. G., O’Reilly, R. & Singh, S. M. VarScan2 analysis of de novo variants in monozygotic twins discordant for schizophrenia. Psychiatr. Genet. 27, 62–70 (2017).
pubmed: 28125460
doi: 10.1097/YPG.0000000000000162
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
pubmed: 19451168
pmcid: 2705234
doi: 10.1093/bioinformatics/btp324
Kim, S. et al. Strelka2: fast and accurate calling of germline and somatic variants. Nat. Methods 15, 591–594 (2018).
pubmed: 30013048
doi: 10.1038/s41592-018-0051-x
Zafar, H., Wang, Y., Nakhleh, L., Navin, N. & Chen, K. Monovar: single-nucleotide variant detection in single cells. Nat. Methods 13, 505–507 (2016).
pubmed: 27088313
pmcid: 4887298
doi: 10.1038/nmeth.3835
Prashant, N. M. et al. SCReadCounts: estimation of cell-level SNVs expression from scRNA-seq data. BMC Genomics 22, 689 (2021).
pubmed: 34551708
pmcid: 8459565
doi: 10.1186/s12864-021-07974-8
Vázquez-García, I. et al. Ovarian cancer mutational processes drive site-specific immune evasion. Nature 612, 778–786 (2022).
pubmed: 36517593
pmcid: 9771812
doi: 10.1038/s41586-022-05496-1
Li, R. et al. Mapping single-cell transcriptomes in the intra-tumoral and associated territories of kidney cancer. Cancer Cell 40, 1583–1599.e10 (2022).
pubmed: 36423636
pmcid: 9767677
doi: 10.1016/j.ccell.2022.11.001
Nik-Zainal, S. et al. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature 534, 47–54 (2016).
pubmed: 27135926
pmcid: 4910866
doi: 10.1038/nature17676
Gulhan, D. C., Lee, J. J.-K., Melloni, G. E. M., Cortés-Ciriano, I. & Park, P. J. Detecting the mutational signature of homologous recombination deficiency in clinical samples. Nat. Genet. 51, 912–919 (2019).
pubmed: 30988514
doi: 10.1038/s41588-019-0390-2
Pelka, K. et al. Spatially organized multicellular immune hubs in human colorectal cancer. Cell 184, 4734–4752.e20 (2021).
pubmed: 34450029
pmcid: 8772395
doi: 10.1016/j.cell.2021.08.003
Lee, H.-O. et al. Lineage-dependent gene expression programs influence the immune landscape of colorectal cancer. Nat. Genet. 52, 594–603 (2020).
pubmed: 32451460
doi: 10.1038/s41588-020-0636-z
Cortes-Ciriano, I., Lee, S., Park, W.-Y., Kim, T.-M. & Park, P. J. A molecular portrait of microsatellite instability across multiple cancers. Nat. Commun. 8, 15180 (2017).
pubmed: 28585546
pmcid: 5467167
doi: 10.1038/ncomms15180
Bailey, M. H. et al. Comprehensive characterization of cancer driver genes and mutations. Cell 174, 1034–1035 (2018).
pubmed: 30096302
pmcid: 8045146
doi: 10.1016/j.cell.2018.07.034
Haradhvala, N. J. et al. Distinct mutational signatures characterize concurrent loss of polymerase proofreading and mismatch repair. Nat. Commun. 9, 1746 (2018).
pubmed: 29717118
pmcid: 5931517
doi: 10.1038/s41467-018-04002-4
Alexandrov, L. B. et al. The repertoire of mutational signatures in human cancer. Nature 578, 94–101 (2020).
pubmed: 32025018
pmcid: 7054213
doi: 10.1038/s41586-020-1943-3
Osorio, F. G. et al. Somatic mutations reveal lineage relationships and age-related mutagenesis in human hematopoiesis. Cell Rep. 25, 2308–2316.e4 (2018).
pubmed: 30485801
pmcid: 6289083
doi: 10.1016/j.celrep.2018.11.014
Williams, N. et al. Life histories of myeloproliferative neoplasms inferred from phylogenies. Nature 602, 162–168 (2022).
pubmed: 35058638
doi: 10.1038/s41586-021-04312-6
Litviňuková, M. et al. Cells of the adult human heart. Nature 588, 466–472 (2020).
pubmed: 32971526
pmcid: 7681775
doi: 10.1038/s41586-020-2797-4
Choudhury, S. et al. Somatic mutations in single human cardiomyocytes reveal age-associated DNA damage and widespread oxidative genotoxicity. Nat. Aging 2, 714–725 (2022).
pubmed: 36051457
pmcid: 9432807
doi: 10.1038/s43587-022-00261-5
Eraslan, G. et al. Single-nucleus cross-tissue molecular reference maps toward understanding disease gene function. Science 376, eabl4290 (2022).
pubmed: 35549429
pmcid: 9383269
doi: 10.1126/science.abl4290
Zhang, K. et al. A single-cell atlas of chromatin accessibility in the human genome. Cell 184, 5985–6001.e19 (2021).
pubmed: 34774128
pmcid: 8664161
doi: 10.1016/j.cell.2021.10.024
Ng, S. W. K. et al. Convergent somatic mutations in metabolism genes in chronic liver disease. Nature 598, 473–478 (2021).
pubmed: 34646017
doi: 10.1038/s41586-021-03974-6
Gao, T. et al. Haplotype-aware analysis of somatic copy number variations from single-cell transcriptomes. Nat. Biotechnol. 41, 417–426 (2023).
pubmed: 36163550
doi: 10.1038/s41587-022-01468-y
Van Egeren, D. et al. Transcriptional differences between JAK2-V617F and wild-type bone marrow cells in patients with myeloproliferative neoplasms. Exp. Hematol. 107, 14–19 (2022).
pubmed: 34921959
doi: 10.1016/j.exphem.2021.12.364
Regev, A. et al. The Human Cell Atlas. eLife 6, e27041 (2017).
pubmed: 29206104
pmcid: 5762154
doi: 10.7554/eLife.27041
Rozenblatt-Rosen, O. et al. The Human Tumor Atlas Network: charting tumor transitions across space and time at single-cell resolution. Cell 181, 236–249 (2020).
pubmed: 32302568
pmcid: 7376497
doi: 10.1016/j.cell.2020.03.053
Zheng, G. X. Y. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
pubmed: 28091601
pmcid: 5241818
doi: 10.1038/ncomms14049
Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at arXiv https://doi.org/10.48550/arXiv.1303.3997 (2013).
Van der Auwera, G. A. & O’Connor, B. D. Genomics in the cloud: using Docker, GATK, and WDL in Terra (O’Reilly Media, 2020).
Muyas, F., Zapata, L., Guigó, R. & Ossowski, S. The rate and spectrum of mosaic mutations during embryogenesis revealed by RNA sequencing of 49 tissues. Genome Med. 12, 49 (2020).
pubmed: 32460841
pmcid: 7254727
doi: 10.1186/s13073-020-00746-1
Bonfield, J. K. et al. HTSlib: C library for reading/writing high-throughput sequencing data. Gigascience 10, giab007 (2021).
https://github.com/pysam-developers/pysam
Lo Giudice, C., Tangaro, M. A., Pesole, G. & Picardi, E. Investigating RNA editing in deep transcriptome datasets with REDItools and REDIportal. Nat. Protoc. 15, 1098–1131 (2020).
pubmed: 31996844
doi: 10.1038/s41596-019-0279-7
Kiran, A. & Baranov, P. V. DARNED: a DAtabase of RNa EDiting in humans. Bioinformatics 26, 1772–1776 (2010).
pubmed: 20547637
doi: 10.1093/bioinformatics/btq285
Nakamura, K. et al. Sequence-specific error profile of Illumina sequencers. Nucleic Acids Res. 39, e90 (2011).
pubmed: 21576222
pmcid: 3141275
doi: 10.1093/nar/gkr344
Blokzijl, F., Janssen, R., van Boxtel, R. & Cuppen, E. MutationalPatterns: comprehensive genome-wide analysis of mutational processes. Genome Med. 10, 33 (2018).
pubmed: 29695279
pmcid: 5922316
doi: 10.1186/s13073-018-0539-0
DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
pubmed: 21478889
pmcid: 3083463
doi: 10.1038/ng.806
Fan, Y. et al. MuSE: accounting for tumor heterogeneity using a sample-specific error model improves sensitivity and specificity in mutation calling from sequencing data. Genome Biol. 17, 178 (2016).
pubmed: 27557938
pmcid: 4995747
doi: 10.1186/s13059-016-1029-6
Dentro, S. C. et al. Characterizing genetic intra-tumor heterogeneity across 2,658 human cancer genomes. Cell 184, 2239–2254.e39 (2021).
pubmed: 33831375
pmcid: 8054914
doi: 10.1016/j.cell.2021.03.009
Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
pubmed: 19505943
pmcid: 2723002
doi: 10.1093/bioinformatics/btp352
Koboldt, D. C. et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 22, 568–576 (2012).
pubmed: 22300766
pmcid: 3290792
doi: 10.1101/gr.129684.111
Karczewski, K. J. et al. The ExAC browser: displaying reference data information from over 60 000 exomes. Nucleic Acids Res. 45, D840–D845 (2017).
pubmed: 27899611
doi: 10.1093/nar/gkw971
Huang, X. & Huang, Y. Cellsnp-lite: an efficient tool for genotyping single cells. Bioinformatics 37, 4569–4571 (2021).
pubmed: 33963851
doi: 10.1093/bioinformatics/btab358
Loh, P.-R. et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat. Genet. 48, 1443–1448 (2016).
pubmed: 27694958
pmcid: 5096458
doi: 10.1038/ng.3679
Sondka, Z. et al. The COSMIC Cancer Gene Census: describing genetic dysfunction across all human cancers. Nat. Rev. Cancer 18, 696–705 (2018).
pubmed: 30293088
pmcid: 6450507
doi: 10.1038/s41568-018-0060-1
Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).
pubmed: 20601685
pmcid: 2938201
doi: 10.1093/nar/gkq603