Using sigLASSO to optimize cancer mutation signatures jointly with sampling likelihood.


Journal

Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555

Publication information

Publication date:
17 07 2020
History:
received: 07 01 2020
accepted: 22 06 2020
entrez: 19 7 2020
pubmed: 19 7 2020
medline: 20 9 2020
Status: epublish

Abstract

Multiple mutational processes drive carcinogenesis, leaving characteristic signatures in tumor genomes. Determining the active signatures from a full repertoire of potential ones helps elucidate mechanisms of cancer development. This involves optimally decomposing the counts of cancer mutations, tabulated according to their trinucleotide context, into a linear combination of known signatures. Here, we develop sigLASSO (a software tool at github.com/gersteinlab/siglasso) to carry out this optimization efficiently. sigLASSO has four key aspects: (1) It jointly optimizes the likelihood of sampling and signature fitting, by explicitly factoring multinomial sampling into the objective function. This is particularly important when mutation counts are low and sampling variance is high (e.g., in exome sequencing). (2) sigLASSO uses L1 regularization to parsimoniously assign signatures, leading to sparse and interpretable solutions. (3) It fine-tunes model complexity, informed by data scale and biological priors. (4) Consequently, sigLASSO can assess model uncertainty and abstain from making assignments in low-confidence contexts.
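
The decomposition described above can be illustrated with a minimal sketch. It is not sigLASSO's actual algorithm: it omits the multinomial sampling likelihood that the abstract names as a key aspect and solves only a nonnegative, L1-penalized least-squares fit by proximal gradient descent. The function name `fit_exposures`, the penalty value `lam`, and the synthetic Dirichlet signatures are assumptions made for illustration.

```python
import numpy as np


def fit_exposures(S, p, lam=1e-4, n_iter=5000):
    """Nonnegative L1-penalized least squares (hypothetical helper).

    S : (contexts, signatures) matrix; each column is a signature summing to 1.
    p : (contexts,) observed mutation fractions (counts / total count).
    Minimizes 0.5 * ||S w - p||^2 + lam * sum(w) subject to w >= 0.
    """
    S = np.asarray(S, dtype=float)
    p = np.asarray(p, dtype=float)
    # Step size 1/L, where L (squared largest singular value of S)
    # is the Lipschitz constant of the least-squares gradient.
    step = 1.0 / np.linalg.norm(S, 2) ** 2
    w = np.zeros(S.shape[1])
    for _ in range(n_iter):
        grad = S.T @ (S @ w - p)
        # Gradient step, then the proximal map of lam * sum(w) on w >= 0:
        # shift by step * lam and clip at zero.
        w = np.maximum(0.0, w - step * (grad + lam))
    return w


# Synthetic example (assumed data, not from the paper):
# 5 random signatures over 96 trinucleotide contexts, two of them active.
rng = np.random.default_rng(0)
S = rng.dirichlet(np.ones(96), size=5).T      # shape (96, 5)
true_w = np.array([0.7, 0.3, 0.0, 0.0, 0.0])
counts = rng.multinomial(200, S @ true_w)     # low-count, exome-like sample
w_hat = fit_exposures(S, counts / counts.sum())
print(np.round(w_hat, 3))
```

With only 200 sampled mutations the estimated weights vary noticeably between draws, which is the low-count regime where the abstract argues that modeling the sampling step explicitly matters.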

Identifiers

pubmed: 32681003
doi: 10.1038/s41467-020-17388-x
pii: 10.1038/s41467-020-17388-x
pmc: PMC7368050

Publication types

Evaluation Study; Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

3575

Grants

Agency: NHGRI NIH HHS
ID: R01 HG008126
Country: United States
Agency: NICHD NIH HHS
ID: DP2 HD091799
Country: United States

Authors

Shantao Li (S)

Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA.
Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA.

Forrest W Crawford (FW)

Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.
Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA.
Yale School of Management, New Haven, CT, USA.
Department of Statistics and Data Science, Yale University, New Haven, CT, USA.

Mark B Gerstein (MB)

Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA. mark@gersteinlab.org.
Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA. mark@gersteinlab.org.
Department of Statistics and Data Science, Yale University, New Haven, CT, USA. mark@gersteinlab.org.
Department of Computer Science, Yale University, New Haven, CT, USA. mark@gersteinlab.org.
