Accuracy of mutational signature software on correlated signatures.


Journal

Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
10 January 2022
History:
received: 28 September 2021
accepted: 17 December 2021
entrez: 11 January 2022
pubmed: 12 January 2022
medline: 23 February 2022
Status: epublish

Abstract

Mutational signatures are characteristic patterns of mutations generated by exogenous mutagens or by endogenous mutational processes. Mutational signatures are important for research into DNA damage and repair, aging, cancer biology, genetic toxicology, and epidemiology. Unsupervised learning can infer mutational signatures from the somatic mutations in large numbers of tumors, and separating correlated signatures is a notable challenge for this task. To investigate which methods can best meet this challenge, we assessed 18 computational methods for inferring mutational signatures on 20 synthetic data sets that incorporated varying degrees of correlated activity of two common mutational signatures. Performance varied widely, and four methods noticeably outperformed the others: hdp (based on hierarchical Dirichlet processes), SigProExtractor (based on multiple non-negative matrix factorizations over resampled data), TCSM (based on an approach used in document topic analysis), and mutSpec.NMF (also based on non-negative matrix factorization). The results underscored the complexities of mutational signature extraction, including the importance and difficulty of determining the correct number of signatures and the importance of hyperparameters. Our findings indicate directions for improvement of the software and show a need for care when interpreting results from any of these methods, including the need for assessing sensitivity of the results to input parameters.
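As a rough illustration of the task described in the abstract, the sketch below simulates mutation counts from two signatures whose activities are correlated across tumors, then factorizes the counts with a plain multiplicative-update non-negative matrix factorization. It is not the procedure of any of the 18 assessed methods or of the study's synthetic data generator. The function names, the 96-channel layout, the Dirichlet-drawn signatures, and the log-normal exposure model are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_counts(rng, n_tumors=200, n_channels=96, correlation=0.8):
    """Simulate a (channels x tumors) count matrix from two signatures.

    Assumption: signatures are random probability vectors and exposures
    are log-normal with the given correlation on the log scale.
    """
    # Two hypothetical signatures, one per column, each summing to 1.
    signatures = rng.dirichlet(np.full(n_channels, 0.5), size=2).T
    # Correlated activities: bivariate normal on the log scale.
    cov = np.array([[1.0, correlation], [correlation, 1.0]])
    log_exposures = rng.multivariate_normal([6.0, 6.0], cov, size=n_tumors)
    exposures = np.exp(log_exposures).T  # shape (2, n_tumors)
    # Poisson sampling noise around the expected counts.
    counts = rng.poisson(signatures @ exposures)
    return counts, signatures, exposures


def nmf(counts, k, n_iter=500, seed=0, eps=1e-10):
    """Multiplicative-update NMF minimizing squared (Frobenius) error."""
    rng = np.random.default_rng(seed)
    V = counts.astype(float)
    W = rng.random((V.shape[0], k)) + eps  # candidate signatures
    H = rng.random((k, V.shape[1])) + eps  # candidate exposures
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    # Normalize each signature to sum to 1 and move the scale into H.
    scale = W.sum(axis=0)
    return W / scale, H * scale[:, None]


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    counts, true_sigs, _ = simulate_counts(rng, correlation=0.9)
    W, H = nmf(counts, k=2)
    # Compare every extracted signature with every true signature.
    for i in range(2):
        sims = [cosine_similarity(W[:, i], true_sigs[:, j]) for j in range(2)]
        print(f"extracted {i}: cosine to true signatures = {sims}")
```

Raising `correlation` toward 1 makes the two exposure rows nearly proportional, which is the setting in which separating the signatures is expected to become harder. Note that the number of signatures `k` is supplied by hand here, whereas the abstract identifies choosing it as one of the main difficulties.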

Identifiers

pubmed: 35013428
doi: 10.1038/s41598-021-04207-6
pii: 10.1038/s41598-021-04207-6
pmc: PMC8748538

Chemical substances

Biomarkers, Tumor 0

Publication types

Comparative Study; Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

390

Grants

Agency: Singapore National Medical Research Council
ID: NMRC/CIRG/1422/2015

Copyright information

© 2022. The Author(s).

Authors

Yang Wu (Y)

Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, Singapore, 169857, Singapore.
Centre for Computational Biology, Duke-NUS Medical School, Singapore, 169857, Singapore.

Ellora Hui Zhen Chua (EHZ)

Department of Biological Sciences, National University of Singapore, Singapore, 117558, Singapore.

Alvin Wei Tian Ng (AWT)

Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, Singapore, 169857, Singapore.
Centre for Computational Biology, Duke-NUS Medical School, Singapore, 169857, Singapore.

Arnoud Boot (A)

Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, Singapore, 169857, Singapore.
Centre for Computational Biology, Duke-NUS Medical School, Singapore, 169857, Singapore.

Steven G Rozen (SG)

Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, Singapore, 169857, Singapore. steverozen@gmail.com.
Centre for Computational Biology, Duke-NUS Medical School, Singapore, 169857, Singapore. steverozen@gmail.com.
