Accurate Ensemble Prediction of Somatic Mutations with SMuRF2.

Bioinformatics tools; Cancer genomics; Next-generation sequencing; Somatic mutation calling; Supervised machine-learning

Journal

Methods in molecular biology (Clifton, N.J.)
ISSN: 1940-6029
Abbreviated title: Methods Mol Biol
Country: United States
NLM ID: 9214969

Publication information

Publication date:
2022
History:
entrez: 2022-06-25
pubmed: 2022-06-26
medline: 2022-06-29
Status: ppublish

Abstract

Accurate identification of somatic mutations is crucial for discovery and identification of driver mutations in cancer tumors. Here, we describe the updated Somatic Mutation calling method using a Random Forest (SMuRF2), an ensemble method that combines the predictions and auxiliary features from individual mutation callers using supervised machine learning. SMuRF2 provides an efficient workflow to predict both somatic point mutations (SNVs) and small insertions/deletions (indels) in cancer genomes and exomes. We describe the latest method and provide a detailed tutorial for running SMuRF2.

Identifiers

pubmed: 35751808
doi: 10.1007/978-1-0716-2293-3_4

Chemical substances

SMURF2 protein, human EC 2.3.2.26
Ubiquitin-Protein Ligases EC 2.3.2.27

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

53-66

Copyright information

© 2022. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.

References

Hanahan D, Weinberg RA (2011) Hallmarks of cancer: the next generation. Cell 144(5):646–674. https://doi.org/10.1016/j.cell.2011.02.013
doi: 10.1016/j.cell.2011.02.013 pubmed: 21376230
Huang W, Guo YA, Muthukumar K, Baruah P, Chang MM, Skanderup AJ (2019) SMuRF: portable and accurate ensemble prediction of somatic mutations. Bioinformatics (Oxford, England) 35(17):3157–3159. https://doi.org/10.1093/bioinformatics/btz018
doi: 10.1093/bioinformatics/btz018
Cibulskis K, Lawrence MS, Carter SL, Sivachenko A, Jaffe D, Sougnez C, Gabriel S, Meyerson M, Lander ES, Getz G (2013) Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnol 31:213. https://doi.org/10.1038/nbt.2514
doi: 10.1038/nbt.2514 pubmed: 23396013 pmcid: 3833702
Lai Z, Markovets A, Ahdesmaki M, Chapman B, Hofmann O, McEwen R, Johnson J, Dougherty B, Barrett JC, Dry JR (2016) VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research. Nucleic Acids Res 44(11):e108. https://doi.org/10.1093/nar/gkw227
doi: 10.1093/nar/gkw227 pubmed: 27060149 pmcid: 4914105
Koboldt DC, Zhang Q, Larson DE, Shen D, McLellan MD, Lin L, Miller CA, Mardis ER, Ding L, Wilson RK (2012) VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res 22(3):568–576. https://doi.org/10.1101/gr.129684.111
doi: 10.1101/gr.129684.111 pubmed: 22300766 pmcid: 3290792
Huang W, Guo YA, Chang MM, Skanderup AJ (2020) Ensemble-based somatic mutation calling in cancer genomes. In: Boegel S (ed) Bioinformatics for cancer immunotherapy: methods and protocols. Springer US, New York, NY, pp 37–46. https://doi.org/10.1007/978-1-0716-0327-7_3
doi: 10.1007/978-1-0716-0327-7_3
Kim S, Scheffler K, Halpern AL, Bekritsky MA, Noh E, Källberg M, Chen X, Kim Y, Beyter D, Krusche P, Saunders CT (2018) Strelka2: fast and accurate calling of germline and somatic variants. Nat Methods 15(8):591–594. https://doi.org/10.1038/s41592-018-0051-x
doi: 10.1038/s41592-018-0051-x pubmed: 30013048
Ewing AD, Houlahan KE, Hu Y, Ellrott K, Caloian C, Yamaguchi TN, Bare JC, P’ng C, Waggott D, Sabelnykova VY, ICGC-TCGA DREAM Somatic Mutation Calling Challenge participants, Xi L, Dewal N, Fan Y, Wang W, Wheeler D, Wilm A, Ting GH, Li C, Bertrand D, Nagarajan N, Chen Q-R, Hsu C-H, Hu Y, Yan C, Kibbe W, Meerzaman D, Cibulskis K, Rosenberg M, Bergelson L, Kiezun A, Radenbaugh A, Sertier A-S, Ferrari A, Tonton L, Bhutani K, Hansen NF, Wang D, Song L, Lai Z, Liao Y, Shi W, Carbonell-Caballero J, Dopazo J, CCK L, Guinney J, Kellen MR, Norman TC, Haussler D, Friend SH, Stolovitzky G, Margolin AA, Stuart JM, Boutros PC (2015) Combining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection. Nat Methods 12:623. https://doi.org/10.1038/nmeth.3407
doi: 10.1038/nmeth.3407 pubmed: 25984700 pmcid: 4856034
Cingolani P, Platts A, Wang le L, Coon M, Nguyen T, Wang L, Land SJ, Lu X, Ruden DM (2012) A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly 6(2):80–92. https://doi.org/10.4161/fly.19695
doi: 10.4161/fly.19695 pubmed: 22728672 pmcid: 3679285

Authors

Weitai Huang (W)

Laboratory of Computational Cancer Genomics, Genome Institute of Singapore, A*STAR (Agency for Science, Technology and Research), Singapore, Singapore. huangwt@gis.a-star.edu.sg.

Ngak Leng Sim (NL)

Laboratory of Computational Cancer Genomics, Genome Institute of Singapore, A*STAR (Agency for Science, Technology and Research), Singapore, Singapore.

Anders J Skanderup (AJ)

Laboratory of Computational Cancer Genomics, Genome Institute of Singapore, A*STAR (Agency for Science, Technology and Research), Singapore, Singapore.
