The frequency of somatic mutations in cancer predicts the phenotypic relevance of germline mutations.

Keywords: cancer; gene prioritization; human disease; machine learning; mutations

Journal

Frontiers in genetics
ISSN: 1664-8021
Abbreviated title: Front Genet
Country: Switzerland
NLM ID: 101560621

Publication information

Publication date:
2022
History:
received: 15 09 2022
accepted: 28 12 2022
entrez: 26 1 2023
pubmed: 27 1 2023
medline: 27 1 2023
Status: epublish

Abstract

Genomic sequence mutations can be pathogenic in both germline and somatic cells. Several authors have observed that the same genes are often involved in cancer when mutated in somatic cells and in genetic diseases when mutated in the germline. Recent advances in high-throughput sequencing have provided large databases of both types of mutations, making it possible to investigate this issue systematically. We therefore applied a machine-learning-based framework to this problem, comparing multiple models. The models achieved significant predictive power, as shown both by cross-validation and by their application to recently discovered gene/phenotype associations not used for training. We found that genes characterized by a high frequency of somatic mutations in the most common cancers and by ancient evolutionary age are the most likely to be involved in abnormal phenotypes and diseases. These results suggest that the combination of tolerance for mutations at the cell viability level (measured by the frequency of somatic mutations in cancer) and functional relevance (demonstrated by evolutionary conservation) is the main predictor of disease genes. Our results thus confirm the deep relationship between pathogenic mutations in somatic and germline cells, provide new insight into the common origin of cancer and genetic diseases, and can be used to improve the identification of new disease genes.
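To make the cross-validation idea in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline: the abstract does not name the models, features, or datasets used, so the logistic regression, the two features (`somatic_freq`, `gene_age`), the effect sizes, and the synthetic gene table are all assumptions chosen purely for illustration.

```python
import math
import random


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Logistic regression by batch gradient descent; w[0] is the bias."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict(w, X):
    """Linear scores (log-odds); sufficient for ranking genes."""
    return [w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)) for xi in X]


def cross_validated_auc(X, y, columns, k=5, seed=0):
    """Mean held-out AUC over k folds, using only the given feature columns."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    aucs = []
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        X_train = [[X[i][c] for c in columns] for i in train]
        X_test = [[X[i][c] for c in columns] for i in fold]
        w = fit_logistic(X_train, [y[i] for i in train])
        aucs.append(auc(predict(w, X_test), [y[i] for i in fold]))
    return sum(aucs) / k


def make_synthetic_genes(n=300, seed=1):
    """Hypothetical data: standardized somatic mutation frequency and gene age.

    The coefficients 1.2 and 0.8 are arbitrary illustration values,
    not estimates from the paper.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        somatic_freq = rng.gauss(0, 1)
        gene_age = rng.gauss(0, 1)
        z = 1.2 * somatic_freq + 0.8 * gene_age
        y.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-z)) else 0)
        X.append([somatic_freq, gene_age])
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_genes()
    for name, cols in [("somatic_freq", [0]), ("gene_age", [1]), ("both", [0, 1])]:
        print(name, round(cross_validated_auc(X, y, cols), 3))
```

Each candidate feature set is scored only on genes held out from training, which is the property that lets cross-validated performance stand in for performance on associations discovered later.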

Identifiers

pubmed: 36699457
doi: 10.3389/fgene.2022.1045301
pii: 1045301
pmc: PMC9868957

Publication types

Journal Article

Languages

eng

Pagination

1045301

Copyright information

Copyright © 2023 Draetta, Lazarević, Provero and Cittaro.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Authors

Edoardo Luigi Draetta (EL)

University of Milan, Milan, Italy.
Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan, Italy.

Dejan Lazarević (D)

Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan, Italy.

Paolo Provero (P)

Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan, Italy.
Department of Neurosciences "Rita Levi Montalcini", University of Turin, Turin, Italy.

Davide Cittaro (D)

Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan, Italy.
