Predicting the impact of rare variants on RNA splicing in CAGI6.
Journal
Human genetics
ISSN: 1432-1203
Abbreviated title: Hum Genet
Country: Germany
NLM ID: 7613873
Publication information
Publication date:
03 Jan 2024
History:
received: 15 Jun 2023
accepted: 18 Nov 2023
medline: 4 Jan 2024
pubmed: 4 Jan 2024
entrez: 3 Jan 2024
Status:
aheadofprint
Abstract
Variants which disrupt splicing are a frequent cause of rare disease that have been under-ascertained clinically. Accurate and efficient methods to predict a variant's impact on splicing are needed to interpret the growing number of variants of unknown significance (VUS) identified by exome and genome sequencing. Here, we present the results of the CAGI6 Splicing VUS challenge, which invited predictions of the splicing impact of 56 variants ascertained clinically and functionally validated to determine splicing impact. The performance of 12 prediction methods, along with SpliceAI and CADD, was compared on the 56 functionally validated variants. The maximum accuracy achieved was 82% from two different approaches, one weighting SpliceAI scores by minor allele frequency, and one applying the recently published Splicing Prediction Pipeline (SPiP). SPiP performed optimally in terms of sensitivity, while an ensemble method combining multiple prediction tools and information from databases exceeded all others for specificity. Several challenge methods equalled or exceeded the performance of SpliceAI, with ultimate choice of prediction method likely to depend on experimental or clinical aims. One quarter of the variants were incorrectly predicted by at least 50% of the methods, highlighting the need for further improvements to splicing prediction methods for successful clinical application.
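The abstract compares methods by accuracy, sensitivity, and specificity, and reports the share of variants that at least half of the methods predicted incorrectly. As a minimal sketch of how such figures could be computed from binary calls (not the authors' code; the function names and the toy data below are hypothetical and are not taken from the challenge):

```python
from typing import Mapping, Sequence


def confusion_metrics(truth: Sequence[bool], predicted: Sequence[bool]) -> dict[str, float]:
    """Accuracy, sensitivity and specificity for binary splice-disruption calls."""
    if len(truth) != len(predicted) or len(truth) == 0:
        raise ValueError("truth and predicted must be non-empty and of equal length")
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # disrupting, called disrupting
    tn = sum(1 for t, p in pairs if not t and not p)  # neutral, called neutral
    fp = sum(1 for t, p in pairs if not t and p)      # neutral, called disrupting
    fn = sum(1 for t, p in pairs if t and not p)      # disrupting, called neutral
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def fraction_hard_variants(
    truth: Sequence[bool],
    predictions_by_method: Mapping[str, Sequence[bool]],
    min_fraction_wrong: float = 0.5,
) -> float:
    """Fraction of variants miscalled by at least `min_fraction_wrong` of the methods."""
    methods = list(predictions_by_method.values())
    if not methods or len(truth) == 0:
        raise ValueError("need at least one method and one variant")
    hard = 0
    for i, t in enumerate(truth):
        wrong = sum(1 for calls in methods if calls[i] != t)
        if wrong / len(methods) >= min_fraction_wrong:
            hard += 1
    return hard / len(truth)


# Toy data (hypothetical): True means the variant disrupts splicing.
truth = [True, True, False, False]
calls = {
    "method_a": [True, False, False, False],
    "method_b": [True, False, False, True],
}

for name, predicted in calls.items():
    print(name, confusion_metrics(truth, predicted))
print("hard fraction:", fraction_hard_variants(truth, calls))
```

Tools that output continuous scores would first need a decision threshold to produce binary calls; the choice of threshold is outside this sketch.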
Identifiers
pubmed: 38170232
doi: 10.1007/s00439-023-02624-3
pii: 10.1007/s00439-023-02624-3
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Agency: NIHR
ID: RP-2016-07-011
Agency: New South Wales Health
ID: Cardiovascular Disease Senior Scientist Grant
Agency: University of Southampton
ID: Anniversary Fellowship
Copyright information
© 2024. The Author(s).
References
Cheng J et al (2019) MMSplice: modular modeling improves the predictions of genetic variant effects on splicing. Genome Biol 20(1):48
doi: 10.1186/s13059-019-1653-z
pubmed: 30823901
pmcid: 6396468
Danis D, Jacobsen JOB, Carmody LC, Gargano MA, McMurry JA, Hegde A, Haendel MA, Valentini G, Smedley D, Robinson PN (2021) Interpretable prioritization of splice variants in diagnostic next-generation sequencing. Am J Hum Genet 108(9):1564–1577
doi: 10.1016/j.ajhg.2021.06.014
pubmed: 34289339
pmcid: 8456162
Ha C, Kim JW, Jang JH (2021) Performance evaluation of SpliceAI for the prediction of splicing of NF1 variants. Genes (Basel) 12:1308
doi: 10.3390/genes12091308
pubmed: 34573290
Jagadeesh KA et al (2019) S-CAP extends pathogenicity prediction to genetic variants that affect RNA splicing. Nat Genet 51(4):755–763
doi: 10.1038/s41588-019-0348-4
pubmed: 30804562
Jaganathan K, Kyriazopoulou Panagiotopoulou S, McRae JF, Darbandi SF, Knowles D, Li YI, Kosmicki JA, Arbelaez J, Cui W, Schwartz GB et al (2019) Predicting splicing from primary sequence with deep learning. Cell 176(3):535–548
doi: 10.1016/j.cell.2018.12.015
pubmed: 30661751
Jian X, Boerwinkle E, Liu X (2014) In silico prediction of splice-altering single nucleotide variants in the human genome. Nucleic Acids Res 42(22):13534–13544
doi: 10.1093/nar/gku1206
pubmed: 25416802
pmcid: 4267638
Karczewski KJ et al (2020) The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581(7809):434–443
doi: 10.1038/s41586-020-2308-7
pubmed: 32461654
pmcid: 7334197
Kircher M, Witten DM, Jain P, O’Roak BJ, Cooper GM, Shendure J (2014) A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet 46(3):310–315
doi: 10.1038/ng.2892
pubmed: 24487276
pmcid: 3992975
Krawczak M, Reiss J, Cooper DN (1992) The mutational spectrum of single base-pair substitutions in mRNA splice junctions of human genes: causes and consequences. Hum Genet 90:41–54
doi: 10.1007/BF00210743
pubmed: 1427786
Landrum MJ, Lee JM, Benson M, Brown GR, Chao C, Chitipiralla S, Gu B, Hart J, Hoffman D, Jang W et al (2018) ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res 46(D1):D1062–D1067
doi: 10.1093/nar/gkx1153
pubmed: 29165669
Leman R, Parfait B, Vidaud D, Girodon E, Pacot L, Le Gac G, Ka C, Ferec C, Fichou Y, Quesnelle C et al (2022) SPiP: Splicing Prediction Pipeline, a machine learning tool for massive detection of exonic and intronic variant effects on mRNA splicing. Hum Mutat 43(12):2308–2323
doi: 10.1002/humu.24491
pubmed: 36273432
López-Bigas N, Audit B, Ouzounis C, Parra G, Guigó R (2005) Are splicing mutations the most frequent cause of hereditary disease? FEBS Lett 579:1900–1903
doi: 10.1016/j.febslet.2005.02.047
pubmed: 15792793
Lord J, Baralle D (2021) Splicing in the diagnosis of rare disease: advances and challenges. Front Genet 12:689892
doi: 10.3389/fgene.2021.689892
pubmed: 34276790
pmcid: 8280750
Lord J, Gallone G, Short PJ, McRae JF, Ironfield H, Wynn EH, Gerety SS, He L, Kerr B, Johnson DS et al (2019) Pathogenicity and selective constraint on variation near splice sites. Genome Res 29:159–170
doi: 10.1101/gr.238444.118
pubmed: 30587507
pmcid: 6360807
McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GR, Thormann A, Flicek P, Cunningham F (2016) The Ensembl Variant Effect Predictor. Genome Biol 17(1):122
doi: 10.1186/s13059-016-0974-4
pubmed: 27268795
pmcid: 4893825
R Core Team (2018) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
Rentzsch P, Schubach M, Shendure J, Kircher M (2021) CADD-Splice-improving genome-wide variant effect prediction using deep learning-derived splice scores. Genome Med 13(1):31
doi: 10.1186/s13073-021-00835-9
pubmed: 33618777
pmcid: 7901104
Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, Grody WW, Hegde M, Lyon E, Spector E et al (2015) Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med 17:405–424
doi: 10.1038/gim.2015.30
pubmed: 25741868
pmcid: 4544753
Riepe TV, Khan M, Roosing S, Cremers FPM, ’t Hoen PAC (2020) Benchmarking deep learning splice prediction tools using functional splice assays. Authorea 42:799–810. https://doi.org/10.22541/au.160081230.07101269
doi: 10.22541/au.160081230.07101269
Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez JC, Muller M (2011) pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12:77
doi: 10.1186/1471-2105-12-77
pubmed: 21414208
pmcid: 3068975
Stranneheim H, Lagerstedt-Robinson K, Magnusson M, Kvarnung M, Nilsson D, Lesko N, Engvall M, Anderlid BM, Arnell H, Johansson CB et al (2021) Integration of whole genome sequencing into a healthcare setting: high diagnostic rates across multiple clinical entities in 3219 rare disease patients. Genome Med 13:40
doi: 10.1186/s13073-021-00855-5
pubmed: 33726816
pmcid: 7968334
Strauch Y, Lord J, Niranjan M, Baralle D (2022) CI-SpliceAI-Improving machine learning predictions of disease causing splicing variants using curated alternative splice sites. PLoS ONE 17:e0269159
doi: 10.1371/journal.pone.0269159
pubmed: 35657932
pmcid: 9165884
Turro E, Astle WJ, Megy K, Graf S, Greene D, Shamardina O, Allen HL, Sanchis-Juan A, Frontini M, Thys C et al (2020) Whole-genome sequencing of patients with rare diseases in a national health system. Nature 583:96–102
doi: 10.1038/s41586-020-2434-2
pubmed: 32581362
pmcid: 7610553
Wai HA, Lord J, Lyon M, Gunning A, Kelly H, Cibin P, Seaby EG, Spiers-Fitzgerald K, Lye J, Ellard S et al (2020) Blood RNA analysis can increase clinical diagnostic rate and resolve variants of uncertain significance. Genet Med 22:1005–1014
doi: 10.1038/s41436-020-0766-9
pubmed: 32123317
pmcid: 7272326
Wickham H (2009) ggplot2 Elegant graphics for data analysis introduction. Use R. Springer, New York. https://doi.org/10.1007/978-0-387-98141-3_1
doi: 10.1007/978-0-387-98141-3_1
Yeo G, Burge CB (2004) Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J Comput Biol 11(2–3):377–394
doi: 10.1089/1066527041410418
pubmed: 15285897