Using an integrative machine learning approach utilising homology modelling to clinically interpret genetic variants: CACNA1F as an exemplar.


Journal

European journal of human genetics : EJHG
ISSN: 1476-5438
Abbreviated title: Eur J Hum Genet
Country: England
NLM ID: 9302235

Publication information

Publication date:
09 2020
History:
received: 13 05 2019
accepted: 10 03 2020
revised: 13 01 2020
pubmed: 22 4 2020
medline: 3 6 2021
entrez: 22 4 2020
Status: ppublish

Abstract

Advances in DNA sequencing technologies have revolutionised rare disease diagnostics and have led to a dramatic increase in the volume of available genomic data. A key challenge that needs to be overcome to realise the full potential of these technologies is that of precisely predicting the effect of genetic variants on molecular and organismal phenotypes. Notably, despite recent progress, there is still a lack of robust in silico tools that accurately assign clinical significance to variants. Genetic alterations in the CACNA1F gene are the commonest cause of X-linked incomplete congenital stationary night blindness (iCSNB), a condition associated with non-progressive visual impairment. We combined genetic and homology modelling data to produce CACNA1F-vp, an in silico model that differentiates disease-implicated from benign missense CACNA1F changes. CACNA1F-vp predicts variant effects on the structure of the CACNA1F-encoded protein (a calcium channel) using parameters based upon changes in amino acid properties; these include size, charge, hydrophobicity, and position. The model produces an overall score for each variant that can be used to predict its pathogenicity. In tenfold cross-validation, CACNA1F-vp outperformed four other tools in identifying disease-implicated variants (areas under the receiver operating characteristic and precision-recall curves = 0.84; Matthews correlation coefficient = 0.52). We consider this protein-specific model to be a robust stand-alone diagnostic classifier that could be replicated in other proteins and could enable precise and timely diagnosis.
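
As a rough illustration of two ideas mentioned in the abstract (features derived from changes in amino acid properties, and evaluation by the Matthews correlation coefficient), the following is a minimal sketch and not the authors' implementation. The Kyte-Doolittle hydropathy scale, the simplified charge table, and the function names `property_deltas` and `matthews_corrcoef` are assumptions introduced here for illustration; the abstract does not specify which scales or software CACNA1F-vp uses.

```python
import math

# Kyte-Doolittle hydropathy values. This scale is an assumption made for the
# sketch; it is not necessarily the hydrophobicity scale used by CACNA1F-vp.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Formal side-chain charge near neutral pH. Histidine is treated as neutral,
# which is a simplifying assumption.
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def property_deltas(ref, alt):
    """Return (charge change, hydropathy change) for a ref -> alt substitution."""
    d_charge = CHARGE.get(alt, 0) - CHARGE.get(ref, 0)
    d_hydro = HYDROPATHY[alt] - HYDROPATHY[ref]
    return d_charge, d_hydro


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (the coefficient is
    undefined in that case; 0.0 is a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

For example, `property_deltas("R", "W")` describes an arginine-to-tryptophan change as losing one positive charge while becoming more hydrophobic on this scale. In a model of the kind described, such deltas would be a few of several inputs (alongside size and position in the structure) to a classifier that outputs a per-variant score; the counts passed to `matthews_corrcoef` would come from comparing thresholded scores with known variant labels in each cross-validation fold.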

Identifiers

pubmed: 32313206
doi: 10.1038/s41431-020-0623-y
pii: 10.1038/s41431-020-0623-y
pmc: PMC7608274

Chemical substances

CACNA1F protein, human 0
Calcium Channels, L-Type 0

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

1274-1282

Grants

Agency: Medical Research Council
ID: MR/R024952/1
Country: United Kingdom
Agency: RCUK | Medical Research Council (MRC)
ID: 1790437
Country: International

Authors

Shalaw R Sallah (SR)

Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicines and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK. Graeme.black@manchester.ac.uk.
Manchester Centre for Genomic Medicine, Central Manchester University Hospitals NHS Foundation Trust, Manchester Academic Health Sciences Centre, St Mary's Hospital, Manchester, UK. Graeme.black@manchester.ac.uk.

Panagiotis I Sergouniotis (PI)

Manchester Centre for Genomic Medicine, Central Manchester University Hospitals NHS Foundation Trust, Manchester Academic Health Sciences Centre, St Mary's Hospital, Manchester, UK.

Stephanie Barton (S)

Manchester Centre for Genomic Medicine, Central Manchester University Hospitals NHS Foundation Trust, Manchester Academic Health Sciences Centre, St Mary's Hospital, Manchester, UK.

Simon Ramsden (S)

Manchester Centre for Genomic Medicine, Central Manchester University Hospitals NHS Foundation Trust, Manchester Academic Health Sciences Centre, St Mary's Hospital, Manchester, UK.

Rachel L Taylor (RL)

Manchester Centre for Genomic Medicine, Central Manchester University Hospitals NHS Foundation Trust, Manchester Academic Health Sciences Centre, St Mary's Hospital, Manchester, UK.

Amro Safadi (A)

Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicines and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK.

Mitra Kabir (M)

Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicines and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK.

Jamie M Ellingford (JM)

Manchester Centre for Genomic Medicine, Central Manchester University Hospitals NHS Foundation Trust, Manchester Academic Health Sciences Centre, St Mary's Hospital, Manchester, UK.

Nick Lench (N)

Congenica Ltd, Biodata Innovation Centre, Wellcome Genome Campus, Hinxton, Cambridge, UK.

Simon C Lovell (SC)

Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicines and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK.

Graeme C M Black (GCM)

Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicines and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK.
Manchester Centre for Genomic Medicine, Central Manchester University Hospitals NHS Foundation Trust, Manchester Academic Health Sciences Centre, St Mary's Hospital, Manchester, UK.
