MTR3D-AF2: Expanding the coverage of spatially derived missense tolerance scores across the human proteome using AlphaFold2.

Keywords: missense mutation; pathogenicity prediction; population variation; protein structure prediction; selective pressure

Journal

Protein science : a publication of the Protein Society
ISSN: 1469-896X
Abbreviated title: Protein Sci
Country: United States
NLM ID: 9211750

Publication information

Publication date:
Aug 2024
History:
received: 17 12 2023
revised: 24 06 2024
accepted: 26 06 2024
medline: 20 7 2024
pubmed: 20 7 2024
entrez: 20 7 2024
Status: ppublish

Abstract

The missense tolerance ratio (MTR) was developed as a novel approach to assess the deleteriousness of variants. Its three-dimensional successor, MTR3D, was shown to be powerful at discriminating pathogenic from benign variants. However, its reliance on experimental structures and homologs limited its coverage of the proteome. We have now used AlphaFold2 models to develop MTR3D-AF2, which covers 89.31% of proteins and 85.39% of residues across the human proteome. This work has improved MTR3D's ability to distinguish clinically established pathogenic from benign variants. MTR3D-AF2 is freely available as an interactive web server at https://biosig.lab.uq.edu.au/mtr3daf2/.

Identifiers

pubmed: 39031445
doi: 10.1002/pro.5112

Chemical substances

Proteome 0
Proteins 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

e5112

Grants

Agency: National Health and Medical Research Council
ID: GNT1174405
Agency: Victorian Government's Operational Infrastructure Support Program

Copyright information

© 2024 The Author(s). Protein Science published by Wiley Periodicals LLC on behalf of The Protein Society.

Authors

Aaron S Kovacs (AS)

The Australian Center for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Queensland, Australia.

Stephanie Portelli (S)

The Australian Center for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Queensland, Australia.
Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Australia.

Michael Silk (M)

Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Australia.
Systems and Computational Biology, Bio21 Institute, The University of Melbourne, Melbourne, Australia.

Carlos H M Rodrigues (CHM)

The Australian Center for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Queensland, Australia.
Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Australia.

David B Ascher (DB)

The Australian Center for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Queensland, Australia.
Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Australia.
Systems and Computational Biology, Bio21 Institute, The University of Melbourne, Melbourne, Australia.
