Scalable approaches for generating, validating and incorporating data from high-throughput functional assays to improve clinical variant classification.
MAVE
Machine learning
Variant classification
Journal
Human genetics
ISSN: 1432-1203
Abbreviated title: Hum Genet
Country: Germany
NLM ID: 7613873
Publication information
Publication date:
01 Aug 2024
History:
received: 22 Apr 2024
accepted: 12 Jul 2024
medline: 01 Aug 2024
pubmed: 01 Aug 2024
entrez: 31 Jul 2024
Status:
aheadofprint
Abstract
As the adoption and scope of genetic testing continue to expand, interpreting the clinical significance of DNA sequence variants at scale remains a formidable challenge, with a high proportion classified as variants of uncertain significance (VUSs). Genetic testing laboratories have historically relied, in part, on functional data from academic literature to support variant classification. High-throughput functional assays or multiplex assays of variant effect (MAVEs), designed to assess the effects of DNA variants on protein stability and function, represent an important and increasingly available source of evidence for variant classification, but their potential is just beginning to be realized in clinical lab settings. Here, we describe a framework for generating, validating and incorporating data from MAVEs into a semi-quantitative variant classification method applied to clinical genetic testing. Using single-cell gene expression measurements, cellular evidence models were built to assess the effects of DNA variation in 44 genes of clinical interest. This framework was also applied to models for an additional 22 genes with previously published MAVE datasets. In total, modeling data was incorporated from 24 genes into our variant classification method. These data contributed evidence for classifying 4043 observed variants in over 57,000 individuals. Genetic testing laboratories are uniquely positioned to generate, analyze, validate, and incorporate evidence from high-throughput functional data and ultimately enable the use of these data to provide definitive clinical variant classifications for more patients.
Identifiers
pubmed: 39085601
doi: 10.1007/s00439-024-02691-0
pii: 10.1007/s00439-024-02691-0
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Copyright information
© 2024. The Author(s).