Scalable approaches for generating, validating and incorporating data from high-throughput functional assays to improve clinical variant classification.

Keywords: MAVE; Machine learning; Variant classification

Journal

Human genetics
ISSN: 1432-1203
Abbreviated title: Hum Genet
Country: Germany
NLM ID: 7613873

Publication information

Publication date:
01 Aug 2024
History:
received: 22 Apr 2024
accepted: 12 Jul 2024
medline: 01 Aug 2024
pubmed: 01 Aug 2024
entrez: 31 Jul 2024
Status: aheadofprint

Abstract

As the adoption and scope of genetic testing continue to expand, interpreting the clinical significance of DNA sequence variants at scale remains a formidable challenge, with a high proportion classified as variants of uncertain significance (VUSs). Genetic testing laboratories have historically relied, in part, on functional data from academic literature to support variant classification. High-throughput functional assays or multiplex assays of variant effect (MAVEs), designed to assess the effects of DNA variants on protein stability and function, represent an important and increasingly available source of evidence for variant classification, but their potential is just beginning to be realized in clinical lab settings. Here, we describe a framework for generating, validating and incorporating data from MAVEs into a semi-quantitative variant classification method applied to clinical genetic testing. Using single-cell gene expression measurements, cellular evidence models were built to assess the effects of DNA variation in 44 genes of clinical interest. This framework was also applied to models for an additional 22 genes with previously published MAVE datasets. In total, modeling data was incorporated from 24 genes into our variant classification method. These data contributed evidence for classifying 4043 observed variants in over 57,000 individuals. Genetic testing laboratories are uniquely positioned to generate, analyze, validate, and incorporate evidence from high-throughput functional data and ultimately enable the use of these data to provide definitive clinical variant classifications for more patients.

Identifiers

pubmed: 39085601
doi: 10.1007/s00439-024-02691-0
pii: 10.1007/s00439-024-02691-0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Copyright information

© 2024. The Author(s).

Authors

Samskruthi Reddy Padigepati (SR)

Invitae Corporation, San Francisco, CA, 94103, USA.

David A Stafford (DA)

Invitae Corporation, San Francisco, CA, 94103, USA.

Christopher A Tan (CA)

Invitae Corporation, San Francisco, CA, 94103, USA.

Melanie R Silvis (MR)

Invitae Corporation, San Francisco, CA, 94103, USA.
Epic Bio, South San Francisco, CA, 94080, USA.

Kirsty Jamieson (K)

Invitae Corporation, San Francisco, CA, 94103, USA.
Epic Bio, South San Francisco, CA, 94080, USA.

Andrew Keyser (A)

Invitae Corporation, San Francisco, CA, 94103, USA.
Calico Life Sciences, South San Francisco, CA, 94080, USA.

Paola Alejandra Correa Nunez (PAC)

Invitae Corporation, San Francisco, CA, 94103, USA.
Gilead Life Sciences Inc, Foster City, CA, 94404, USA.

John M Nicoludis (JM)

Invitae Corporation, San Francisco, CA, 94103, USA.
Department of Structural Biology, Genentech, South San Francisco, CA, 94080, USA.

Toby Manders (T)

Invitae Corporation, San Francisco, CA, 94103, USA.

Laure Fresard (L)

Invitae Corporation, San Francisco, CA, 94103, USA.

Yuya Kobayashi (Y)

Invitae Corporation, San Francisco, CA, 94103, USA.

Carlos L Araya (CL)

Invitae Corporation, San Francisco, CA, 94103, USA.
Tapanti.org, Santa Barbara, CA, 93108, USA.

Swaroop Aradhya (S)

Invitae Corporation, San Francisco, CA, 94103, USA.

Britt Johnson (B)

Invitae Corporation, San Francisco, CA, 94103, USA.
GeneDx, Stamford, CT, 06902, USA.

Keith Nykamp (K)

Invitae Corporation, San Francisco, CA, 94103, USA. keith.nykamp@invitae.com.

Jason A Reuter (JA)

Invitae Corporation, San Francisco, CA, 94103, USA. jason.reuter@invitae.com.

MeSH classifications