PSAP-Genomic-Regions: A Method Leveraging Population Data to Prioritize Coding and Non-Coding Variants in Whole Genome Sequencing for Rare Disease Diagnosis.

Keywords: non-coding variants; rare diseases; variant prioritization; whole-genome sequencing

Journal

Genetic epidemiology
ISSN: 1098-2272
Abbreviated title: Genet Epidemiol
Country: United States
NLM ID: 8411723

Publication information

Publication date:
24 Sep 2024
History:
revised: 30 07 2024
received: 05 06 2024
accepted: 03 09 2024
medline: 25 9 2024
pubmed: 25 9 2024
entrez: 25 9 2024
Status: ahead of print

Abstract

The introduction of Next-Generation Sequencing technologies in the clinic has improved rare disease diagnosis. Nonetheless, for very heterogeneous or very rare diseases, more than half of cases still lack a molecular diagnosis. Novel strategies are needed to prioritize variants within a single individual. The Population Sampling Probability (PSAP) method was developed to meet this aim, but only for coding variants in exome data. Here, we propose an extension of the PSAP method to the non-coding genome, called PSAP-genomic-regions. In this extension, instead of considering genes as testing units (the PSAP-genes strategy), we use genomic regions defined over the whole genome that pinpoint potential functional constraints. We designed an evaluation protocol for our method using artificially generated disease exomes and genomes, created by inserting coding and non-coding pathogenic ClinVar variants into large data sets of exomes and genomes from the general population. PSAP-genomic-regions significantly improves the ranking of these variants compared to using a pathogenicity score alone. Using PSAP-genomic-regions, more than 50% of non-coding ClinVar variants were among the top 10 variants of the genome. On real sequencing data from six patients with Cerebral Small Vessel Disease and nine patients with male infertility, all causal variants were ranked in the top 100 variants with PSAP-genomic-regions. By revisiting the testing units used in the PSAP method to include non-coding variants, we have developed PSAP-genomic-regions, an efficient whole-genome prioritization tool that offers promising results for the diagnosis of unresolved rare diseases.
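
The abstract does not give the underlying formula, so the following is only a simplified illustration of the general idea of ranking variants by a population sampling probability per testing unit rather than by a pathogenicity score alone. It assumes, as a stand-in for the published calibration, an empirical null built from the maximum score observed in each population individual within each unit. All names (`build_null`, `sampling_probability`, `rank_variants`), the region labels, and the scores are hypothetical and are not taken from the PSAP software.

```python
from bisect import bisect_left


def build_null(max_scores_by_unit):
    """Sort, for each testing unit, the maximum pathogenicity score
    observed in each general-population individual (toy empirical null)."""
    return {unit: sorted(scores) for unit, scores in max_scores_by_unit.items()}


def sampling_probability(null, unit, score):
    """Empirical probability that a population individual carries, in this
    unit, a variant scoring at least `score` (add-one smoothing, an assumption)."""
    ref = null[unit]
    n_at_least = len(ref) - bisect_left(ref, score)
    return (n_at_least + 1) / (len(ref) + 1)


def rank_variants(null, variants):
    """Rank (variant_id, unit, score) tuples by ascending probability,
    breaking ties by descending score."""
    scored = [
        (sampling_probability(null, unit, score), -score, vid)
        for vid, unit, score in variants
    ]
    return [(vid, p) for p, _, vid in sorted(scored)]


# Hypothetical population data: one tolerant and one constrained region.
population = {
    "region_tolerant": [20, 25, 30, 22, 28, 35, 27, 31, 24, 26],
    "region_constrained": [2, 3, 1, 4, 2, 5, 3, 2, 1, 4],
}
null = build_null(population)

# Hypothetical patient variants: (id, testing unit, pathogenicity score).
patient = [
    ("varA", "region_tolerant", 30.0),
    ("varB", "region_constrained", 12.0),
]

ranking = rank_variants(null, patient)
by_score = sorted(patient, key=lambda v: -v[2])

print("score alone:", [v[0] for v in by_score])
print("per-unit probability:", ranking)
```

In this toy setting the score-only ranking puts `varA` first, whereas the per-unit probability puts `varB` first, because a score of 12 is never reached in the constrained region among the population individuals while a score of 30 is fairly common in the tolerant one.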

Identifiers

pubmed: 39318036
doi: 10.1002/gepi.22593

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Grants

Agency: This study was jointly supported by the French Priority Research Program on Rare Diseases "Programme Prioritaire de Recherche Maladies Rares du Programme français d'Investissements d'Avenir" and by the Brittany region through the ARED program to E. Génin, and by the National Institutes of Health of the United States of America grants R01HD078641 and P50HD096723 to D. F. Conrad. This study was also supported by the INSERM GOLD Cross-Cutting program.

Copyright information

© 2024 Wiley Periodicals LLC.

Authors

Marie-Sophie C Ogloblinsky (MC)

Univ Brest, Inserm, EFS, UMR 1078, GGB, Brest, France.

Ozvan Bocher (O)

Univ Brest, Inserm, EFS, UMR 1078, GGB, Brest, France.
Institute of Translational Genomics, Helmholtz Zentrum München, Munich, Germany.

Chaker Aloui (C)

Inserm, NeuroDiderot, Unité Mixte de Recherche, Université Paris Cité, Paris, France.

Anne-Louise Leutenegger (AL)

Inserm, NeuroDiderot, Unité Mixte de Recherche, Université Paris Cité, Paris, France.

Ozan Ozisik (O)

INSERM, Marseille Medical Genetics (MMG), Aix Marseille University, Marseille, France.

Anaïs Baudot (A)

INSERM, Marseille Medical Genetics (MMG), Aix Marseille University, Marseille, France.

Elisabeth Tournier-Lasserve (E)

Inserm, NeuroDiderot, Unité Mixte de Recherche, Université Paris Cité, Paris, France.
Assistance Publique-Hôpitaux de Paris, Service de Génétique Moléculaire Neurovasculaire, Hôpital Saint-Louis, Paris, France.

Helen Castillo-Madeen (H)

Division of Genetics, Oregon National Primate Research Center, Oregon Health and Science University, Portland, Oregon, USA.

Daniel Lewinsohn (D)

Division of Genetics, Oregon National Primate Research Center, Oregon Health and Science University, Portland, Oregon, USA.

Donald F Conrad (DF)

Division of Genetics, Oregon National Primate Research Center, Oregon Health and Science University, Portland, Oregon, USA.

Emmanuelle Génin (E)

Univ Brest, Inserm, EFS, UMR 1078, GGB, Brest, France.
Centre Hospitalier Régional Universitaire de Brest, Brest, France.

Gaëlle Marenne (G)

Univ Brest, Inserm, EFS, UMR 1078, GGB, Brest, France.

MeSH classifications