Gene family information facilitates variant interpretation and identification of disease-associated genes in neurodevelopmental disorders.


Journal

Genome Medicine
ISSN: 1756-994X
Abbreviated title: Genome Med
Country: England
NLM ID: 101475844

Publication information

Publication date:
17 March 2020
History:
received: 15 June 2019
accepted: 21 February 2020
entrez: 19 March 2020
pubmed: 19 March 2020
medline: 5 January 2021
Status: epublish

Abstract

BACKGROUND
Classifying the pathogenicity of missense variants is a major challenge in the clinical diagnosis of rare and genetically heterogeneous neurodevelopmental disorders (NDDs). While orthologous gene conservation is commonly employed in variant annotation, approximately 80% of known disease-associated genes belong to gene families. The use of gene family information for disease gene discovery and variant interpretation has not yet been investigated on a genome-wide scale. We empirically evaluate whether paralog-conserved or non-conserved sites in human gene families are important in NDDs.
METHODS
Gene family information was collected from Ensembl. Paralog-conserved sites were defined based on paralog sequence alignments; 10,068 NDD patients and 2,078 controls were statistically evaluated for de novo variant burden in gene families.
RESULTS
We demonstrate that disease-associated missense variants are enriched at paralog-conserved sites across all disease groups and inheritance models tested. We developed a gene family de novo enrichment framework that identified 43 exome-wide enriched gene families comprising 98 genes carrying de novo variants in NDD patients, of which 28 are novel NDD candidate genes that are brain-expressed and under evolutionary constraint.
CONCLUSION
This study presents the first method to incorporate gene family information into a statistical framework to interpret variant data for NDDs and to discover new NDD-associated genes.
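
The abstract does not give the exact definition of a paralog-conserved site or the statistical test used, so the following Python sketch is only an illustration under stated assumptions, not the authors' method. It assumes a toy alignment of hypothetical paralog sequences, scores each alignment column by the fraction of paralogs sharing the most common residue, treats a column as conserved at an assumed threshold of 1.0, and compares case and control variants with a simple odds ratio. All function names, the threshold, and the data are hypothetical.

```python
from collections import Counter


def paralog_conservation(alignment):
    """Per-column fraction of paralogs sharing the most common residue.

    `alignment` is a list of equal-length aligned protein sequences
    ('-' marks a gap). Gaps count against conservation.
    """
    n_seqs = len(alignment)
    scores = []
    for column in zip(*alignment):
        residues = [aa for aa in column if aa != "-"]
        top = Counter(residues).most_common(1)[0][1] if residues else 0
        scores.append(top / n_seqs)
    return scores


def conserved_sites(alignment, threshold=1.0):
    """Column indices whose conservation meets the assumed threshold."""
    scores = paralog_conservation(alignment)
    return {i for i, score in enumerate(scores) if score >= threshold}


def odds_ratio(case_positions, control_positions, conserved):
    """Odds ratio of variants at conserved sites, cases versus controls.

    Positions are alignment column indices (an assumption of this sketch).
    Adds 0.5 to every cell (Haldane-Anscombe correction) so that empty
    cells do not cause division by zero.
    """
    a = sum(p in conserved for p in case_positions) + 0.5
    b = sum(p not in conserved for p in case_positions) + 0.5
    c = sum(p in conserved for p in control_positions) + 0.5
    d = sum(p not in conserved for p in control_positions) + 0.5
    return (a * d) / (b * c)


# Toy alignment of three hypothetical paralogs (not real sequences).
toy_alignment = [
    "MKTLLV",
    "MKSLLV",
    "MKTL-V",
]
sites = conserved_sites(toy_alignment)

# Hypothetical variant positions in cases and controls.
toy_or = odds_ratio([0, 1, 3, 2], [2, 4, 0, 4], sites)
print(sorted(sites), round(toy_or, 2))
```

An odds ratio above 1 in such a comparison would indicate that case variants fall at conserved columns more often than control variants do; a real analysis would also need a significance test and a mapping from protein positions to alignment columns.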

Identifiers

pubmed: 32183904
doi: 10.1186/s13073-020-00725-6
pii: 10.1186/s13073-020-00725-6
pmc: PMC7079346

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

28

Grants

Agency: Medical Research Council
ID: MC_UP_1102/20
Country: United Kingdom
Agency: Medical Research Council
ID: MR/N026063/1
Country: United Kingdom
Agency: NHGRI NIH HHS
ID: T32 HG002295
Country: United States
Agency: NICHD NIH HHS
ID: U54 HD090255
Country: United States

Investigators

Rudi Balling (R)
Nina Barisic (N)
Stéphanie Baulac (S)
Hande Caglayan (H)
Dana C Craiu (DC)
Peter De Jonghe (P)
Christel Depienne (C)
Renzo Guerrini (R)
Ingo Helbig (I)
Helle Hjalgrim (H)
Dorota Hoffman-Zacharska (D)
Johanna Jähn (J)
Karl M Klein (KM)
Bobby P C Koeleman (BPC)
Vladimir Komarek (V)
Roland Krause (R)
Eric Leguern (E)
Anna-Elina Lehesjoki (AE)
Johannes R Lemke (JR)
Holger Lerche (H)
Taria Linnankivi (T)
Carla Marini (C)
Patrick May (P)
Hiltrud Muhle (H)
Deb K Pal (DK)
Aarno Palotie (A)
Felix Rosenow (F)
Susanne Schubert-Bast (S)
Kaia Selmer (K)
Jose M Serratosa (JM)
Ulrich Stephani (U)
Katalin Štěrbová (K)
Pasquale Striano (P)
Arvid Suls (A)
Tina Talvik (T)
Sarah von Spiczak (S)
Yvonne G Weber (YG)
Sarah Weckhuysen (S)
Federico Zara (F)

Authors

Dennis Lal (D)

Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH, USA. lald@ccf.org.
Stanley Center for Psychiatric Research, The Broad Institute of Harvard and M.I.T, Cambridge, MA, USA. lald@ccf.org.
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, USA. lald@ccf.org.
Cologne Center for Genomics, University of Cologne, Cologne, Germany. lald@ccf.org.
Genomic Medicine Institute, Lerner Research Institute Cleveland Clinic, 9500 Euclid Avenue, Cleveland, OH, 44195, USA. lald@ccf.org.

Patrick May (P)

Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6, Avenue du Swing, 4367, Belvaux, Luxembourg. patrick.may@uni.lu.

Eduardo Perez-Palma (E)

Cologne Center for Genomics, University of Cologne, Cologne, Germany.
Genomic Medicine Institute, Lerner Research Institute Cleveland Clinic, 9500 Euclid Avenue, Cleveland, OH, 44195, USA.

Kaitlin E Samocha (KE)

Stanley Center for Psychiatric Research, The Broad Institute of Harvard and M.I.T, Cambridge, MA, USA.
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, USA.
Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

Jack A Kosmicki (JA)

Stanley Center for Psychiatric Research, The Broad Institute of Harvard and M.I.T, Cambridge, MA, USA.
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, USA.

Elise B Robinson (EB)

Stanley Center for Psychiatric Research, The Broad Institute of Harvard and M.I.T, Cambridge, MA, USA.
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, USA.
Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

Rikke S Møller (RS)

The Danish Epilepsy Centre, Dianalund, Denmark.
Institute for Regional Health Research, University of Southern Denmark, Odense, Denmark.

Roland Krause (R)

Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6, Avenue du Swing, 4367, Belvaux, Luxembourg.

Peter Nürnberg (P)

Cologne Center for Genomics, University of Cologne, Cologne, Germany.
Center for Molecular Medicine Cologne, University of Cologne, Cologne, Germany.
Cologne Excellence Cluster on Cellular Stress Responses in Aging-Associated Diseases, University of Cologne, Cologne, Germany.

Sarah Weckhuysen (S)

Division of Neurology, Antwerp University Hospital, Antwerp, Belgium.
Neurogenetics Group, Center for Molecular Neurology, VIB, Antwerp, Belgium.
Laboratory of Neurogenetics, Institute Born-Bunge, University of Antwerp, Antwerp, Belgium.

Peter De Jonghe (P)

Division of Neurology, Antwerp University Hospital, Antwerp, Belgium.

Renzo Guerrini (R)

Pediatric Neurology and Neuroscience Department, Children's Hospital Anna Meyer, University of Florence, Florence, Italy.

Lisa M Niestroj (LM)

Cologne Center for Genomics, University of Cologne, Cologne, Germany.

Juliana Du (J)

Cologne Center for Genomics, University of Cologne, Cologne, Germany.

Carla Marini (C)

Pediatric Neurology and Neuroscience Department, Children's Hospital Anna Meyer, University of Florence, Florence, Italy.

James S Ware (JS)

National Heart & Lung Institute and MRC London Institute of Medical Science, Imperial College London, London, UK.

Mitja Kurki (M)

Stanley Center for Psychiatric Research, The Broad Institute of Harvard and M.I.T, Cambridge, MA, USA.
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, USA.

Padhraig Gormley (P)

Stanley Center for Psychiatric Research, The Broad Institute of Harvard and M.I.T, Cambridge, MA, USA.
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, USA.

Sha Tang (S)

Division of Clinical Genomics, Ambry Genetics, Aliso Viejo, CA, USA.

Sitao Wu (S)

Division of Clinical Genomics, Ambry Genetics, Aliso Viejo, CA, USA.

Saskia Biskup (S)

CeGat and Practice for Human Genetics, Tübingen, Germany.

Annapurna Poduri (A)

Epilepsy Genetics Program, Boston Children's Hospital, Boston, MA, USA.

Bernd A Neubauer (BA)

Department of Neuropediatrics UKGM, University of Giessen, Giessen, Germany.

Bobby P C Koeleman (BPC)

Department of Genetics, University Medical Center Utrecht, Utrecht, The Netherlands.

Katherine L Helbig (KL)

Division of Clinical Genomics, Ambry Genetics, Aliso Viejo, CA, USA.
Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA.

Yvonne G Weber (YG)

Department of Neurology and Epileptology, Hertie Institute for Clinical Brain Research, University of Tübingen, Tübingen, Germany.
Department of Epileptology and Neurology, University of Aachen, Aachen, Germany.

Ingo Helbig (I)

Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA.
The Epilepsy NeuroGenetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA.
Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA, USA.
Department of Neurology, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, 19104, USA.

Amit R Majithia (AR)

Division of Endocrinology, Department of Medicine, University of California, San Diego, CA, USA.

Aarno Palotie (A)

Stanley Center for Psychiatric Research, The Broad Institute of Harvard and M.I.T, Cambridge, MA, USA.
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, USA.
Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland.

Mark J Daly (MJ)

Stanley Center for Psychiatric Research, The Broad Institute of Harvard and M.I.T, Cambridge, MA, USA. mjdaly@atgu.mgh.harvard.edu.
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, USA. mjdaly@atgu.mgh.harvard.edu.
Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland. mjdaly@atgu.mgh.harvard.edu.

MeSH classifications