Contribution of model organism phenotypes to the computational identification of human disease genes.
Disease gene discovery
Machine learning
Model organism
Ontology
Phenotype
Semantic similarity
Journal
Disease models & mechanisms
ISSN: 1754-8411
Abbreviated title: Dis Model Mech
Country: England
NLM ID: 101483332
Publication information
Publication date: 2022-07-01
History:
received: 2021-12-24
accepted: 2022-06-13
pubmed: 2022-06-28
medline: 2022-08-05
entrez: 2022-06-27
Status: ppublish
Abstract
Computing phenotypic similarity helps identify new disease genes and diagnose rare diseases. Genotype-phenotype data from orthologous genes in model organisms can compensate for the lack of human data and increase genome coverage. In the past decade, cross-species phenotype comparisons have proven valuable, and several ontologies have been developed for this purpose. The relative contribution of different model organisms to the computational identification of disease-associated genes has not been fully explored. We used phenotype ontologies to semantically relate phenotypes resulting from loss-of-function mutations in model organisms to disease-associated phenotypes in humans. Semantic machine learning methods were used to measure the contribution of different model organisms to the identification of known human gene-disease associations. We found that mouse genotype-phenotype data provided the most important dataset in the identification of human disease genes by semantic similarity and machine learning over phenotype ontologies. Data from other model organisms did not improve identification over that obtained using the mouse alone, and therefore did not contribute significantly to this task. Our work has implications for the development of integrated phenotype ontologies, as well as for the use of model organism phenotypes in human genetic variant interpretation. This article has an associated First Person interview with the first author of the paper.
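To make the idea of ranking genes by phenotypic similarity concrete, the following is a minimal sketch and not the authors' pipeline. It assumes Resnik similarity with a best-match average, which is one common choice for comparing sets of ontology terms; the abstract does not state which measure was used. The ontology, term names, gene names, and annotations are all hypothetical toy data, and a real cross-species comparison would need an integrated phenotype ontology rather than this single shared hierarchy.

```python
import math
from itertools import chain

# Toy is-a hierarchy (child -> parents). All term names are hypothetical.
PARENTS = {
    "phenotype": [],
    "cardiac": ["phenotype"],
    "skeletal": ["phenotype"],
    "septal_defect": ["cardiac"],
    "arrhythmia": ["cardiac"],
    "short_limb": ["skeletal"],
}

# Hypothetical model organism gene -> phenotype annotations.
GENE_PHENOTYPES = {
    "geneA": ["septal_defect"],
    "geneB": ["short_limb"],
    "geneC": ["arrhythmia", "short_limb"],
}

# Hypothetical human disease phenotype profile.
DISEASE = ["septal_defect", "arrhythmia"]


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(annotations):
    """IC(t) = log(N / n_t), where n_t counts entities annotated to t or a descendant."""
    closures = [
        set(chain.from_iterable(ancestors(t) for t in terms))
        for terms in annotations.values()
    ]
    total = len(closures)
    ic = {}
    for term in PARENTS:
        count = sum(term in closure for closure in closures)
        if count:
            ic[term] = math.log(total / count)
    return ic


def resnik(t1, t2, ic):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max((ic.get(t, 0.0) for t in common), default=0.0)


def best_match_average(terms_a, terms_b, ic):
    """Symmetric best-match average between two sets of terms."""
    def one_way(xs, ys):
        return sum(max(resnik(x, y, ic) for y in ys) for x in xs) / len(xs)
    return (one_way(terms_a, terms_b) + one_way(terms_b, terms_a)) / 2


def rank_genes(disease_terms, gene_annotations, ic):
    """Order genes from most to least phenotypically similar to the disease."""
    scores = {
        gene: best_match_average(disease_terms, terms, ic)
        for gene, terms in gene_annotations.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


IC = information_content(GENE_PHENOTYPES)
RANKING = rank_genes(DISEASE, GENE_PHENOTYPES, IC)
print(RANKING)  # ['geneA', 'geneC', 'geneB']
```

In this toy setup, the gene annotated only with a cardiac phenotype ranks first for the cardiac disease profile, and the gene with only a skeletal phenotype ranks last. Comparing such rankings with and without a given organism's annotations is one simple way to estimate how much that organism's data contributes.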
Identifiers
pubmed: 35758016
pii: 275986
doi: 10.1242/dmm.049441
pmc: PMC9366895
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Copyright information
© 2022. Published by The Company of Biologists Ltd.
Conflict of interest statement
Competing interests: The authors declare no competing or financial interests.