Benchmark and Best Practices for Biomedical Knowledge Graph Embeddings.
Journal
Proceedings of the conference. Association for Computational Linguistics. Meeting
ISSN: 0736-587X
Abbreviated title: Proc Conf Assoc Comput Linguist Meet
Country: United States
NLM ID: 101639983
Publication information
Publication date:
Jul 2020
History:
entrez: 22 Mar 2021
pubmed: 23 Mar 2021
medline: 23 Mar 2021
Status:
ppublish
Abstract
Much of biomedical and healthcare data is encoded in discrete, symbolic form such as text and medical codes. There is a wealth of expert-curated biomedical domain knowledge stored in knowledge bases and ontologies, but the lack of reliable methods for learning knowledge representation has limited their usefulness in machine learning applications. While text-based representation learning has significantly improved in recent years through advances in natural language processing, attempts to learn biomedical concept embeddings so far have been lacking. A recent family of models called knowledge graph embeddings has shown promising results on general domain knowledge graphs, and we explore their capabilities in the biomedical domain. We train several state-of-the-art knowledge graph embedding models on the SNOMED-CT knowledge graph, provide a benchmark with comparison to existing methods and an in-depth discussion of best practices, and make a case for the importance of leveraging the multi-relational nature of knowledge graphs for learning biomedical knowledge representation. The embeddings, code, and materials will be made available to the community.
Identifiers
pubmed: 33746351
doi: 10.18653/v1/2020.bionlp-1.18
pmc: PMC7971091
mid: NIHMS1676481
Publication types
Journal Article
Languages
eng
Pagination
167-176
Grants
Agency: NLM NIH HHS
ID: T15 LM007056
Country: United States
Agency: NCATS NIH HHS
ID: UL1 TR001863
Country: United States
References
J Am Med Inform Assoc. 2011 Jul-Aug;18(4):441-8
pubmed: 21515544
Nucleic Acids Res. 2004 Jan 1;32(Database issue):D267-70
pubmed: 14681409
KDD. 2016 Aug;2016:855-864
pubmed: 27853626
Pac Symp Biocomput. 2020;25:295-306
pubmed: 31797605
Nucleic Acids Res. 2019 Jan 8;47(D1):D330-D338
pubmed: 30395331