BioGraph: Data Model for Linking and Querying Diverse Biological Metadata.
BioGraph
associations with the diseases
connecting biological data
gene network
metadata
query data properties
Journal
International journal of molecular sciences
ISSN: 1422-0067
Abbreviated title: Int J Mol Sci
Country: Switzerland
NLM ID: 101092791
Publication information
Publication date:
09 Apr 2023
History:
received: 28 Feb 2023
revised: 30 Mar 2023
accepted: 06 Apr 2023
medline: 1 May 2023
pubmed: 28 Apr 2023
entrez: 28 Apr 2023
Status:
epublish
Abstract
Studying associations between gene function and diseases, and reconstructing regulatory gene networks, requires compatible data. Data from different databases follow distinct schemas and are accessed in heterogeneous ways. Although the underlying experiments differ, the data may still describe the same biological entities. Some entities are not strictly biological, such as geolocations of habitats or paper references, but they provide broader context for other entities. The same entity can appear in several datasets with similar properties, and those properties may or may not be present in other datasets. Fetching data from multiple sources jointly and simultaneously is complicated for the end user and, in many cases, unsupported or inefficient because of differences in data structures and access methods. We propose BioGraph, a new model that enables connecting and retrieving information from linked biological data originating from diverse datasets. We tested the model on metadata collected from five diverse public datasets and successfully constructed a knowledge graph containing more than 17 million model objects, of which 2.5 million are individual biological entity objects. The model enables the selection of complex patterns and the retrieval of matching results that can be discovered only by joining data from multiple sources.
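The idea of linking records from several sources and then selecting a pattern across them can be illustrated with a minimal Python sketch. It is not the BioGraph model or its query interface: the `TinyGraph` class, the `Entity` fields, the source names, the identifiers, and the property values are all hypothetical and chosen only for illustration. Records that share a key are merged into one node, and a pattern query then relies on properties contributed by different sources.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One node: an entity merged from records in one or more sources."""
    key: str    # hypothetical shared identifier
    kind: str   # e.g. "gene", "disease", "paper"
    props: dict = field(default_factory=dict)
    sources: set = field(default_factory=set)


class TinyGraph:
    """Toy in-memory graph; an illustration only, not the BioGraph model."""

    def __init__(self):
        self.nodes = {}
        self.edges = defaultdict(set)  # (source key, relation) -> target keys

    def add_entity(self, source, key, kind, **props):
        # Records with the same key are merged into a single node.
        node = self.nodes.setdefault(key, Entity(key, kind))
        node.props.update(props)
        node.sources.add(source)
        return node

    def link(self, src, relation, dst):
        self.edges[(src, relation)].add(dst)

    def match(self, kind, relation, target_kind, **where):
        """Yield (entity, target) pairs for kind -[relation]-> target_kind,
        keeping only entities whose properties equal every value in `where`."""
        for key, node in self.nodes.items():
            if node.kind != kind:
                continue
            if any(node.props.get(k) != v for k, v in where.items()):
                continue
            for dst in sorted(self.edges.get((key, relation), ())):
                target = self.nodes[dst]
                if target.kind == target_kind:
                    yield node, target


g = TinyGraph()
# Two made-up sources describe the same gene with different properties.
g.add_entity("source_a", "GENE1", "gene", organism="Homo sapiens")
g.add_entity("source_b", "GENE1", "gene", chromosome="17")
g.add_entity("source_b", "D001", "disease", name="example disease")
g.link("GENE1", "associated_with", "D001")

# The filter needs one property from each source, so the match
# exists only because the two records were joined.
hits = [
    (gene.key, disease.props["name"])
    for gene, disease in g.match(
        "gene", "associated_with", "disease",
        organism="Homo sapiens", chromosome="17",
    )
]
print(hits)
```

Neither made-up source alone holds both `organism` and `chromosome` for the gene, so the query returns a result only after the merge. A real knowledge graph of millions of objects would need a graph database or indexed storage rather than Python dictionaries.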
Identifiers
pubmed: 37108117
pii: ijms24086954
doi: 10.3390/ijms24086954
pmc: PMC10138499
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Organization: Russian Science Foundation
ID: 23-44-00030