Integration of background knowledge for automatic detection of inconsistencies in gene ontology annotation.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
28 Jun 2024
History:
medline: 28 Jun 2024
pubmed: 28 Jun 2024
entrez: 28 Jun 2024
Status: ppublish

Abstract

Biological background knowledge plays an important role in the manual quality assurance (QA) of biological database records. One such QA task is the detection of inconsistencies in literature-based Gene Ontology Annotation (GOA). This manual verification ensures the accuracy of the GO annotations based on a comprehensive review of the literature used as evidence, Gene Ontology (GO) terms, and annotated genes in GOA records. While automatic approaches for the detection of semantic inconsistencies in GOA have been developed, they operate within predetermined contexts and cannot leverage broader evidence, especially relevant domain-specific background knowledge. This paper investigates various types of background knowledge that could improve the detection of prevalent inconsistencies in GOA. In addition, the paper proposes several approaches to integrate background knowledge into the automatic GOA inconsistency detection process.

We have extended a previously developed GOA inconsistency dataset with several kinds of GOA-related background knowledge, including GeneRIF statements, biological concepts mentioned within evidence texts, the GO hierarchy, and existing GO annotations of the specific gene. We have proposed several effective approaches to integrate background knowledge as part of the automatic GOA inconsistency detection process. The proposed approaches can improve automatic detection of self-consistency and several of the most prevalent types of inconsistencies. This is the first study to explore the advantages of utilizing background knowledge and to propose a practical approach to incorporate knowledge in automatic GOA inconsistency detection. We establish a new benchmark for performance on this task. Our methods may be applicable to various tasks that involve incorporating biological background knowledge.

Availability and implementation: https://github.com/jiyuc/de-inconsistency.
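
The abstract does not say how background knowledge is fed to the detector. As a purely illustrative sketch, and not the authors' implementation, one simple option is to append each available knowledge source to the text input of a classifier. Every name below (`GoaRecord`, `build_input`, the `[SEP]` separator, the field labels) is an assumption made for illustration, and the example values are placeholders rather than real annotations.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical separator; a real model would use its own tokenizer's marker.
SEP = " [SEP] "


@dataclass
class GoaRecord:
    """One GOA record plus optional background knowledge (illustrative schema)."""
    gene: str
    go_term: str
    evidence_text: str
    generif: List[str] = field(default_factory=list)               # GeneRIF statements
    concepts: List[str] = field(default_factory=list)              # concepts found in the evidence text
    go_ancestors: List[str] = field(default_factory=list)          # terms from the GO hierarchy
    existing_annotations: List[str] = field(default_factory=list)  # other GO annotations of the gene


def build_input(record: GoaRecord, use_background: bool = True) -> str:
    """Flatten a record into one string; background sources are appended only if present."""
    parts = [
        record.evidence_text,
        f"gene: {record.gene}",
        f"GO term: {record.go_term}",
    ]
    if use_background:
        sources = [
            ("GeneRIF", record.generif),
            ("concepts", record.concepts),
            ("GO ancestors", record.go_ancestors),
            ("existing annotations", record.existing_annotations),
        ]
        for label, values in sources:
            if values:  # skip knowledge sources that are empty for this record
                parts.append(f"{label}: " + "; ".join(values))
    return SEP.join(parts)


# Placeholder values only; not taken from any real GOA record.
example = GoaRecord(
    gene="GENE_X",
    go_term="placeholder GO term",
    evidence_text="Placeholder evidence sentence.",
    generif=["placeholder GeneRIF statement"],
)
baseline_input = build_input(example, use_background=False)
enriched_input = build_input(example)
```

Comparing a classifier trained on `baseline_input` with one trained on `enriched_input` would be one way to measure what a given knowledge source contributes, under the assumptions stated above.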

Identifiers

pubmed: 38940182
pii: 7700907
doi: 10.1093/bioinformatics/btae246

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

i390-i400

Grants

Agency: Australian Research Council Discovery
ID: DP190101350

Copyright information

© The Author(s) 2024. Published by Oxford University Press.

Authors

Jiyu Chen (J)

School of Computing and Information Systems, The University of Melbourne, Parkville 3010, VIC, Australia.
Data61, The Commonwealth Scientific and Industrial Research Organisation, Marsfield 2122, NSW, Australia.

Benjamin Goudey (B)

School of Computing and Information Systems, The University of Melbourne, Parkville 3010, VIC, Australia.

Nicholas Geard (N)

School of Computing and Information Systems, The University of Melbourne, Parkville 3010, VIC, Australia.

Karin Verspoor (K)

School of Computing Technologies, RMIT University, Melbourne, Victoria 3000, Australia.

MeSH classifications