Analyzing a co-occurrence gene-interaction network to identify disease-gene association.

MeSH terms: Epistasis, Genetic; Gene Regulatory Networks; Genetic Predisposition to Disease; Humans; Logistic Models; Male; Prostatic Neoplasms / genetics; ROC Curve

Keywords: Biological NLP; Biomedical literature; Disease-gene association; Genetic network; Text mining

Journal

BMC bioinformatics

ISSN: 1471-2105

Abbreviated title: BMC Bioinformatics

Country: England

NLM ID: 100965194

Publication information

Publication date:
08 Feb 2019

History:

received: 23 Apr 2018

accepted: 17 Jan 2019

entrez: 10 Feb 2019

pubmed: 10 Feb 2019

medline: 19 Mar 2019

Status: epublish

Abstract

Understanding genetic networks and their role in chronic diseases such as cancer is an important objective of biological research. In this work, we present a text mining system that constructs a gene-gene interaction network for the entire human genome and then performs network analysis to identify disease-related genes. We recognize interacting genes based on their co-occurrence frequency within the biomedical literature and by employing linear and non-linear rare-event classification models. We analyze the constructed gene network using different network centrality measures to decide on the importance of each gene. Specifically, we apply betweenness, closeness, eigenvector, and degree centrality metrics to rank the central genes of the network and to identify possible cancer-related genes. We evaluated the top 15 ranked genes for three cancer types (prostate, breast, and lung cancer). The average precision for identifying breast, prostate, and lung cancer genes ranges from 80% to 100%. In a prostate cancer case study, the system predicted an average of 80% prostate-related genes. The results show that our system has the potential to improve the prediction accuracy of identifying gene-gene interactions and disease-gene associations. We also conduct a prostate cancer case study using the threshold property in logistic regression, and we compare our approach with some of the state-of-the-art methods.
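The pipeline described in the abstract (co-occurrence counting, network construction, centrality ranking) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the gene names `G1` to `G5` are placeholders, the toy documents are invented, and the fixed cutoff `MIN_COOCCURRENCE` stands in for the rare-event classification models that the authors use to decide which gene pairs interact. Only degree and closeness centrality are shown.

```python
from collections import Counter, deque
from itertools import combinations

# Toy corpus: each set holds the gene mentions found in one abstract.
# Gene names are placeholders, not real data.
docs = [
    {"G1", "G2"},
    {"G1", "G2", "G3"},
    {"G1", "G3"},
    {"G1", "G4"},
    {"G1", "G4"},
    {"G4", "G5"},
    {"G4", "G5"},
    {"G2", "G5"},
]

# Hypothetical cutoff; the paper uses classification models instead.
MIN_COOCCURRENCE = 2


def build_network(documents, min_count):
    """Link two genes if they co-occur in at least min_count documents."""
    pair_counts = Counter()
    for genes in documents:
        for a, b in combinations(sorted(genes), 2):
            pair_counts[(a, b)] += 1
    graph = {}
    for (a, b), count in pair_counts.items():
        if count >= min_count:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Number of neighbours divided by the maximum possible (n - 1)."""
    n = len(graph)
    return {g: len(nbrs) / (n - 1) for g, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable nodes divided by the sum of shortest-path distances (BFS)."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def rank_genes(scores):
    """Genes sorted by descending score, ties broken alphabetically."""
    return sorted(scores, key=lambda g: (-scores[g], g))


network = build_network(docs, MIN_COOCCURRENCE)
degree = degree_centrality(network)
closeness = closeness_centrality(network)
print(rank_genes(degree))
print(rank_genes(closeness))
```

In this toy network the most connected placeholder gene ranks first under both measures. In a real setting the top-ranked genes would then be checked against known disease-gene annotations, as the abstract describes for the top 15 genes.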


Identifiers

DOI: 10.1186/s12859-019-2634-7

PMID: 30736752

PMC: PMC6368766

pii: 10.1186/s12859-019-2634-7

Publication types

Journal Article

Languages

English

Grants

Agency: AARE

ID: 843401



Authors

Amira Al-Aamri (A)

Kamal Taha (K)

Yousof Al-Hammadi (Y)

Maher Maalouf (M)

Dirar Homouz (D)
