Genome-wide prediction and prioritization of human aging genes by data fusion: a machine learning approach.
Genome-wide
Human aging genes
Machine learning
Positive unlabeled learning
Prioritization
Journal
BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258
Publication information
Publication date:
09 Nov 2019
History:
received: 22 Mar 2019
accepted: 25 Sep 2019
entrez: 11 Nov 2019
pubmed: 11 Nov 2019
medline: 18 Mar 2020
Status:
epublish
Abstract
Machine learning can effectively nominate novel genes for various research purposes in the laboratory. On a genome-wide scale, we applied multiple databases and algorithms to predict and prioritize human aging genes (PPHAGE). We fused data from 11 databases and used a Naïve Bayes classifier and positive unlabeled learning (PUL) methods (NB, Spy, and Rocchio-SVM) to rank human genes with respect to their implication in aging. The PUL methods enabled us to identify a list of negative (non-aging) genes to use alongside the seed (known age-related) genes in the ranking process. Comparison of the PUL algorithms revealed that no single method for identifying a negative sample was advantageous over the others, and that their simultaneous use in a fused form was critical for obtaining optimal results (PPHAGE is publicly available at https://cbb.ut.ac.ir/pphage). We predict and prioritize over 3,000 candidate age-related genes in humans, based on significant ranking scores. The identified candidate genes are associated with pathways, ontologies, and diseases that are linked to aging, such as cancer and diabetes. Our data offer a platform for future experimental research on the genetic and biological aspects of aging. Additionally, we demonstrate that fusion of PUL methods and data sources can be successfully used for aging and disease candidate gene prioritization.
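To illustrate how a PUL method can extract a list of likely negatives from unlabeled genes, the following is a minimal sketch of the general Spy idea: a few known positives ("spies") are hidden among the unlabeled items, and unlabeled items that score below every spy are treated as reliable negatives. It is not the PPHAGE implementation. The centroid-distance scorer stands in for a real classifier, and the feature vectors, function names, `spy_fraction`, and the strict minimum-score threshold are all assumptions made for illustration.

```python
import math
import random


def centroid(points):
    """Mean vector of a non-empty list of equal-length feature vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def score(x, pos_center, unl_center):
    """Toy scorer (assumption): higher means closer to the positive centroid."""
    return math.dist(x, unl_center) - math.dist(x, pos_center)


def spy_reliable_negatives(positives, unlabeled, spy_fraction=0.2, seed=0):
    """Return indices of unlabeled items that score below every spy."""
    if len(positives) < 2:
        raise ValueError("need at least two positives to hold out a spy")
    rng = random.Random(seed)
    n_spies = min(len(positives) - 1,
                  max(1, int(len(positives) * spy_fraction)))
    spy_idx = set(rng.sample(range(len(positives)), n_spies))
    spies = [p for i, p in enumerate(positives) if i in spy_idx]
    kept = [p for i, p in enumerate(positives) if i not in spy_idx]

    pos_center = centroid(kept)
    # Spies are mixed into the unlabeled set before "training" the scorer.
    unl_center = centroid(unlabeled + spies)

    # Strict threshold (assumption): the lowest score seen among the spies.
    threshold = min(score(s, pos_center, unl_center) for s in spies)
    return [i for i, u in enumerate(unlabeled)
            if score(u, pos_center, unl_center) < threshold]


# Hypothetical two-feature vectors: seed (positive) genes cluster near (1, 1).
positives = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, 1.2], [0.8, 0.9]]
# Unlabeled genes: indices 1, 2, and 4 lie far from the seed cluster.
unlabeled = [[1.0, 1.1], [-1.0, -1.0], [-1.2, -0.8], [0.9, 0.9], [-0.9, -1.1]]

negatives = spy_reliable_negatives(positives, unlabeled)
print(negatives)
```

With a single spy and a strict threshold, some unlabeled items that resemble the seeds may also fall below the cutoff, so the result depends on which spy is drawn. That sensitivity is one reason combining the negative lists from several methods, as the abstract describes, can be more robust than relying on any one of them.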
Identifiers
pubmed: 31706268
doi: 10.1186/s12864-019-6140-0
pii: 10.1186/s12864-019-6140-0
pmc: PMC6842548
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
832