Identification of tissue-specific tumor biomarker using different optimization algorithms.
Biomarker
Machine learning tools
Messenger RNA
Optimization algorithm
Pathway analysis
Journal
Genes & genomics
ISSN: 2092-9293
Abbreviated title: Genes Genomics
Country: Korea (South)
NLM ID: 101481027
Publication information
Publication date:
04 2019
History:
received: 2018-04-16
accepted: 2018-12-03
pubmed: 2018-12-12
medline: 2019-05-21
entrez: 2018-12-12
Status:
ppublish
Abstract
BACKGROUND: Identifying differentially expressed genes, i.e., genes whose transcript abundance differs across biological or physiological conditions, has long been a challenging task. The advent of transcriptome sequencing (RNA-seq) technology, however, made it possible to measure transcript abundance for thousands of genes simultaneously.
OBJECTIVE: In this paper, such next-generation sequencing (NGS) data are used to identify biomarker signatures for several of the most common cancer types (bladder, colon, kidney, brain, liver, lung, prostate, skin, and thyroid).
METHODS: The problem is framed as a comparison of optimization algorithms for selecting the set of genes that yields the highest accuracy in a two-class classification task between healthy and tumor samples. Four optimization algorithms are compared: Artificial Bee Colony (ABC), Ant Colony Optimization, Differential Evolution, and Particle Swarm Optimization. A standard statistical method, DESeq2, is used to select differentially expressed genes before they are fed to the optimization algorithms. Healthy and tumor samples are classified with a support vector machine.
RESULTS: Cancer-specific validation yields high classification accuracy. The highest accuracy, 99.10%, is achieved by the ABC algorithm on the brain lower grade glioma data. This validation is supported by a statistical test, gene ontology enrichment analysis, and KEGG pathway enrichment analysis for each cancer biomarker signature.
CONCLUSION: The study identified robust genes as biomarker signatures, and these biomarkers might help to accurately identify tumors of unknown origin.
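The wrapper idea in the methods (an optimizer proposes gene subsets, a classifier scores them) can be illustrated with a minimal sketch. Everything below is an assumption for illustration and not the authors' implementation: the data are synthetic, the DESeq2 pre-filtering step is omitted, a leave-one-out nearest-centroid classifier stands in for the support vector machine so that the example needs only the Python standard library, and the binary particle swarm parameters (inertia 0.7, acceleration 1.5, size penalty 0.01) are arbitrary. The function names are hypothetical.

```python
import math
import random


def make_toy_data(n_per_class=15, n_genes=12, informative=(0, 3, 7), seed=0):
    """Synthetic expression matrix; informative genes are shifted in tumor samples."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):  # 0 = healthy, 1 = tumor
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for g in informative:
                    row[g] += 3.0
            X.append(row)
            y.append(label)
    return X, y


def loo_centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier (SVM stand-in)."""
    genes = [g for g, keep in enumerate(mask) if keep]
    if not genes:
        return 0.0
    totals = {0: [0.0] * len(genes), 1: [0.0] * len(genes)}
    counts = {0: 0, 1: 0}
    for row, label in zip(X, y):
        counts[label] += 1
        for k, g in enumerate(genes):
            totals[label][k] += row[g]
    correct = 0
    for row, label in zip(X, y):
        dist = {}
        for c in (0, 1):
            own = 1 if c == label else 0  # hold the test sample out of its own class
            n = counts[c] - own
            dist[c] = sum(
                (row[g] - (totals[c][k] - own * row[g]) / n) ** 2
                for k, g in enumerate(genes)
            )
        correct += (0 if dist[0] <= dist[1] else 1) == label
    return correct / len(X)


def fitness(X, y, mask, penalty=0.01):
    """Accuracy minus a small penalty per selected gene (favors compact signatures)."""
    return loo_centroid_accuracy(X, y, mask) - penalty * sum(mask)


def binary_pso(X, y, n_particles=10, n_iter=15, seed=1):
    """Binary particle swarm search over gene-selection masks (0/1 per gene)."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    pos = [[1 if rng.random() < 0.5 else 0 for _ in range(n_genes)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_genes for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(X, y, p) for p in pos]
    lead = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[lead][:], pbest_fit[lead]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_genes):
                v = (0.7 * vel[i][d]
                     + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                     + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-4.0, min(4.0, v))
                # The sigmoid of the velocity is the probability of selecting the gene.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(X, y, pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


X, y = make_toy_data()
mask, fit = binary_pso(X, y)
print("selected genes:", [g for g, keep in enumerate(mask) if keep])
print("penalized fitness:", round(fit, 3))
```

The same fitness function could be handed to any of the other optimizers named in the abstract, which is what makes a like-for-like comparison of the algorithms possible. In a real analysis the fitness would be estimated on held-out data rather than on the samples used for selection.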
Identifiers
pubmed: 30535858
doi: 10.1007/s13258-018-0773-2
pii: 10.1007/s13258-018-0773-2
Chemical substances
Biomarkers, Tumor
0
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
431-443