ExhauFS: exhaustive search-based feature selection for classification and survival regression.
Classification
ExhauFS
Exhaustive search
Feature selection
Survival regression
Journal
PeerJ
ISSN: 2167-8359
Abbreviated title: PeerJ
Country: United States
NLM ID: 101603425
Publication information
Publication date: 2022
History:
received: 2022-01-20
accepted: 2022-03-09
entrez: 2022-04-05
pubmed: 2022-04-06
medline: 2022-04-06
Status:
epublish
Abstract
Feature selection is one of the main techniques used to prevent overfitting in machine learning applications. The most straightforward approach to feature selection is an exhaustive search: one can go over all possible feature combinations and pick the model with the highest accuracy. This method, together with its optimizations, has been actively used in biomedical research; however, a publicly available implementation is missing. We present ExhauFS, a user-friendly command-line implementation of the exhaustive search approach for classification and survival regression. Aside from the tool description, we included three application examples in the manuscript to comprehensively review the implemented functionality. First, we executed ExhauFS on a toy cervical cancer dataset to illustrate basic concepts. Then, multi-cohort microarray breast cancer datasets were used to construct gene signatures for 5-year recurrence classification. The vast majority of signatures constructed by ExhauFS passed the 0.65 threshold of sensitivity and specificity on all datasets, including the validation one. Moreover, a number of gene signatures demonstrated reliable performance on an independent RNA-seq dataset without any coefficient re-tuning,
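The exhaustive search idea described in the abstract can be illustrated with a minimal sketch. This is not ExhauFS code and does not reproduce its command-line interface: the function names, the toy data, and the nearest-centroid classifier are all hypothetical stand-ins chosen so that the example runs with the Python standard library alone. It enumerates every feature subset of size k, fits a model on the training rows, and keeps the subsets whose sensitivity and specificity on a validation set both reach a threshold (0.65, the value quoted in the abstract).

```python
from itertools import combinations


def nearest_centroid_predict(train_X, train_y, test_X, features):
    """Label each test row with the class whose centroid is closest,
    using only the selected feature indices (stand-in classifier)."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
    predictions = []
    for x in test_X:
        # Squared Euclidean distance to each centroid; sorted() makes ties deterministic.
        best = min(
            sorted(centroids),
            key=lambda lab: sum((x[f] - c) ** 2 for f, c in zip(features, centroids[lab])),
        )
        predictions.append(best)
    return predictions


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def exhaustive_search(train_X, train_y, val_X, val_y, k, threshold=0.65):
    """Try every k-feature combination; keep those passing the threshold
    on both sensitivity and specificity, best (by the lower of the two) first."""
    n_features = len(train_X[0])
    passed = []
    for features in combinations(range(n_features), k):
        preds = nearest_centroid_predict(train_X, train_y, val_X, features)
        sens, spec = sensitivity_specificity(val_y, preds)
        if min(sens, spec) >= threshold:
            passed.append((features, sens, spec))
    passed.sort(key=lambda item: min(item[1], item[2]), reverse=True)
    return passed


# Toy data (invented for illustration): features 0 and 1 separate the classes,
# feature 2 is uninformative.
train_X = [[0.1, 0.2, 0.9], [0.2, 0.1, 0.1], [0.0, 0.3, 0.5],
           [0.9, 0.8, 0.8], [0.8, 0.9, 0.2], [1.0, 0.7, 0.4]]
train_y = [0, 0, 0, 1, 1, 1]
val_X = [[0.15, 0.25, 0.7], [0.05, 0.15, 0.3],
         [0.85, 0.75, 0.6], [0.95, 0.85, 0.1]]
val_y = [0, 0, 1, 1]

results_k1 = exhaustive_search(train_X, train_y, val_X, val_y, k=1)
results_k2 = exhaustive_search(train_X, train_y, val_X, val_y, k=2)
for features, sens, spec in results_k1:
    print(features, sens, spec)
```

On this toy data the single-feature search keeps features 0 and 1 and rejects the uninformative feature 2. The number of candidate subsets is the binomial coefficient C(n, k), so the loop grows quickly with the number of features n and the subset size k.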
Identifiers
pubmed: 35378930
doi: 10.7717/peerj.13200
pii: 13200
pmc: PMC8976470
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Pagination
e13200
Copyright information
©2022 Nersisyan et al.
Conflict of interest statement
The authors declare there are no competing interests.