Performance comparison of linear and non-linear feature selection methods for the analysis of large survey datasets.
Journal
PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081
Publication information
Publication date:
2019
History:
received: 2018-02-22
accepted: 2019-02-25
entrez: 2019-03-22
pubmed: 2019-03-22
medline: 2019-12-04
Status:
epublish
Abstract
Large survey databases for aging-related analysis are often examined to discover key factors that affect a dependent variable of interest. Typically, this analysis is performed with methods that assume linear dependencies between variables. Such assumptions, however, do not hold in many cases, in which the data are linked by non-linear dependencies. This in turn calls for analytic methods that are more accurate in identifying potentially non-linear dependencies. Here, we objectively compared the feature selection performance of several frequently used linear selection methods and three non-linear selection methods in the context of large survey data. These methods were assessed using both synthetic and real-world datasets in which the relationships between the features and the dependent variables were known in advance. We found that the non-linear methods offered better overall feature selection performance than the linear methods in all usage conditions. Moreover, the performance of the non-linear methods was more stable, being unaffected by the inclusion or exclusion of variables from the datasets. These properties make non-linear feature selection methods a potentially preferable tool for both hypothesis-driven and exploratory analyses of aging-related datasets.
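The contrast described in the abstract can be illustrated with a minimal sketch. The abstract does not name the specific methods compared, so the two scores below are stand-ins chosen for illustration only: the absolute Pearson correlation as a linear relevance score and a binned correlation ratio (eta squared) as a simple non-linear one. The synthetic data, feature names, and function names are all hypothetical and are not taken from the paper.

```python
import random
import statistics


def make_synthetic(n=4000, seed=0):
    """Synthetic data with a known structure (an assumption for illustration):
    y depends linearly on x_linear, quadratically on x_quadratic,
    and not at all on x_irrelevant."""
    rng = random.Random(seed)
    features = {
        "x_linear": [rng.uniform(-1, 1) for _ in range(n)],
        "x_quadratic": [rng.uniform(-1, 1) for _ in range(n)],
        "x_irrelevant": [rng.uniform(-1, 1) for _ in range(n)],
    }
    y = [
        0.5 * a + b * b + rng.gauss(0, 0.1)
        for a, b in zip(features["x_linear"], features["x_quadratic"])
    ]
    return features, y


def pearson_abs(x, y):
    """Linear score: absolute Pearson correlation between x and y."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return abs(cov / (sx * sy))


def correlation_ratio(x, y, bins=10):
    """Non-linear score: share of the variance of y explained by the
    mean of y within equal-width bins of x (eta squared)."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins
    groups = [[] for _ in range(bins)]
    for a, b in zip(x, y):
        idx = min(int((a - lo) / width), bins - 1)
        groups[idx].append(b)
    my = statistics.fmean(y)
    between = sum(len(g) * (statistics.fmean(g) - my) ** 2 for g in groups if g)
    total = sum((b - my) ** 2 for b in y)
    return between / total


features, y = make_synthetic()
scores = {
    name: {"linear": pearson_abs(x, y), "nonlinear": correlation_ratio(x, y)}
    for name, x in features.items()
}
for name, s in scores.items():
    print(f"{name:13s} linear={s['linear']:.3f} nonlinear={s['nonlinear']:.3f}")
```

In this toy setting the linear score gives the quadratic feature a value near zero, so a linear selector would tend to discard it, while the binned score still flags it as relevant. This only shows the general mechanism and says nothing about the magnitude of the differences reported in the study.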
Identifiers
pubmed: 30897097
doi: 10.1371/journal.pone.0213584
pii: PONE-D-18-05871
pmc: PMC6428288
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
e0213584
Conflict of interest statement
The authors have declared that no competing interests exist.