R-HEFS: Rough set based heterogeneous ensemble feature selection method for medical data classification.

Algorithms; Bayes Theorem; Cluster Analysis; Support Vector Machine

Classification; Ensemble feature selection; Medical data; Rough set; Stability

Journal

Artificial intelligence in medicine

ISSN: 1873-2860

Abbreviated title: Artif Intell Med

Country: Netherlands

NLM ID: 8915031

Publication information

Publication date:
04 2021

History:

received: 04 06 2020

revised: 11 02 2021

accepted: 21 02 2021

entrez: 20 4 2021

pubmed: 21 4 2021

medline: 19 8 2021

Status: ppublish

Abstract

Feature selection is a well-established dimensionality reduction technique that selects a subset of relevant, non-redundant features from large datasets. Ensemble feature selection (EFS) is a recent approach that exploits diversity among subsets of selected features to improve the performance of learning algorithms and to obtain more stable and robust results. This paper proposes a novel rough set theory (RST) based heterogeneous EFS method (R-HEFS), which selects highly relevant and less redundant features while aggregating diverse feature subsets, using feature-class rough dependency, feature-feature rough dependency, and feature-significance measures. R-HEFS uses five state-of-the-art RST based filter methods as base feature selectors. Experiments are carried out on 10 benchmark medical datasets from the UCI repository. Missing values are imputed with the k nearest neighbor (kNN) imputation method, and continuous features are discretized with RST based discretization techniques. The effectiveness of R-HEFS is evaluated with four benchmark classifiers: Naïve Bayes (NB), random forest (RF), support vector machine (SVM), and AdaBoost. R-HEFS proves effective at removing irrelevant and redundant features while aggregating the outputs of the base feature selectors, which helps to increase classification accuracy. On 7 of the 10 medical datasets, R-HEFS achieves better average classification accuracy. Overall, the results strongly suggest that R-HEFS can reduce the dimensionality of large medical datasets and may help physicians or medical experts diagnose (classify) different diseases with lower computational complexity.
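The abstract names rough dependency and feature-significance measures without giving their formulas. As a minimal sketch, assuming already-discretized features, the textbook rough set definitions can be written as below: the dependency degree is the fraction of records whose equivalence class (under the chosen features) maps to a single class label, and the significance of a feature is the drop in dependency when it is removed. The function names and the toy data are illustrative assumptions, not taken from the paper, and this is not the authors' aggregation procedure.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes that agree on every attribute in attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, labels, attrs):
    """Rough dependency degree: |positive region| / |universe|."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, attrs):
        # A block belongs to the positive region if all its rows share one label.
        if len({labels[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)

def significance(rows, labels, attrs, a):
    """Drop in dependency degree when attribute a is removed from attrs."""
    reduced = [b for b in attrs if b != a]
    return dependency(rows, labels, attrs) - dependency(rows, labels, reduced)

# Hypothetical toy data: three discretized features, label = XOR of features 0 and 1.
rows = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
labels = ["neg", "pos", "pos", "neg", "neg"]

if __name__ == "__main__":
    print(dependency(rows, labels, [0, 1]))          # features 0 and 1 together
    print(dependency(rows, labels, [0]))             # feature 0 alone
    print(significance(rows, labels, [0, 1, 2], 1))  # feature 1 within the full set
    print(significance(rows, labels, [0, 1, 2], 2))  # feature 2 within the full set
```

In this toy table, feature 2 mirrors feature 0, so removing it from the full set loses no dependency, which is the kind of redundancy a significance measure is meant to expose.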

Identifiers

DOI: 10.1016/j.artmed.2021.102049

PMID: 33875164

PII: S0933-3657(21)00042-7

Publication types

Journal Article

Languages

eng

Citation subsets

Pagination

102049

Authors

Rubul Kumar Bania (RK)

Anindya Halder (A)
