Hashing-Based Undersampling Ensemble for Imbalanced Pattern Classification Problems.
Journal
IEEE transactions on cybernetics
ISSN: 2168-2275
Abbreviated title: IEEE Trans Cybern
Country: United States
NLM ID: 101609393
Publication information
Publication date:
Feb 2022
History:
pubmed: 1 Jul 2020
medline: 19 Feb 2022
entrez: 30 Jun 2020
Status: ppublish
Abstract
Undersampling is a popular way to handle imbalanced classification problems. However, it can remove too many majority-class samples and thereby discard informative ones. This article proposes the hashing-based undersampling ensemble (HUE), which addresses this problem by constructing diversified training subspaces for undersampling. Samples in the majority class are divided into many subspaces by a hashing method. Each subspace corresponds to a training subset consisting mostly of samples from that subspace plus a few samples from surrounding subspaces. These training subsets, each combined with all minority-class samples, are used to train an ensemble of classification and regression tree (CART) classifiers. The proposed method is tested on 25 UCI datasets against state-of-the-art methods. Experimental results show that HUE outperforms the other methods and yields good results on highly imbalanced datasets.
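The subset-construction step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the hashing method, so a sign-based random-projection hash is swapped in, "surrounding" subspaces are taken to be those at small Hamming distance, and the names `hash_code`, `build_training_subsets`, `n_bits`, and `decay` are hypothetical. Each subset draws as many majority samples as there are minority samples, which is also an assumption.

```python
import random


def hash_code(x, planes):
    """Sign-based random-projection hash (an assumption): one bit per hyperplane."""
    return tuple(int(sum(w * v for w, v in zip(p, x)) >= 0.0) for p in planes)


def hamming(a, b):
    """Number of differing bits between two hash codes."""
    return sum(i != j for i, j in zip(a, b))


def build_training_subsets(majority, minority, n_bits=2, decay=0.25, seed=0):
    """Return {hash code: (X, y)}, one balanced training subset per occupied subspace.

    Majority samples in the target subspace get weight 1; samples at Hamming
    distance d get weight decay**d (hypothetical weighting), so most draws come
    from the target subspace and a few from nearby ones.
    """
    rng = random.Random(seed)
    dim = len(majority[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    codes = [hash_code(x, planes) for x in majority]
    k = min(len(minority), len(majority))  # majority samples drawn per subset

    subsets = {}
    for target in sorted(set(codes)):
        weights = [decay ** hamming(c, target) for c in codes]
        # Weighted sampling without replacement (Efraimidis-Spirakis keys):
        # keep the k indices with the largest u ** (1 / w).
        keys = [(rng.random() ** (1.0 / w), i) for i, w in enumerate(weights)]
        chosen = [i for _, i in sorted(keys, reverse=True)[:k]]
        X = [majority[i] for i in chosen] + list(minority)
        y = [0] * k + [1] * len(minority)  # 0 = majority, 1 = minority
        subsets[target] = (X, y)
    return subsets
```

Each `(X, y)` pair would then be passed to a decision-tree learner, and the resulting trees combined by voting or averaging; that training step is omitted here to keep the sketch dependency-free.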
Identifiers
pubmed: 32598288
doi: 10.1109/TCYB.2020.3000754
Publication types
Journal Article
Languages
eng
Citation subsets
IM