ABSTRACT Learning a classifier from imbalanced data is a challenging problem in machine learning. A dataset is said to be imbalanced when the number of instances belonging to one class is much smaller than the number of instances belonging to the other class. Classifiers that prove efficient on standard data fail when the data is imbalanced, because they are overtrained on the majority-class instances. Since class imbalance is a common characteristic of real-world data, better classifiers are needed. This paper proposes a novel instance-based classification algorithm called Weighted Pattern Matching based Classification (PMC+) for classifying imbalanced data. PMC+ classifies unlabelled instances by computing the absolute difference between the feature values of the instances in the dataset and those of the unlabelled instance. PMC+ employs a simple classification procedure with weights and shows reasonably good performance. To improve the performance of PMC+, a Fireworks-based Feature and Weight Selection algorithm built on the idea of PMC+ is also proposed. PMC+ is evaluated on 44 binary imbalanced datasets and 15 multiclass imbalanced datasets. Although PMC+ does not employ a resampling or cost-sensitive method, experiments show that it is effective for the classification of imbalanced data. The experimental results were validated using various non-parametric statistical tests.
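
The abstract describes only the core idea of PMC+: compare an unlabelled instance with the stored instances through weighted absolute differences of feature values. As a rough illustration of that idea, and not the authors' algorithm, the following Python sketch averages the weighted absolute differences per class and predicts the class with the smallest average. The function name `weighted_abs_diff_classify`, the per-class averaging rule, and the toy data and weights are all assumptions made for this example; the abstract does not state the actual decision rule or how the weights are chosen.

```python
from collections import defaultdict


def weighted_abs_diff_classify(train_X, train_y, query, weights):
    """Return the label whose training instances are, on average, closest to
    `query` under a weighted sum of absolute feature differences.

    Hypothetical decision rule: the per-class mean is an assumption for
    illustration, not the rule used by PMC+.
    """
    totals = defaultdict(float)  # summed weighted differences per class
    counts = defaultdict(int)    # number of training instances per class
    for features, label in zip(train_X, train_y):
        diff = sum(w * abs(a - b) for w, a, b in zip(weights, features, query))
        totals[label] += diff
        counts[label] += 1
    # Averaging per class keeps a large class from winning by size alone.
    return min(totals, key=lambda label: totals[label] / counts[label])


# Toy imbalanced data: three majority instances, one minority instance.
train_X = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [5.0, 6.0]]
train_y = ["majority", "majority", "majority", "minority"]
weights = [1.0, 0.5]  # assumed feature weights

near_minority = weighted_abs_diff_classify(train_X, train_y, [4.8, 5.5], weights)
near_majority = weighted_abs_diff_classify(train_X, train_y, [1.1, 2.0], weights)
```

In this toy setup the first query lies next to the single minority instance and the second lies inside the majority cluster, so the two calls return different labels even though no resampling or cost-sensitive adjustment is applied.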