DNA microarray data analysis is notoriously difficult because of the massive number of features, imbalanced class distributions, and limited number of available samples. In this paper, we focus on high-dimensional multi-class imbalanced problems, which make it hard for conventional classifiers to perform well on both the minority and majority classes. Numerous efforts have addressed either high dimensionality or class imbalance separately. Nonetheless, few methods address multi-class imbalance and high dimensionality together, because the two problems interact in intricate ways. This paper presents novel hybrid feature selection algorithms for high-dimensional multi-class imbalanced data, built on multiple filter-based rankers (MFR) and a hybrid Grasshopper Optimization Algorithm (GOA). The Simulated Annealing (SA) algorithm is incorporated into GOA to refine the best solution that GOA finds. SA is intended to counter GOA's slow convergence and to improve exploitation by searching the high-quality regions that GOA identifies. The experimental results confirm that the proposed methods improve classification performance, measured by area under the curve (AUC), compared to other well-known methods, indicating that they can search the feature space and identify robust, discriminative features that best predict the minority class.
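
The SA refinement step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the binary feature mask, the single-bit-flip neighborhood, the geometric cooling schedule, and the names `simulated_annealing_refine` and `toy_fitness` are all assumptions made for illustration, and the toy fitness function merely stands in for a classifier's AUC on the selected features.

```python
import math
import random


def simulated_annealing_refine(mask, fitness, t0=1.0, cooling=0.95, steps=200, seed=0):
    """Refine a binary feature mask by simulated annealing (maximizes fitness).

    Assumed setup: `mask` is a list of 0/1 flags, e.g. the best solution
    returned by a population-based search such as GOA.
    """
    rng = random.Random(seed)
    current = list(mask)
    current_fit = fitness(current)
    best, best_fit = list(current), current_fit
    temp = t0
    for _ in range(steps):
        # Neighbor: flip one randomly chosen feature in or out of the subset.
        neighbor = list(current)
        i = rng.randrange(len(neighbor))
        neighbor[i] = 1 - neighbor[i]
        neighbor_fit = fitness(neighbor)
        delta = neighbor_fit - current_fit
        # Always accept improvements; accept worse moves with probability
        # exp(delta / temp), which shrinks as the temperature cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_fit = neighbor, neighbor_fit
            if current_fit > best_fit:
                best, best_fit = list(current), current_fit
        temp *= cooling  # geometric cooling schedule (an assumption)
    return best, best_fit


# Hypothetical stand-in for AUC: reward three "informative" features and
# penalize subset size slightly.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i] == 1)
    return hits / len(INFORMATIVE) - 0.01 * sum(mask)


start = [1] * 10  # pretend this mask came from the global search
refined, refined_fit = simulated_annealing_refine(start, toy_fitness)
```

Because the routine returns the best mask seen rather than the final one, the refined solution is never worse than the starting mask under the chosen fitness function. In practice the fitness call would wrap a cross-validated classifier, which makes each step far more expensive than in this toy setting.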
               