The Evidential K-Nearest Neighbor (EK-NN) classification rule provides a global treatment of uncertainty and imprecision in class labels and has been widely used in pattern recognition. Nevertheless, EK-NN still relies on a fixed, globally preset hyper-parameter K chosen without prior knowledge, even though the neighbors of each pattern are distributed differently in Euclidean space. As a result, the neighbors of some patterns may provide confusing evidence and lead to wrong classifications. To address this issue, we propose a sparse reconstructive evidential K-NN (SEK-NN) classifier that determines an individual K for each pattern and maps the correlations between patterns from Euclidean space into a sparse reconstructed space. To match this sparse reconstructed space, SEK-NN replaces Euclidean distance with correlation coefficients to measure the dissimilarity between patterns. For high-dimensional data, a parallel version of SEK-NN is implemented on Apache Spark to speed up parameter estimation. We test SEK-NN and parallel SEK-NN on 19 medium-dimensional datasets, 1 medium-volume dataset, and 4 high-dimensional datasets with up to 100,000 dimensions. Experimental results show that SEK-NN achieves strong predictive performance and that parallel SEK-NN handles high-dimensional datasets effectively.
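As a rough illustration of the idea (not the paper's exact algorithm), the sketch below reconstructs a query pattern as a sparse combination of the training samples via a simple coordinate-descent Lasso; the nonzero coefficients then pick an individual neighbor set, so K adapts to each pattern. The correlation-based dissimilarity d = (1 - r)/2 and the parameter values (`lam`, `alpha`, `gamma`) are assumptions for the sketch, and each neighbor's evidence is fused with Dempster's rule over simple mass functions as in the classical EK-NN:

```python
import numpy as np

def lasso_cd(A, y, lam, n_iter=300):
    """Coordinate-descent Lasso: min 0.5*||y - A w||^2 + lam*||w||_1."""
    n, p = A.shape
    w = np.zeros(p)
    col_sq = (A * A).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # partial residual excluding column j, then soft-threshold
            r = y - A @ w + A[:, j] * w[j]
            rho = A[:, j] @ r
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return w

def sek_nn_predict(X, y, query, n_classes, lam=0.1, alpha=0.95, gamma=1.0):
    """Toy SEK-NN-style prediction (illustrative parameter values)."""
    # 1. Sparse reconstruction: nonzero coefficients define an
    #    individual neighbor set, i.e. a per-pattern K.
    w = lasso_cd(X.T, query, lam)          # columns of X.T = training samples
    support = np.flatnonzero(np.abs(w) > 1e-8)
    if support.size == 0:                  # fallback: single most-correlated sample
        support = np.array([np.argmax([np.corrcoef(query, xi)[0, 1] for xi in X])])
    # 2. Evidential combination of the selected neighbors.
    bba = np.zeros(n_classes)              # masses on singleton classes
    omega = 1.0                            # mass on the full frame (ignorance)
    for i in support:
        r = np.corrcoef(query, X[i])[0, 1]
        d = (1.0 - r) / 2.0                # assumed correlation dissimilarity in [0, 1]
        s = alpha * np.exp(-gamma * d)     # singleton mass for class y[i]
        c = y[i]
        # Dempster's rule with the simple bba ({c}: s, Omega: 1 - s)
        conflict = s * (bba.sum() - bba[c])
        new = bba * (1.0 - s)
        new[c] += s * (omega + bba[c])
        norm = 1.0 - conflict
        bba, omega = new / norm, omega * (1.0 - s) / norm
    # 3. Decide: spread the residual ignorance equally over classes.
    return int(np.argmax(bba + omega / n_classes))
```

On a toy two-class problem where the classes follow distinct feature profiles, the sparse reconstruction selects neighbors from the matching class and the fused evidence yields the expected label; on Spark, the per-pattern reconstructions are independent and can be parallelized straightforwardly.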