Abstract

Naive Bayes assumes conditional independence among attributes, but this assumption rarely holds in real-world applications, so numerous attempts have been made to relax it. However, to the best of our knowledge, few studies have assigned different weights to different attribute values. In this study, we propose a new paradigm for a simple, efficient, and effective attribute value weighting approach, called the correlation-based attribute value weighting approach (CAVW), which assigns each attribute value a weight computed as the difference between the attribute value–class correlation (relevance) and the average attribute value–attribute value intercorrelation (average redundancy). CAVW uses information-theoretic measures with a strong theoretical foundation to assign different weights to different attribute values. Two attribute value weighting measures, mutual information (MI) and the Kullback–Leibler (KL) measure, are employed, yielding two versions, denoted CAVW-MI and CAVW-KL, respectively. Extensive empirical studies on a collection of 36 benchmark datasets from the University of California at Irvine repository show that CAVW-MI and CAVW-KL both obtained more satisfactory experimental results than the naive Bayes classifier and four other existing attribute weighting methods, while maintaining the simplicity of the original naive Bayes model.
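The relevance-minus-redundancy idea described above can be sketched roughly as follows. This is an illustrative simplification of the MI variant only, not the paper's exact estimator: the function names are hypothetical, the probabilities are plain empirical frequencies with no smoothing, and the paper's precise definitions and any normalization may differ.

```python
from math import log2

def mi_value_class(X, y, i, v):
    """Relevance: empirical MI contribution of attribute value A_i = v with the class."""
    n = len(y)
    p_v = sum(1 for row in X if row[i] == v) / n
    mi = 0.0
    for c in set(y):
        p_c = sum(1 for lab in y if lab == c) / n
        p_vc = sum(1 for row, lab in zip(X, y) if row[i] == v and lab == c) / n
        if p_vc > 0:
            mi += p_vc * log2(p_vc / (p_v * p_c))
    return mi

def mi_value_attr(X, i, v, j):
    """Redundancy term: empirical MI between attribute value A_i = v and attribute A_j."""
    n = len(X)
    p_v = sum(1 for row in X if row[i] == v) / n
    mi = 0.0
    for w in set(row[j] for row in X):
        p_w = sum(1 for row in X if row[j] == w) / n
        p_vw = sum(1 for row in X if row[i] == v and row[j] == w) / n
        if p_vw > 0:
            mi += p_vw * log2(p_vw / (p_v * p_w))
    return mi

def cavw_mi_weight(X, y, i, v):
    """Weight of attribute value A_i = v: relevance minus average redundancy
    over all other attributes (simplified CAVW-MI sketch)."""
    m = len(X[0])
    relevance = mi_value_class(X, y, i, v)
    others = [j for j in range(m) if j != i]
    avg_redundancy = sum(mi_value_attr(X, i, v, j) for j in others) / len(others)
    return relevance - avg_redundancy
```

On a toy dataset where attribute 0 determines the class and attribute 1 is independent noise, e.g. `X = [(0, 0), (0, 1), (1, 0), (1, 1)]`, `y = [0, 0, 1, 1]`, the values of attribute 0 receive a positive weight while those of attribute 1 receive a weight near zero, matching the intuition that relevant, non-redundant values should be weighted up.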