Abstract: Clustering algorithms have recently attracted considerable attention in machine learning applications due to their high efficiency. However, existing clustering methods still have unresolved issues. For example, most existing methods convert one type of feature into another, which ignores the specific properties of the data. In addition, most of them use the entire feature set, which increases computational complexity and can result in sub-optimal performance. To address these problems, this paper proposes a novel method for clustering data with categorical and numerical features, based on the Feature Space Instance Cluster Closeness Metric (FSICCM). In the first stage, given training data, FSICCM exploits a similarity metric for the numerical features. In the second stage, we design a novel metric for the categorical features. We also design a new learning algorithm to cluster mixed datasets. Extensive experimental results on benchmark datasets show that the proposed FSICCM outperforms several state-of-the-art clustering methods in terms of accuracy and efficiency.
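The abstract does not define the FSICCM metrics, so the following is not the proposed method. It is only a minimal sketch of the general idea of scoring numerical and categorical features separately rather than converting one type into the other. It assumes a k-prototypes-style dissimilarity (squared Euclidean distance on numerical features plus a weighted mismatch count on categorical features). The names `mixed_dissimilarity`, `assign_to_nearest`, and the weight `gamma` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple


def mixed_dissimilarity(
    num_a: Sequence[float],
    num_b: Sequence[float],
    cat_a: Sequence[str],
    cat_b: Sequence[str],
    gamma: float = 1.0,
) -> float:
    """Illustrative mixed-type dissimilarity (an assumption, not FSICCM).

    Numerical features contribute squared Euclidean distance;
    categorical features contribute one unit per mismatched value,
    scaled by the hypothetical weight `gamma`.
    """
    numeric_part = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    categorical_part = sum(1 for x, y in zip(cat_a, cat_b) if x != y)
    return numeric_part + gamma * categorical_part


def assign_to_nearest(
    point_num: Sequence[float],
    point_cat: Sequence[str],
    prototypes: List[Tuple[Sequence[float], Sequence[str]]],
    gamma: float = 1.0,
) -> int:
    """Return the index of the prototype closest to the given point."""
    distances = [
        mixed_dissimilarity(point_num, proto_num, point_cat, proto_cat, gamma)
        for proto_num, proto_cat in prototypes
    ]
    return min(range(len(distances)), key=distances.__getitem__)
```

In a sketch like this, `gamma` controls how much a categorical mismatch counts relative to numerical distance; how FSICCM balances the two feature types is not stated in the abstract.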
               