"Cluster analysis of mixed data based on Feature Space Instance Cluster Closeness Metric"

Abstract: Clustering algorithms have recently attracted considerable attention in machine learning applications due to their high efficiency. However, existing clustering methods still have unresolved issues. For example, most existing methods convert one type of feature into another, which ignores the type-specific properties of the data. In addition, most of them use the entire feature set, which increases computational complexity and can result in sub-optimal performance. To address these problems, this paper proposes a novel method for clustering data with categorical and numerical features based on the Feature Space Instance Cluster Closeness Metric (FSICCM). In the first stage, given training data, FSICCM applies a similarity metric to the numerical features. In the second stage, we design a novel metric for the categorical features. We also design a new learning algorithm to cluster mixed datasets. Extensive experimental results on benchmark datasets show that the proposed FSICCM outperforms several state-of-the-art clustering methods in terms of accuracy and efficiency.
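The abstract does not give the FSICCM formulas, so the following is only a minimal sketch of the general idea of scoring numerical and categorical features separately instead of converting one type into the other. It uses a Gower-style distance as a stand-in, not the paper's metric; the function name `mixed_distance` and its parameters are hypothetical.

```python
from typing import Dict, Sequence, Union

Value = Union[float, str]


def mixed_distance(
    a: Sequence[Value],
    b: Sequence[Value],
    numeric_idx: Sequence[int],
    categorical_idx: Sequence[int],
    ranges: Dict[int, float],
) -> float:
    """Gower-style distance between two mixed-type instances.

    Assumption: this is a generic stand-in, NOT the FSICCM metric.
    Numerical features contribute a range-normalized absolute difference;
    categorical features contribute 0 on a match and 1 on a mismatch.
    """
    n_features = len(numeric_idx) + len(categorical_idx)
    if n_features == 0:
        raise ValueError("at least one feature index is required")

    total = 0.0
    for i in numeric_idx:
        span = ranges[i]  # max - min of feature i over the dataset
        # A constant feature (span == 0) carries no information.
        total += abs(a[i] - b[i]) / span if span > 0 else 0.0
    for i in categorical_idx:
        total += 0.0 if a[i] == b[i] else 1.0

    # Average so the result stays in [0, 1] when values lie within the ranges.
    return total / n_features
```

For instance, with one numerical feature whose range is 4.0 and one categorical feature, the instances `(1.0, "red")` and `(3.0, "blue")` differ by 0.5 on the numerical part and 1 on the categorical part, so `mixed_distance` returns their average.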

Keywords: cluster closeness; feature space; space instance; instance cluster; based feature; cluster

Journal Title: Chemometrics and Intelligent Laboratory Systems
Year Published: 2021
