Graph Enhanced Fuzzy Clustering for Categorical Data Using a Bayesian Dissimilarity Measure

Categorical data are widely available in many real-world applications, and discovering valuable patterns in such data by clustering is of great importance. However, categorical values lack a well-defined quantitative relationship, so traditional clustering approaches, which are usually developed for numerical data, perform poorly on categorical datasets. To solve this problem and improve clustering performance on categorical data, we propose a novel fuzzy clustering model in this article. First, by approximating the maximum a posteriori (MAP) estimation of a discrete distribution of data partition, a new fuzzy clustering objective function is designed for categorical data. A Bayesian dissimilarity measure is formulated in this objective to handle the subtle relationships between categorical values efficiently. Then, to further enhance clustering performance, a novel Kullback–Leibler (KL) divergence-based graph regularization is integrated into the clustering objective to exploit prior knowledge about the dataset, for example, information about correlations between data points. The proposed model is solved by alternating optimization, and experimental results on synthetic and real-world datasets show that it outperforms classical and relevant state-of-the-art algorithms. We also present a parameter analysis of our approach and conduct a comprehensive study of the effectiveness of the Bayesian dissimilarity measure and the KL divergence-based graph regularization.
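
As a rough illustration of what a KL divergence-based graph regularizer over fuzzy memberships could look like, the sketch below computes a weighted sum of KL divergences between the membership vectors of linked data points. This is a generic construction and not necessarily the formulation used in the article: the function names `kl_divergence` and `graph_kl_penalty`, the smoothing constant `eps`, and the toy membership and weight matrices are all assumptions made for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    eps is an assumed smoothing constant that avoids log(0).
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def graph_kl_penalty(memberships, weights):
    """Sum of weights[i][j] * KL(u_i || u_j) over all linked ordered pairs.

    memberships: one fuzzy membership vector per data point (each sums to 1).
    weights: nonnegative graph weights encoding prior pairwise correlations.
    """
    n = len(memberships)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and weights[i][j] > 0:
                total += weights[i][j] * kl_divergence(
                    memberships[i], memberships[j]
                )
    return total


# Toy data (hypothetical): points 0 and 1 have similar memberships,
# point 2 belongs mostly to the other cluster.
U = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
W_similar = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]     # links points 0 and 1
W_dissimilar = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]  # links points 0 and 2

penalty_similar = graph_kl_penalty(U, W_similar)
penalty_dissimilar = graph_kl_penalty(U, W_dissimilar)
```

In this sketch the penalty is small when linked points already share similar memberships and large when they do not, which is the sense in which such a term could push correlated points toward the same clusters.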

Keywords: fuzzy clustering; Bayesian dissimilarity; categorical data; dissimilarity measure

Journal Title: IEEE Transactions on Fuzzy Systems
Year Published: 2023
