
An intelligent clustering algorithm for high-dimensional multiview data in big data applications


Abstract: High-dimensional multiview data are common in big data applications. Classic clustering algorithms, which treat all features as equally relevant, handle such data poorly. To tackle this problem, this paper proposes a novel intelligent weighting k-means clustering (IWKM) algorithm based on swarm intelligence. First, the degree of coupling between clusters is introduced into the clustering model to enlarge the dissimilarity between clusters, and separate weights for views and features are used in the weighted distance function that assigns objects to clusters. Second, to remove the sensitivity to initial cluster centers, swarm intelligence is used to find the initial cluster centers, the view weights, and the feature weights by global search. Lastly, a precise perturbation is proposed to improve the optimization performance of the swarm intelligence. To verify clustering performance on high-dimensional multiview data, experiments were evaluated with the Rand Index, Jaccard Coefficient and Folkes Russe metrics in five big data applications on two computational platforms, Apache Spark and a single node. The experimental results show that IWKM is effective and efficient at clustering high-dimensional multiview data, and that it outperforms five other approaches on these complicated data sets with more views and higher dimensions, on both Apache Spark and a single node.
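The abstract does not give the weighted distance function itself, so the following is only a minimal sketch of how view weights and feature weights could enter a k-means assignment step. It assumes a squared Euclidean distance in which each feature is scaled by its own weight times the weight of the view it belongs to. The names `weighted_distances`, `assign_clusters`, `view_of_feature`, `view_weights`, and `feature_weights` are hypothetical, and the cluster-coupling term, the swarm-intelligence search, and the perturbation step are not modeled here.

```python
import numpy as np

def weighted_distances(X, centers, view_of_feature, view_weights, feature_weights):
    """Weighted squared Euclidean distance from every object to every center.

    Assumed form (not taken from the paper):
        d(x, c) = sum_j view_weights[view(j)] * feature_weights[j] * (x_j - c_j)^2
    """
    # Combined weight per feature: weight of its view times its own weight.
    w = view_weights[view_of_feature] * feature_weights      # shape (d,)
    diff = X[:, None, :] - centers[None, :, :]               # shape (n, k, d)
    return (w * diff ** 2).sum(axis=2)                       # shape (n, k)

def assign_clusters(X, centers, view_of_feature, view_weights, feature_weights):
    """Assign each object to the center with the smallest weighted distance."""
    d = weighted_distances(X, centers, view_of_feature, view_weights, feature_weights)
    return d.argmin(axis=1)

# Toy data: features 0 and 1 form view 0, feature 2 forms a noisy view 1.
X = np.array([
    [0.0, 0.0,  5.0],
    [0.1, 0.2, -5.0],
    [4.0, 4.0,  5.0],
    [4.1, 3.9, -5.0],
])
centers = np.array([
    [0.0, 0.0, 0.0],
    [4.0, 4.0, 0.0],
])
view_of_feature = np.array([0, 0, 1])       # view index of each feature
view_weights = np.array([0.9, 0.1])         # the noisy view is down-weighted
feature_weights = np.array([0.5, 0.5, 1.0])

labels = assign_clusters(X, centers, view_of_feature, view_weights, feature_weights)
print(labels)
```

In this toy setup the down-weighted third feature contributes the same amount to both distances of a given object, so the assignment is driven by the two features of the first view. In the approach described above, the weights and initial centers would come from the global search, not be fixed by hand as they are here.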

Keywords: multiview data; dimensional multiview; high dimensional; data applications; big data

Journal Title: Neurocomputing
Year Published: 2020
