
An extensive empirical comparison of k-means initialisation algorithms

The k-means clustering algorithm, whilst widely popular, is not without its drawbacks. In this paper, we focus on the sensitivity of k-means to its initial set of centroids. Since the cluster recovery performance of k-means can be improved by better initialisation, numerous algorithms have been proposed aiming at producing good initial centroids. However, it is still unclear which algorithm should be used in any particular clustering scenario. With this in mind, we compare 17 such algorithms on 6,000 synthetic and 28 real-world data sets. The synthetic data sets were produced under different configurations, allowing us to show which algorithm excels in each scenario. Hence, the results of our experiments can be particularly useful for those considering k-means for a non-trivial clustering scenario.
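The sensitivity to initial centroids described above can be illustrated with a minimal sketch. It runs standard k-means (Lloyd's iterations) from two initialisations, uniform random selection of data points and k-means++ seeding, and reports the spread of the final within-cluster sum of squares over repeated runs. The abstract does not name the 17 algorithms compared, so these two are assumptions chosen only because they are widely known. The synthetic data, the helper names (`make_blobs`, `compare`), and all parameter values are likewise hypothetical and unrelated to the paper's data sets.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inertia(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def random_init(points, k, rng):
    """Pick k distinct data points uniformly at random."""
    return rng.sample(points, k)


def kmeanspp_init(points, k, rng):
    """k-means++ seeding: each new centroid is a data point sampled with
    probability proportional to its squared distance from the nearest
    centroid chosen so far."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        weights = [min(sq_dist(p, c) for c in centroids) for p in points]
        centroids.append(rng.choices(points, weights=weights, k=1)[0])
    return centroids


def lloyd(points, centroids, max_iter=100):
    """Standard k-means iterations starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        new = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        if new == centroids:
            break
        centroids = new
    return centroids


def make_blobs(rng, centres, n_per=50, spread=0.5):
    """Hypothetical 2-D test data: Gaussian blobs around the given centres."""
    return [
        (rng.gauss(cx, spread), rng.gauss(cy, spread))
        for cx, cy in centres
        for _ in range(n_per)
    ]


def compare(n_runs=20, seed=0):
    """Final inertia of k-means over repeated runs, per initialisation."""
    rng = random.Random(seed)
    points = make_blobs(rng, [(0, 0), (10, 0), (0, 10), (10, 10)])
    results = {}
    for name, init in [("random", random_init), ("k-means++", kmeanspp_init)]:
        results[name] = [
            inertia(points, lloyd(points, init(points, 4, rng)))
            for _ in range(n_runs)
        ]
    return results


for name, vals in compare().items():
    print(f"{name:10s} best={min(vals):.1f} worst={max(vals):.1f}")
```

A wide gap between the best and worst values for one initialisation indicates that some runs converged to a poor local optimum. On a toy data set like this one the outcome says nothing about which algorithm is preferable in general, which is the question the paper addresses empirically.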

Keywords: comparison means; initialisation algorithms; empirical comparison; initialisation; extensive empirical; means initialisation

Journal Title: IEEE Access
Year Published: 2022
