Clustering is an unsupervised learning technique used in data mining to find groups whose objects are similar to one another within a group but dissimilar across groups. However, the absence of a priori knowledge about the optimal clustering criterion, and the strong bias of traditional algorithms towards clusters with a specific shape, size, or density, raise the need for more flexible solutions that can find the underlying structures of the data. As a solution, clustering has been modeled as an optimization problem, using meta-heuristics to generate a search space that can favor groups satisfying any desired criterion. F1-ECAC is an evolutionary clustering algorithm whose objective function is designed as a supervised learning problem: it evaluates the quality of a partition in terms of its generalization degree, that is, its capability to train an ensemble of classifiers. The algorithm is named after its previous version, ECAC (Evolutionary Clustering Algorithm Using Supervised Classifiers), and after its main point of difference, which is the use of the F1-score instead of the Area Under the Curve metric in the objective function. F1-ECAC shows a significant increase in performance and efficiency over ECAC and is highly competitive with state-of-the-art clustering algorithms. The results demonstrate F1-ECAC's usability across a wide variety of problems, owing to its innovative clustering criterion.
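The idea of scoring a partition by how well it trains a classifier can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single nearest-centroid classifier as a stand-in for the ensemble, a simple hold-out split, and a macro-averaged F1-score, and every function name here (`macro_f1`, `fit_centroids`, `predict`, `partition_fitness`) is hypothetical.

```python
import random
from collections import defaultdict


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def fit_centroids(X, y):
    """Train the stand-in classifier: one mean vector per cluster label."""
    groups = defaultdict(list)
    for x, label in zip(X, y):
        groups[label].append(x)
    return {
        label: [sum(col) / len(pts) for col in zip(*pts)]
        for label, pts in groups.items()
    }


def predict(centroids, X):
    """Assign each point to the label of its nearest centroid."""
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return [min(centroids, key=lambda c: sqdist(x, centroids[c])) for x in X]


def partition_fitness(X, labels, seed=0, test_fraction=0.3):
    """Score a candidate partition: treat its cluster labels as classes,
    train on one part of the data, and report macro F1 on the held-out part.
    (Assumption: hold-out split and a single classifier, for illustration.)"""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    cut = int(len(idx) * (1 - test_fraction))
    train, test = idx[:cut], idx[cut:]
    centroids = fit_centroids([X[i] for i in train], [labels[i] for i in train])
    y_pred = predict(centroids, [X[i] for i in test])
    return macro_f1([labels[i] for i in test], y_pred)


# Toy data: two well-separated Gaussian blobs of 50 points each.
rng = random.Random(42)
X = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(50)]
X += [[rng.gauss(5, 0.5), rng.gauss(5, 0.5)] for _ in range(50)]

good_partition = [0] * 50 + [1] * 50      # follows the blob structure
bad_partition = [i % 2 for i in range(100)]  # ignores the blob structure

good_score = partition_fitness(X, good_partition)
bad_score = partition_fitness(X, bad_partition)
print(good_score, bad_score)
```

A partition that follows the structure of the data yields labels a classifier can learn and generalize, so it scores higher than one that cuts across that structure. In an evolutionary setting, a score of this kind would serve as the fitness that guides the search over candidate partitions.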
               