Batch mode active learning, where a batch of samples is simultaneously selected and labeled, is a challenging task. The challenge lies in how to maintain the informativeness and keep the… Click to show full abstract
Batch mode active learning, where a batch of samples is simultaneously selected and labeled, is a challenging task. The challenge lies in how to maintain the informativeness and keep the diversity of selected samples concurrently. We propose a novel batch mode active learning that balances the informativeness and representativeness using multi-set clustering. Our method utilizes a sequential active learner to retain the informativeness by providing a ranking of unlabeled samples and constructing multiple informative sets for the subsequent clusterings. K-means clustering is used to minimize the redundancy among these samples and to improve the representativeness. Finally, the optimal batch chosen is the one minimizing the expected predictive variance on all the data. Our experimental results on a large number of benchmark datasets demonstrate excellent performance of the proposed method in comparison with current state-of-the-art batch mode active learning approaches.
               
Click one of the above tabs to view related content.