Outlier removal is vital in machine learning. As massive unlabeled data are generated rapidly today, eliminating outliers from noisy data in a fast and unsupervised manner is gaining increasing attention in practical applications. This paper tackles this challenging problem by proposing a novel Recurrent Adaptive Reconstruction Extreme Learning Machine (RAR-ELM). Specifically, given a noisy data collection, RAR-ELM recurrently learns to reconstruct the data and automatically excludes samples with high reconstruction errors as outliers through a novel adaptive labeling mechanism. Compared with existing methods, the proposed RAR-ELM has three major merits. First, RAR-ELM inherits the fast and sound learning property of the original extreme learning machine (ELM): it runs tens to hundreds of times faster than existing methods while achieving superior or comparable outlier removal performance, which makes it particularly suitable for application scenarios such as real-time outlier removal. Second, instead of requiring a decision threshold to be specified in advance, RAR-ELM adaptively finds a reasonable decision threshold when processing data with different proportions of outliers, which is vital for unsupervised outlier removal, where no prior knowledge of the outliers in the data is available. Third, we also propose Online Sequential RAR-ELM (OS-RAR-ELM), which can be run in an online or sequential mode and makes RAR-ELM readily applicable to massive noisy data or online sequential data. Extensive experiments on various datasets show that the proposed RAR-ELM achieves faster and better unsupervised outlier removal than existing methods.
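The recurrent reconstruct-and-exclude loop described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract does not specify the network details or the adaptive labeling rule, so the sketch assumes a basic ELM autoencoder (random, fixed hidden weights with ridge-regularized output weights) and substitutes a median-plus-MAD cutoff as a stand-in for the adaptive threshold. All function names and parameters (`elm_fit`, `recon_error`, `remove_outliers`, `n_hidden`, `k`, `reg`) are hypothetical.

```python
import numpy as np

def elm_fit(X, n_hidden, rng, reg=1e-3):
    """Fit an ELM autoencoder: random hidden layer, closed-form output weights."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                           # hidden-layer activations
    # Ridge-regularized least squares for the output weights (target is X itself).
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ X)
    return W, b, beta

def recon_error(X, model):
    """Squared reconstruction error of each sample."""
    W, b, beta = model
    H = np.tanh(X @ W + b)
    return np.sum((H @ beta - X) ** 2, axis=1)

def remove_outliers(X, n_hidden=8, max_iter=10, k=3.0, seed=0):
    """Recurrently refit on current inliers and relabel all samples.

    ASSUMPTION: the cutoff median + k * scaled MAD is a stand-in for the
    paper's adaptive labeling mechanism, which the abstract does not define.
    """
    rng = np.random.default_rng(seed)
    inlier = np.ones(len(X), dtype=bool)             # start by trusting every sample
    for _ in range(max_iter):
        model = elm_fit(X[inlier], n_hidden, rng)    # learn to reconstruct current inliers
        err = recon_error(X, model)                  # score every sample, including excluded ones
        med = np.median(err)
        mad = np.median(np.abs(err - med))
        new_inlier = err <= med + k * 1.4826 * mad   # data-driven threshold, no preset value
        if np.array_equal(new_inlier, inlier):       # labels stopped changing
            break
        inlier = new_inlier
    return inlier

# Toy data: 200 samples near a 2-D subspace of a 10-D space, plus 10 scattered points.
rng = np.random.default_rng(1)
clean = rng.standard_normal((200, 2)) @ rng.standard_normal((2, 10))
scattered = 10.0 * rng.standard_normal((10, 10))
X = np.vstack([clean, scattered])
mask = remove_outliers(X)
print(mask.sum(), "samples kept of", len(X))
```

Because the hidden weights stay random and only the output weights are solved in closed form, each pass costs a single linear solve, which is the property the abstract credits for the method's speed. The threshold is recomputed from the error distribution on every pass, so no outlier proportion has to be supplied up front.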
               