Most existing approaches rely on a fixed network structure and a large amount of labeled data to train complex deep models, which makes them difficult to apply in incremental scenarios. In practice, real-world data often arrives in stream form, which raises two challenges for building incremental deep models: a) Capacity scalability. The entire training set is not available before learning begins, so it is challenging to make the deep model structure scale with the streaming data, allowing flexible model evolution and faster convergence. b) Capacity sustainability. The distribution of streaming data usually changes over time (concept drift), so the model must be updated while preserving previous knowledge to overcome catastrophic forgetting. To this end, we develop an incremental deep model (IDM), which expands the network structure according to the streaming data and slows down forgetting with an adaptive Fisher regularization.

However, IDM ignores another significant challenge with streaming data: c) Capacity demand. Training a deep model always requires a large amount of labeled data, whereas it is almost impossible to label all incoming instances in real time. The core problem is to select a small number of the most discriminative instances to label while maintaining the predictive accuracy of the model. We therefore focus on the online semi-supervised learning scenario with abrupt changes in data distribution, and extend IDM to a cost-effective incremental deep model (CE-IDM), which adaptively selects the most discriminative newly arriving instances to query, reducing manual labeling costs. Specifically, CE-IDM adopts a novel extensible deep network structure with an extra attention model over the hidden layers. Based on the adaptive attention weights, CE-IDM derives a novel instance selection criterion that jointly estimates the representativeness and informativeness of unlabeled instances to satisfy the capacity demand. With the newly labeled instances, CE-IDM quickly updates the model at an adaptive depth from the streaming data, enabling capacity scalability. We also address capacity sustainability by exploiting an attention-based Fisher information matrix, which slows down forgetting. In this way, CE-IDM handles the three capacity challenges mentioned above in a unified framework.

We conduct extensive experiments on real-world data and show that CE-IDM outperforms state-of-the-art methods by a substantial margin.
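To make the structural idea concrete, below is a minimal PyTorch sketch of an extensible network with one attention weight per hidden layer, where a softmax over per-depth logits mixes per-layer predictions and new layers can be appended as the stream evolves. The class name, the softmax mixing rule, and the `add_layer` hook are illustrative assumptions; the abstract does not specify CE-IDM's exact architecture.

```python
import torch
import torch.nn as nn

class AttentiveExpandableNet(nn.Module):
    """Toy extensible network: each hidden layer has its own output
    head, and a softmax over per-layer attention logits mixes the
    per-depth predictions (adaptive depth). Illustrative only."""
    def __init__(self, d_in, d_hid, n_classes):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(d_in, d_hid)])
        self.heads = nn.ModuleList([nn.Linear(d_hid, n_classes)])
        self.attn = nn.Parameter(torch.zeros(1))  # one logit per depth

    def add_layer(self, d_hid, n_classes):
        # Expand capacity when the stream suggests the model underfits.
        self.layers.append(nn.Linear(d_hid, d_hid))
        self.heads.append(nn.Linear(d_hid, n_classes))
        self.attn = nn.Parameter(torch.cat([self.attn.data, torch.zeros(1)]))

    def forward(self, x):
        w = torch.softmax(self.attn, dim=0)  # adaptive attention weights
        out = 0.0
        for layer, head, wi in zip(self.layers, self.heads, w):
            x = torch.relu(layer(x))
            out = out + wi * head(x)
        return out
```

What triggers `add_layer` (e.g., a drift detector or a loss plateau) is not stated in the abstract, so the expansion rule is left external to this sketch.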
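The joint instance selection criterion could be sketched as follows, with predictive entropy standing in for informativeness and average cosine similarity over the (attention-weighted) representations standing in for representativeness. Both proxies and the mixing weight `alpha` are assumptions; the abstract only states that the two degrees are estimated jointly from the adaptive attention weights.

```python
import numpy as np

def select_queries(probs, feats, budget, alpha=0.5):
    """Pick `budget` unlabeled instances to send for labeling.
    probs: (n, c) predicted class probabilities for the new batch.
    feats: (n, d) hidden representations (e.g., attention-weighted).
    Entropy and cosine similarity are illustrative proxies."""
    eps = 1e-12
    info = -(probs * np.log(probs + eps)).sum(axis=1)        # informativeness
    f = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + eps)
    rep = (f @ f.T).mean(axis=1)                             # representativeness
    score = alpha * info + (1.0 - alpha) * rep
    return np.argsort(-score)[:budget]                       # indices to query
```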
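The attention-based Fisher regularization is reminiscent of elastic weight consolidation; a hedged sketch of such a penalty, with a hypothetical `attn_weights` dictionary rescaling the per-parameter Fisher terms, might look like this:

```python
import torch

def attn_fisher_penalty(model, fisher, old_params, attn_weights, lam=1.0):
    """EWC-style quadratic penalty in which per-layer attention
    weights rescale the Fisher information, so layers deemed more
    important for past data are protected more strongly.
    fisher / old_params: dicts keyed by parameter name, captured
    after the previous update; attn_weights maps parameter names
    to scalars in [0, 1]. All names here are assumptions."""
    penalty = torch.zeros(())
    for name, p in model.named_parameters():
        if name in fisher:
            w = attn_weights.get(name, 1.0)
            penalty = penalty + w * (fisher[name] * (p - old_params[name]).pow(2)).sum()
    return 0.5 * lam * penalty
```

Adding this penalty to the streaming loss anchors parameters that were important for earlier data, which is one plausible reading of how the attention-based Fisher information matrix slows down forgetting.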