In many large-scale machine learning applications, data are accumulated over time, and thus an appropriate model should be able to update in an online fashion. In particular, it would be ideal to keep storage independent of the data volume and to scan each data item only once. Meanwhile, the data distribution usually changes during the accumulation process, making distribution-free one-pass learning a challenging task. In this paper, we propose a simple yet effective approach for this task that requires no prior knowledge about the change and allows every data item to be discarded once it has been scanned. We also present a variant for high-dimensional settings, which exploits compressed sensing to reduce computational and storage costs. Theoretical analysis shows that our approach converges under mild assumptions, and its performance is validated on both synthetic and real-world datasets.
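To make the one-pass regime concrete, here is a minimal Python sketch of a learner whose storage is independent of how many items have been seen and that touches each item exactly once. It uses recursive least squares with a forgetting factor as an illustrative stand-in for the update rule; the class name, parameters, and the forgetting-factor mechanism are assumptions for this sketch, not the authors' method.

```python
import numpy as np

class OnePassLearner:
    """Illustrative one-pass linear learner (NOT the paper's algorithm).

    Storage is O(d^2), independent of the number of items seen; each
    item is used for a single constant-time update and can be discarded
    immediately afterwards. The forgetting factor down-weights stale
    observations so the model can track a changing distribution.
    """

    def __init__(self, dim, forgetting=0.99, reg=1.0):
        self.P = np.eye(dim) / reg   # inverse correlation matrix, O(d^2)
        self.w = np.zeros(dim)       # current model weights
        self.gamma = forgetting      # < 1 discounts old observations

    def update(self, x, y):
        """Recursive least-squares step for one (x, y); x is not stored."""
        Px = self.P @ x
        k = Px / (self.gamma + x @ Px)          # gain vector
        self.w = self.w + k * (y - x @ self.w)  # correct by prediction error
        self.P = (self.P - np.outer(k, Px)) / self.gamma

    def predict(self, x):
        return x @ self.w
```

Because only `P` and `w` are retained, memory stays fixed no matter how long the stream runs, which is the storage property the abstract asks for.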
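The high-dimensional variant can similarly be pictured as projecting each item through a fixed compressed-sensing measurement matrix before running the same one-pass update in the lower-dimensional space. The sketch below (reusing `OnePassLearner` from above, with a toy synthetic stream) is an assumption-laden illustration of that idea, not the paper's construction; the Gaussian matrix is one standard choice that satisfies the restricted isometry property with high probability.

```python
# A fixed Gaussian random projection compresses each item from d to
# m << d dimensions, so per-item cost and storage depend on m rather
# than d. The stream below is a toy stand-in for real data.
rng = np.random.default_rng(0)
d, m = 5000, 100
Phi = rng.normal(scale=1.0 / np.sqrt(m), size=(m, d))  # measurement matrix

learner = OnePassLearner(dim=m, forgetting=0.99)
true_w = rng.normal(size=d)
for _ in range(1000):                 # one pass: each item seen once
    x = rng.normal(size=d)
    y = true_w @ x + 0.1 * rng.normal()
    z = Phi @ x                       # compress: O(m * d) per item
    learner.update(z, y)              # update in compressed space: O(m^2)
    # x, y, z can all be discarded here; nothing else is retained
```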