Future machine learning (ML) powered applications, such as autonomous driving and augmented reality, involve training and inference tasks that have timeliness requirements and are communication- and computation-intensive, which calls for an edge learning framework. These real-time requirements drive us to look beyond accuracy alone for ML. In this article, we introduce the concept of timely edge learning, which aims to achieve accurate training and inference while minimizing communication and computation delay. We discuss key challenges and propose corresponding solutions from the data, model, and resource management perspectives to meet the timeliness requirements. In particular, for edge training, we argue that the total training delay, rather than the number of training rounds, should be considered, and we propose data or model compression as well as joint device scheduling and resource management schemes for both centralized training and federated learning systems. For edge inference, we explore the dependency between accuracy and delay in both communication and computation, and propose dynamic data compression and flexible pruning schemes. Two case studies show that timeliness performance, including the training accuracy under a given delay budget and the completion ratio of inference tasks within their deadlines, is substantially improved by the proposed solutions.
               