Emerging intelligent applications built on accurate and timely stream analytics require real-time CNN inference over massive data continuously generated at pervasive end devices. Because of resource constraints, neither computing locally on end devices nor transmitting raw data to remote servers can support computation-intensive CNN inference on large-volume images in real time. Collaborative Inference (CI), which runs inference sequentially from the local device to the remote server and transmits compressed intermediate inference data between them, has therefore been gaining traction rapidly. Because collaboration depends on communication, CI efficiency is sensitive to network conditions: under the unpredictable network fluctuations encountered in practice, CI can suffer severe delays that degrade the responsiveness of stream analytics. To keep stream analytics accurate and timely in fluctuating networks, we present RTCoInfer, a real-time CI framework that adapts transmission at run time according to network conditions. Specifically, we propose a novel Switchable CNN that integrates CNNs with different compression rates at the partition layer, enabling run-time transmission adjustment, and we construct a real-time controller that selects the compression rate needed to maintain real-time CI for stream analytics. Extensive experiments show that, compared with state-of-the-art methods, RTCoInfer achieves better efficiency and unprecedented resilience in real-time stream analytics.
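
The idea of selecting a compression rate at run time can be illustrated with a minimal sketch. This is not the RTCoInfer controller, whose design the abstract does not detail; it is a simple deadline-based rule under stated assumptions. The names `Branch`, `pick_branch`, `payload_bytes`, and all parameters are hypothetical, and the bandwidth estimate and on-device compute time are assumed to be measured elsewhere.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Branch:
    """One hypothetical switchable branch at the partition layer."""
    name: str
    payload_bytes: int  # size of the compressed intermediate data it sends


def pick_branch(
    branches: Sequence[Branch],
    bandwidth_bytes_per_s: float,
    compute_s: float,
    deadline_s: float,
) -> Branch:
    """Pick the least-compressed branch whose estimated latency meets the deadline.

    Assumption: latency = compute time + payload / bandwidth. If no branch
    fits (or the bandwidth estimate is unusable), fall back to the branch
    with the smallest payload.
    """
    smallest = min(branches, key=lambda b: b.payload_bytes)
    if bandwidth_bytes_per_s <= 0:
        return smallest

    budget_s = deadline_s - compute_s  # time left for transmission
    feasible = [
        b for b in branches
        if b.payload_bytes / bandwidth_bytes_per_s <= budget_s
    ]
    if not feasible:
        return smallest
    # A larger payload means less compression, so prefer it when it fits.
    return max(feasible, key=lambda b: b.payload_bytes)


# Illustrative values only; none of these numbers come from the paper.
BRANCHES = [
    Branch("low-compression", 100_000),
    Branch("mid-compression", 50_000),
    Branch("high-compression", 25_000),
]
```

With these illustrative branches, a 0.1 s deadline and 0.04 s of compute, the rule switches to a more heavily compressed branch as the estimated bandwidth drops, which is the qualitative behavior the abstract describes.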
               