Energy conservation of large data centers for high performance computing workloads, such as deep learning with Big Data, is of critical significance, where cutting down a few percent of electricity… Click to show full abstract
Energy conservation of large data centers for high performance computing workloads, such as deep learning with Big Data, is of critical significance, where cutting down a few percent of electricity translates into million-dollar savings. This work studies energy conservation on emerging CPU-GPU hybrid clusters through dynamic voltage and frequency scaling (DVFS). We aim at minimizing the total energy consumption of processing a batch of offline tasks or a sequence of real-time tasks under deadline constraints. We derive a fast and accurate analytical model to compute the appropriate voltage/frequency setting for each task, and assign multiple tasks to the cluster with heuristic scheduling algorithms. In particular, our model stresses the nonlinear relationship between task execution time and processor speed for GPU-accelerated applications, for more accurately capturing real-world GPU energy consumption. In performance evaluation driven by real-world power measurement traces, our scheduling algorithm shows comparable energy savings to the theoretical upper bound. With a GPU scaling interval where analytically at most 36% of energy can be saved, we record 33-35% of energy savings. Our results are applicable to energy management on modern heterogeneous clusters.
               
Click one of the above tabs to view related content.