Energy is now critical in all aspects of computing. We address energy optimization for a class of programs that includes so-called “stencil computations.” Since optimizing for speed alone already minimizes energy for most components, we seek to further reduce energy consumption by lowering the total number of off-chip memory accesses without sacrificing execution time. Our strategy uses two-level tiling: we first partition the iteration space into “passes,” each of which is then tiled and parallelized. The schedules that map the original program to the multi-pass, parametrically tiled code are specified by polynomials. They are more general than the affine multidimensional schedules used by state-of-the-art polyhedral compilers, so generating such code automatically is an important open problem whose relevance goes beyond energy efficiency. We develop a parametric tiled code generator that supports our energy-efficient parallelization strategy, and we give a simple linear regression model for energy as a function of performance counters. Our experimental validation on three platforms shows a reduction of about 74 percent (resp. 75 and 67 percent) in dynamic memory energy consumption on an 8-core Xeon E5-2650 v2 (resp. a 6-core Xeon E5-2620 v2 and a 6-core Xeon E5-2602 v3). This reduces the total energy of the program by 2 to 14 percent.
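To make the two-level structure concrete, the following is a minimal sketch of a pass-then-tile traversal for a 1D three-point Jacobi stencil. It is not the authors' code generator and does not use polynomial schedules: it assumes a simple time-skewed coordinate `j = i + t` so that rectangular tiles are legal, runs sequentially, and keeps every time step in memory so the result can be checked. The function names and the parameters `pass_width`, `tile_t`, and `tile_j` are hypothetical and chosen only for illustration.

```python
def reference(a0, T):
    """Plain time-stepped 1D Jacobi with fixed boundary cells."""
    n = len(a0)
    grid = [list(a0)]
    for _ in range(T):
        prev = grid[-1]
        cur = list(prev)
        for i in range(1, n - 1):
            cur[i] = (prev[i - 1] + prev[i] + prev[i + 1]) / 3.0
        grid.append(cur)
    return grid


def two_level_tiled(a0, T, pass_width, tile_t, tile_j):
    """Same computation, traversed as passes (level 1) of tiles (level 2).

    Assumption: the iteration space (t, i) is skewed to (t, j = i + t), so
    every dependence points to a smaller t and a smaller-or-equal j.
    """
    n = len(a0)
    # None marks "not yet computed"; reading it would raise a TypeError,
    # so an illegal execution order cannot pass silently.
    grid = [list(a0)] + [[None] * n for _ in range(T)]
    for t in range(1, T + 1):  # boundary cells never change
        grid[t][0] = a0[0]
        grid[t][n - 1] = a0[n - 1]

    j_lo, j_hi = 2, (n - 2) + T  # inclusive range of j for interior points
    for p0 in range(j_lo, j_hi + 1, pass_width):  # level 1: passes
        p1 = min(p0 + pass_width, j_hi + 1)
        for t0 in range(1, T + 1, tile_t):  # level 2: tiles inside a pass
            for j0 in range(p0, p1, tile_j):
                for t in range(t0, min(t0 + tile_t, T + 1)):
                    for j in range(j0, min(j0 + tile_j, p1)):
                        i = j - t  # undo the skew
                        if 1 <= i <= n - 2:
                            prev = grid[t - 1]
                            grid[t][i] = (prev[i - 1] + prev[i] + prev[i + 1]) / 3.0
    return grid


if __name__ == "__main__":
    init = [float(x * x % 7) for x in range(12)]
    assert two_level_tiled(init, 5, 4, 2, 3) == reference(init, 5)
```

In this sketch each pass covers a band of the skewed space, which is one plausible way to bound the data touched before moving on. The strategy described above additionally parallelizes the tiles within a pass, which is omitted here.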