A well-known approach for generating custom hardware with high throughput and low resource usage is modulo scheduling, in which the number of clock cycles between successive inputs [the initiation interval (II)] can be lower than the latency of the computation. The II is traditionally an integer, but in this article, we explore the benefits of allowing it to be a rational number. A rational II can be interpreted as the average number of clock cycles between successive inputs. Since the minimum rational II can be less than the minimum integer II, higher throughput is possible; moreover, allowing rational IIs gives more options in a design-space exploration. We formulate rational-II modulo scheduling as an integer linear programming (ILP) problem that is able to find latency-optimal schedules for a fixed rational II. We also propose two heuristic approaches that make rational-II scheduling more feasible: one based on identifying strongly connected components in the data-flow graph, and one based on iteratively relaxing the target II until a solution is found. We have applied our methods to a standard benchmark of hardware designs, and our results demonstrate an average speedup with respect to II of $1.24\times$ in 35% of the encountered scheduling problems compared to state-of-the-art formulations.
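To illustrate why a rational II can be lower than the minimum integer II, the following minimal sketch compares the two lower bounds on a toy problem. It assumes the textbook resource- and recurrence-constrained bounds from the modulo-scheduling literature (operations per available unit, and cycle latency per cycle distance); these formulas are not stated in the abstract, and the function name `min_ii_bounds` and the example numbers are hypothetical. It is not the ILP formulation or either heuristic described above.

```python
from fractions import Fraction
from math import ceil


def min_ii_bounds(op_counts, unit_counts, recurrences):
    """Return (rational_bound, integer_bound) on the initiation interval.

    op_counts:   {resource: number of operations using it}   (assumed input)
    unit_counts: {resource: number of available units}       (assumed input)
    recurrences: [(total_latency, total_distance), ...] for each
                 dependence cycle in the data-flow graph     (assumed input)
    """
    # Resource bound: each unit can start one operation per cycle.
    res = max(
        (Fraction(op_counts[r], unit_counts[r]) for r in op_counts),
        default=Fraction(0),
    )
    # Recurrence bound: a cycle's latency must fit in distance * II cycles.
    rec = max(
        (Fraction(lat, dist) for lat, dist in recurrences),
        default=Fraction(0),
    )
    rational = max(res, rec)
    # An integer II has to round the same bound up.
    return rational, ceil(rational)


# Toy problem: 3 multiplications on 2 multipliers, one recurrence with
# latency 4 spanning 3 iterations.
rational_ii, integer_ii = min_ii_bounds({"mul": 3}, {"mul": 2}, [(4, 3)])
speedup = Fraction(integer_ii) / rational_ii
print(f"rational II >= {rational_ii}, integer II >= {integer_ii}, "
      f"speedup = {speedup}")
```

In this toy case the rational bound is 3/2 cycles per input on average (for example, two inputs every three cycles), whereas an integer II must be at least 2. Whether a schedule actually achieves the bound still has to be determined by a scheduler.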