Automatic parallelization of sequential programs, combined with autotuning, is an alternative to manual parallelization. As research directions widen and the number of available performance tuning tools grows, choosing a particular tuning tool becomes harder. This paper reviews the fundamentals of different performance optimization and tuning techniques. It also surveys many tuning frameworks and classifies them into groups according to a set of criteria. Developing benchmarks for HPC and validating their accuracy have been demanding tasks for computer architects, researchers, and application developers. In addition to surveying performance tuning tools, we review current benchmarks in detail and discuss the requirements for future benchmarks. We also compare the tuning tools in detail on other features, such as speedup and infrastructure details. We believe that this paper will be a useful resource for the parallel computing community, especially for early-stage parallel computing and performance researchers seeking exposure to the existing performance optimization options.
               