
Agglomerative Memory and Thread Scheduling for High-Performance Ray-Tracing on GPUs

Ray-tracing rendering has long been considered a promising technology for enabling a higher level of visual experience. Bringing ray-tracing rendering to consumer platforms, however, poses significant challenges to rendering hardware and software because of its highly irregular computing patterns. Modern ray-tracing techniques typically depend on a tree-based acceleration structure to reduce the computational complexity of intersection tests between rays and graphics primitives. Traversal of this structure by a massive number of rays on a graphics processing unit (GPU) incurs a significant amount of irregular memory traffic, which is a major obstacle to real-time performance. In this work, a scheduling mechanism called Agglomerative Memory and Thread Scheduling is proposed to unleash the inherent parallelism of the ray-tracing process on GPUs. It is associated with a tile-based ray-tracing framework in which the acceleration structure (a KD-tree in this work) is partitioned into subtrees, each of which can be completely loaded into the on-chip L1 cache inside a streaming multiprocessor. The scheduling mechanism collects threads according to the subtrees hit by their respective rays and regroups them into warps for dispatch. In addition, subtrees are dynamically preloaded into the L1 cache of the multiprocessors in an on-demand fashion. The proposed scheduler can be integrated into today’s high-end GPUs with only minor overhead. Microarchitecture simulation results show that the proposed framework significantly improves memory efficiency and outperforms a traditional GPU microarchitecture by 47.4% on average.
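
The regrouping step described in the abstract can be illustrated with a small software sketch. This is not the paper's hardware scheduler: it only shows the idea of bucketing threads by the subtree their ray hits and cutting each bucket into warps. The function names, the integer subtree IDs, and the launch-order baseline are assumptions made for illustration.

```python
from collections import defaultdict

WARP_SIZE = 32  # threads per warp on current NVIDIA GPUs


def regroup_into_warps(ray_subtree_ids, warp_size=WARP_SIZE):
    """Group thread indices by the subtree their ray hits, then cut each
    group into warps of at most `warp_size` threads.

    ray_subtree_ids[i] is the (hypothetical) ID of the subtree that the
    ray owned by thread i needs to traverse next.
    Returns a list of (subtree_id, [thread indices]) tuples.
    """
    buckets = defaultdict(list)
    for thread_id, subtree_id in enumerate(ray_subtree_ids):
        buckets[subtree_id].append(thread_id)

    warps = []
    for subtree_id in sorted(buckets):
        threads = buckets[subtree_id]
        for start in range(0, len(threads), warp_size):
            warps.append((subtree_id, threads[start:start + warp_size]))
    return warps


def subtrees_touched_per_warp(ray_subtree_ids, warp_size=WARP_SIZE):
    """Baseline for comparison: number of distinct subtrees touched by each
    warp when threads are packed in launch order, with no regrouping."""
    return [
        len(set(ray_subtree_ids[start:start + warp_size]))
        for start in range(0, len(ray_subtree_ids), warp_size)
    ]


if __name__ == "__main__":
    # Toy example with a warp size of 4 so the result is easy to read.
    ids = [0, 1, 0, 1, 2, 2, 0, 1]
    print(subtrees_touched_per_warp(ids, warp_size=4))
    print(regroup_into_warps(ids, warp_size=4))
```

In the toy example, the launch-order warps touch two and three different subtrees respectively, while every regrouped warp touches exactly one. A warp whose threads all traverse the same subtree needs only that subtree resident in the L1 cache, which is the property the proposed scheduler exploits. The regrouped warps may be only partially filled, a trade-off this sketch does not try to address.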

Keywords: ray tracing; memory thread; agglomerative memory; ray; thread scheduling

Journal Title: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Year Published: 2022
