Published November 2, 2024 | Version v2
Software Open

Generalizing Ray Tracing Accelerators for Tree Traversals on GPUs

  • 1. Yonsei University
  • 2. University of British Columbia
  • 3. University of California, Riverside

Description

This artifact contains a Docker image that holds the simulator and benchmark workloads for "Generalizing Ray Tracing Accelerators for Tree Traversals on GPUs," published at MICRO 2024. The Docker image includes the source code for our modified version of Vulkan-Sim, binaries for the benchmarks used in the paper, and all necessary dependencies. We also include Python scripts for running the simulations behind our main results and for generating the figures in our paper.

More about the paper:

Tree traversal is a fundamental operation in many applications, such as database indexing and physics simulations. Although tree traversals feature high parallelism, they are inherently divergent and irregular, leading to inefficient performance on GPUs. Tree traversals are also prevalent in ray tracing, which is executed on dedicated Ray-Tracing Accelerators (RTAs) in modern GPUs to mitigate inefficiencies such as control flow divergence and underutilization of memory bandwidth by irregular memory accesses. In this paper, we propose the Tree Traversal Accelerator (TTA) to replicate the success of RTAs in ray tracing for general tree traversal applications. TTAs can handle tree structures and operations beyond those in ray tracing by modifying existing computing units in RTAs to support algorithms such as B-Tree search and radius search. However, TTAs still rely on fixed-function computations, making it challenging to fully support other tree-based applications such as N-Body simulation. Thus, we introduce TTA+ as an alternative design, which modularizes the RTA computing units and makes them programmable, trading some efficiency for flexibility. With less than 1% increase in RTA area, our proposals can achieve up to 5.4× speedup for B-Tree search, 1.7× for N-Body simulation, and 1.2× for select ray-tracing applications.

 

V2 Update:

Version 2 of this artifact adds more configurations of the application binaries, as well as additional Python scripts for recreating Figures 12-20 in the paper.

Files

Files (13.8 GB)

Checksum: md5:a336e30a1cbaaf9c4c93aa3502cccb5f
Size: 13.8 GB
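
Because the archive is large, it may be worth checking the download against the MD5 checksum listed above before loading the Docker image. The following is a minimal sketch, not part of the artifact: the helper names `md5_of_file` and `matches_record` are our own, and the file name in the usage note is hypothetical, since this record does not show the archive's name.

```python
import hashlib

# Checksum as listed in the Files section of this record.
EXPECTED_MD5 = "a336e30a1cbaaf9c4c93aa3502cccb5f"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in fixed-size chunks so that a multi-gigabyte
    archive never has to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_record("artifact.tar")` (hypothetical file name) returns `True` only if the downloaded file matches the listed checksum. Note that MD5 here guards against a corrupted or truncated download; it is not a defense against deliberate tampering.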