RPU: A Programmable Ray Processing Unit for Realtime
Ray Tracing
Sven Woop, Jörg Schmittler, and Philipp
Slusallek
accepted for: SIGGRAPH 2005
PDF (4,8 MB)
BIB-TEX (full version only in conference proceedings)
The RPU is a fully programmable ray tracing hardware architecture
with support for programmable materials, geometry, and lighting. The RPU
combines the efficiency of GPUs with the advantages of ray
tracing. The instruction set of the RPU is GPU-like, which is optimal
for shading purposes. In addition, the RPU supports fast ray traversal
through a k-D tree using a dedicated hardware unit, as well as recursive
function calls, which are useful for recursive ray tracing. To increase
efficiency, rays are always handled in packets of 4, and multi-threading
allows for high utilization of the hardware units.
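To illustrate what traversing a k-D tree with a packet of 4 rays involves, here is a minimal software sketch in Python. It is not the RPU's hardware algorithm: the names Node, traverse, trace_packet and hit_x_plane are hypothetical, the tree is built by hand, and the sketch visits both children without front-to-back ordering or early termination. Each ray carries its own parameter interval, and a subtree is skipped only when no ray of the packet is active in it.

```python
import math
from dataclasses import dataclass
from typing import Optional

PACKET_SIZE = 4  # the text above states that rays are handled in packets of 4

@dataclass
class Node:
    axis: int = -1                   # split axis of an inner node; -1 marks a leaf
    split: float = 0.0               # position of the split plane on that axis
    left: Optional["Node"] = None    # child covering coordinates below the split
    right: Optional["Node"] = None   # child covering coordinates above the split
    prims: tuple = ()                # leaf payload, opaque to the traversal

def clip(lo, hi):
    """Return the interval (lo, hi), or None if it is empty."""
    return (lo, hi) if lo <= hi else None

def traverse(node, origins, dirs, intervals, hit, best):
    """intervals[i] is (t_min, t_max), or None if ray i is inactive here."""
    if all(iv is None for iv in intervals):
        return                       # no ray of the packet needs this subtree
    if node.axis < 0:                # leaf: test every active ray
        for i, iv in enumerate(intervals):
            if iv is None:
                continue
            for prim in node.prims:
                t = hit(origins[i], dirs[i], prim)
                if iv[0] <= t <= iv[1]:
                    best[i] = min(best[i], t)
        return
    left_iv, right_iv = [], []
    for o, d, iv in zip(origins, dirs, intervals):
        l = r = None
        if iv is not None:
            oa, da = o[node.axis], d[node.axis]
            if da == 0.0:            # parallel to the plane: stays on one side
                l = iv if oa <= node.split else None
                r = iv if oa >= node.split else None
            else:
                t = (node.split - oa) / da   # distance to the split plane
                near = clip(iv[0], min(iv[1], t))
                far = clip(max(iv[0], t), iv[1])
                l, r = (near, far) if da > 0.0 else (far, near)
        left_iv.append(l)
        right_iv.append(r)
    traverse(node.left, origins, dirs, left_iv, hit, best)
    traverse(node.right, origins, dirs, right_iv, hit, best)

def trace_packet(root, origins, dirs, hit):
    """Return the nearest hit distance for each of the 4 rays (inf on a miss)."""
    assert len(origins) == len(dirs) == PACKET_SIZE
    best = [math.inf] * PACKET_SIZE
    start = [(0.0, math.inf)] * PACKET_SIZE
    traverse(root, origins, dirs, start, hit, best)
    return best

def hit_x_plane(o, d, x):
    """Toy primitive for the example: the infinite plane with the given x."""
    if d[0] == 0.0:
        return math.inf
    t = (x - o[0]) / d[0]
    return t if t > 0.0 else math.inf

# Hand-built tree: split at x = 0, one plane primitive on each side.
root = Node(axis=0, split=0.0,
            left=Node(prims=(-2.0,)), right=Node(prims=(2.0,)))
origins = [(-5.0, 0.0, 0.0), (5.0, 0.0, 0.0), (-5.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
dirs = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
distances = trace_packet(root, origins, dirs, hit_x_plane)
```

The hit function is passed in by the caller, which loosely mirrors the idea of programmable geometry: the traversal itself never needs to know what kind of primitive a leaf holds.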
A working prototype of this hardware architecture has been developed
based on FPGA technology. The ray tracing performance of the FPGA
prototype running at 66 MHz is comparable to the OpenRT ray tracing performance of a Pentium 4
clocked at 2.6 GHz, even though the memory bandwidth available to our RPU
prototype is only about 350 MB/s. These numbers show the efficiency of
the design and give an idea of the performance that could be reached
with today's high-end ASIC technology: high-end graphics cards from
NVIDIA provide 23 times more programmable floating point performance
and 100 times more memory bandwidth than our prototype. The prototype
can be parallelized across several FPGAs, each holding a copy of the
scene. A setup with two FPGAs delivering twice the performance of a
single FPGA is running in our lab. Scalability to up to 4 FPGAs has
been tested.
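The figures quoted above can be put side by side with a few lines of arithmetic. The following Python sketch only restates numbers from the text; the variable names are made up for illustration, 1 GB is assumed to be 1000 MB, and since the text says the performance is "comparable" rather than equal, the clock ratio is a rough indication only.

```python
# Figures taken from the text above.
RPU_CLOCK_MHZ = 66.0           # FPGA prototype clock
PENTIUM4_CLOCK_MHZ = 2600.0    # Pentium 4 running OpenRT
RPU_BANDWIDTH_MB_S = 350.0     # approximate memory bandwidth of the prototype
GPU_BANDWIDTH_FACTOR = 100.0   # "100 times more memory bandwidth"

# Clock advantage of the CPU at roughly comparable ray tracing performance.
clock_ratio = PENTIUM4_CLOCK_MHZ / RPU_CLOCK_MHZ

# Memory bandwidth implied for the high-end graphics cards (assumes 1 GB = 1000 MB).
implied_gpu_bandwidth_gb_s = GPU_BANDWIDTH_FACTOR * RPU_BANDWIDTH_MB_S / 1000.0

print(f"CPU clock is about {clock_ratio:.1f}x the prototype clock")
print(f"Implied GPU memory bandwidth: about {implied_gpu_bandwidth_gb_s:.0f} GB/s")
```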
Screenshots are presented at 1024x768 resolution with oversampling turned
on for most scenes. Please note that all lights, shadows, and
reflections are calculated and no lightmaps or environment maps have
been used. Detailed measurements of all scenes can be found in the
paper.
The following video is computed in realtime on two FPGA cards.
Video : 36 MB, MPEG-4, 512 x 384
Spheres: 15,000 triangles, 6 objects
Some spheres bouncing around. The spheres are intersected
analytically by a special geometry shader. The caustic in the center is
not physically correct, but is approximated by a kind of shadow
shader.
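For readers unfamiliar with analytic intersection, the usual approach is to solve a quadratic equation for the ray parameter instead of testing triangles. The RPU geometry shader itself is not shown on this page, so the following Python function is only a textbook sketch; the name intersect_sphere and its parameters are assumptions, and the direction vector is assumed to be non-zero.

```python
import math

def intersect_sphere(origin, direction, center, radius):
    """Return the nearest hit distance t > 0 along origin + t*direction, or None."""
    # Vector from the sphere center to the ray origin.
    oc = [o - c for o, c in zip(origin, center)]
    # Coefficients of a*t^2 + b*t + c = 0, from |origin + t*direction - center|^2 = r^2.
    a = sum(d * d for d in direction)
    b = 2.0 * sum(x * d for x, d in zip(oc, direction))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None                  # the ray misses the sphere
    root = math.sqrt(disc)
    # Try the closer solution first; fall back to the farther one
    # when the ray starts inside the sphere.
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t > 0.0:
            return t
    return None                      # the sphere lies behind the ray origin

t_hit = intersect_sphere((0.0, 0.0, -5.0), (0.0, 0.0, 1.0), (0.0, 0.0, 0.0), 1.0)
```

Compared with a tessellated sphere, this needs a single primitive test per ray and yields an exact surface point.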
Porsche: 82,836 triangles, 1 object
A Porsche 996 model with realistic car paint and glass shaders. The environment is reflected correctly in the car.
Quake3-p: 52,790 triangles, 17 objects
This scene shows some animated monsters running around and casting correct shadows.
Gael: 52,479 triangles, 1 object
Scene taken from UT2003 illuminated by a light source.
Conference: 282,805 triangles, 54 objects
A conference room. Each chair is a separate object and can thus be moved around. The last image shows the edge filtering performed for adaptive oversampling: additional rays are shot dynamically only for the detected edges to obtain high image quality.
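The idea of adaptive oversampling can be sketched as follows. This page does not describe how the edges are detected, so the sketch assumes a simple scheme: a pixel counts as an edge when its brightness differs from a horizontal or vertical neighbour by more than a threshold. The names detect_edges and count_rays, the threshold, and the number of extra rays are all hypothetical.

```python
def detect_edges(image, threshold):
    """Flag pixels whose brightness differs strongly from a direct neighbour."""
    height, width = len(image), len(image[0])
    edges = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Compare with the right and the lower neighbour; marking both
            # pixels of a pair covers the left and upper neighbours as well.
            for dx, dy in ((1, 0), (0, 1)):
                nx, ny = x + dx, y + dy
                if nx < width and ny < height:
                    if abs(image[y][x] - image[ny][nx]) > threshold:
                        edges[y][x] = True
                        edges[ny][nx] = True
    return edges

def count_rays(edges, extra_rays):
    """One primary ray per pixel, plus extra_rays for every edge pixel."""
    return sum(1 + (extra_rays if is_edge else 0)
               for row in edges for is_edge in row)

# A 4x2 image with a vertical brightness step in the middle.
image = [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0]]
edges = detect_edges(image, threshold=0.5)
adaptive_rays = count_rays(edges, extra_rays=4)
uniform_rays = len(image) * len(image[0]) * (1 + 4)
```

In this toy image only the four pixels along the step are flagged, so the adaptive pass shoots fewer rays than oversampling every pixel would.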
Scene 6: 806 triangles, 1 object
A quite simple scene with a single point light source.
Sven Woop, 28.06.05