Date | 23.06.2017 | Size | 9.66 MB |
Ray Tracing on Programmable GPUs
Fragment Processing Features
Rich instruction set
- No branching yet (see PS 3.0 spec)
Floating point
- Arithmetic
- Texture memory
Dependent texturing
Multipass rendering flow control
The Ray Engine [Carr02]
Ray Engine – Main Idea
Ray-triangle intersection done by GPU
- CPU-based renderer does everything else
Ray Engine Algorithm
Renderer sends ray textures to GPU
Renderer sends ‘triangles’ down pipeline
- Vertex interpolants of a screen-aligned quad
GPU performs ray-triangle intersection tests
- Short fragment program
- Framebuffer stores closest hit point
Renderer reads back closest hit
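The algorithm above can be sketched on the CPU to make the data flow concrete. This is a minimal illustration, not the Ray Engine's fragment program: the intersection test used here is the Möller-Trumbore formulation, chosen as an assumption, and the names `intersect` and `closest_hits` are hypothetical. The list of rays stands in for the ray textures, each loop iteration over a triangle stands in for one screen-aligned quad, and the `best` list plays the role of the framebuffer that keeps the closest hit per pixel.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def intersect(origin, direction, tri, eps=1e-9):
    """Return the hit distance t along the ray, or math.inf on a miss.

    Assumed test: Moller-Trumbore (not confirmed by the slides).
    """
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:            # ray is parallel to the triangle plane
        return math.inf
    s = sub(origin, v0)
    u = dot(s, p) / det           # first barycentric coordinate
    q = cross(s, e1)
    v = dot(direction, q) / det   # second barycentric coordinate
    t = dot(e2, q) / det          # distance along the ray
    if u < 0 or v < 0 or u + v > 1 or t <= eps:
        return math.inf
    return t

def closest_hits(rays, triangles):
    """Test every ray against every triangle; keep the nearest hit per ray."""
    best = [(math.inf, -1)] * len(rays)          # stand-in for the framebuffer
    for tri_id, tri in enumerate(triangles):     # one 'quad' per triangle
        for pixel, (origin, direction) in enumerate(rays):
            t = intersect(origin, direction, tri)
            if t < best[pixel][0]:               # depth-test style update
                best[pixel] = (t, tri_id)
    return best                                  # the 'readback' step
```

Calling `closest_hits(rays, triangles)` returns one `(t, triangle_id)` pair per ray, with `(math.inf, -1)` marking rays that hit nothing.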
Pixel Shader 1.4 Implementation
Full Precision Simulations
Ray Engine Results
- 114M ray-triangle intersections/s
Full precision simulator
Ray Engine Summary
GPU performs ray-triangle intersection
- CPU-based renderer does everything else
Raw ray-triangle intersection rate is faster than CPU-based approach
- Total rays processed per second is slower than CPU
Readback limited
Streaming Ray Tracer [Purcell02]
Streaming Ray Tracer – Main Ideas
Entire ray tracing computation can be done efficiently on the GPU
Stream processor abstraction for programmable fragment processor
Streaming Ray Tracer
GPU Abstraction
Texture memory is memory
- Think of dependent texture fetches as pointer dereferencing
Programmable fragment processor is a programmable stream processor
- Think of multipass rendering as stream and kernel programming
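The "dependent fetch as pointer dereference" idea can be illustrated with plain lists standing in for textures. The three-texture layout below (`grid_tex`, `list_tex`, `vertex_tex`) and the helper names are hypothetical, picked only to show how the value read from one texture becomes the address used in the next lookup. The `while` loop is a CPU convenience; the slides note that fragment programs had no branching yet.

```python
# Hypothetical layout: each list stands in for a 1D texture.
grid_tex = [0, 3, 5]                  # per grid cell: start offset into list_tex
list_tex = [0, 2, -1, 1, -1, -1]      # triangle ids; -1 terminates each cell's list
vertex_tex = [                        # per triangle id: three vertices
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
    ((0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)),
    ((0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0)),
]

def fetch(texture, address):
    """Stand-in for a texture lookup: read one texel at an integer address."""
    return texture[address]

def triangles_in_cell(cell):
    """Follow grid -> list -> vertex data, as a chain of dependent fetches."""
    result = []
    ptr = fetch(grid_tex, cell)               # first fetch returns an address
    while True:
        tri_id = fetch(list_tex, ptr)         # dependent fetch: address came from a texel
        if tri_id < 0:                        # end-of-list marker
            break
        result.append(fetch(vertex_tex, tri_id))  # second level of indirection
        ptr += 1
    return result
```

For example, `triangles_in_cell(0)` walks the list stored at offset 0 and returns the vertex data for triangles 0 and 2.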
Texture Memory Organization
Stream Programming Model
Programmable fragment processor is essentially a stream processor
- Stream is a set of data records
- Kernels operate on records
- Streams connect kernels together
- Kernels can read global memory
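A small sketch of the model listed above, assuming a toy scene: each kernel is a function applied independently to every record of its input stream, the output stream of one kernel feeds the next, and every kernel may read a shared read-only `memory` dictionary that stands in for texture memory. The kernel names, the record fields, and the single-plane scene are illustrative assumptions, not the kernels of the streaming ray tracer itself.

```python
def run_pass(kernel, stream, memory):
    """One 'rendering pass': apply a kernel to every record independently."""
    return [kernel(record, memory) for record in stream]

def generate_ray(pixel, memory):
    """Kernel 1: pixel record -> ray record (eye position read from global memory)."""
    x, y = pixel
    return {"pixel": pixel, "origin": memory["eye"], "dir": (x, y, 1.0)}

def intersect_plane(ray, memory):
    """Kernel 2: ray record -> hit record for the plane z = memory['plane_z']."""
    oz, dz = ray["origin"][2], ray["dir"][2]
    t = (memory["plane_z"] - oz) / dz if dz != 0 else float("inf")
    if t <= 0:                       # plane is behind the ray origin: treat as a miss
        t = float("inf")
    return {"pixel": ray["pixel"], "t": t}

def shade(hit, memory):
    """Kernel 3: hit record -> (pixel, color) record."""
    missed = hit["t"] == float("inf")
    return (hit["pixel"], memory["background"] if missed else memory["surface"])

def render(pixels, memory):
    """Streams connect the kernels: each pass consumes the previous pass's output."""
    rays = run_pass(generate_ray, pixels, memory)
    hits = run_pass(intersect_plane, rays, memory)
    return run_pass(shade, hits, memory)
```

With `memory = {"eye": (0.0, 0.0, 0.0), "plane_z": 5.0, "surface": "white", "background": "black"}`, `render([(0.0, 0.0)], memory)` produces one shaded record per input pixel. Because no kernel keeps state between records, each pass could process its stream in any order or in parallel.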
Streaming Flow Control
Multiple Rendering Passes
Demos
Streaming Ray Tracer Results
Simulations
- 50M – 200M ray-triangle intersections/s
Radeon 9700 Pro Implementation
- 100M ray-triangle intersections/s
- 300K – 4.0M rays/s
Streaming Ray Tracer Summary
Entire ray tracing computation can be mapped efficiently to the GPU
Stream processor is a good abstraction for a programmable fragment processor
Dedicated Hardware Ray Tracing
Ray Tracing in Hardware
Offline Rendering
Interactive Rendering
SaarCOR – Main Idea
Scalable and efficient real-time hardware ray tracer
- Implementation based on Saarland RTRT
SaarCOR Implementation
Packet-based ray tracer
Several custom cores
- Computational units
- Traversal, intersection, ray generation and shading
- Memory units
Multithreaded
Standard DRAM memory on board
Virtual memory support for large scenes
Support for programmable shading
SaarCOR Architecture
SaarCOR Test Scenes
Simulated Performance
Simulated Bandwidth Usage
SaarCOR Summary
Scalable and efficient
- Requires fewer FP units than GeForce3
- Low bandwidth requirements
Fast frame rates
Conclusions for Part I
Conclusions
Real-time ray tracing advantages
- Physically correct renderings
- High geometric complexity
- Shading flexibility
Several options for real-time ray tracing
Backup
Acknowledgments
Ian Buck, Bill Mark, Pat Hanrahan
James ‘RTD’ Percy, Pradeep Sen, Eric Chan
Matt Papakipos, Kurt Akeley – NVIDIA
Bob Drebin, Mark Peercy – ATI
Sponsors
- ATI, MERL, NVIDIA, Sony, Sun
- DARPA
Ray-Triangle Intersection as a Crossbar
Rasterization as a Crossbar