Stream Compaction Status #3

WilliamKHo · 2017-11-25T07:46:12Z

branch

Okay, I finally brought my stream-compaction code into Pollux.

Bad news: it slows down rendering, and the resulting images are noticeably incorrect.

Good news: it can definitely be made to work, and we would likely see really good wins.

Info:

👍 Inspection of the buffers through GPU Frame Capture in Xcode confirms that the prefix-sum scan and scatter logic are working (woot!). On top of that, compared to the other kernel calls, the two-kernel scan-of-scans procedure takes virtually no time at all (~1100 μs for 1 million elements on my 2013 MacBook!).
👍 A slight bottleneck in the kern_evaluateRays kernel can be relieved with a little refactoring of the prefix-sum scan kernel.
👍 kernComputeIntersections and kernShadeMaterials actually do get faster as a result of stream compaction, thanks to early termination of coherent threads. The time cost of the compaction itself is just still too large.
👎 Far and away the major bottleneck is the kernScatterRays and kernCopyBack step, in which I scattered the unterminated rays into the second Ray buffer and copied them back, setting the bounces of all subsequent rays to 0. This was grossly naive, so the cost is not surprising. kernCopyBack can probably be replaced with a ping-pong step between ray bounces. kernScatterRays, however, is still a read-and-write-heavy kernel that will cost time, so the best way to offset that is to better leverage the work it lets us skip.
👎 The visual bug of the black light is caused by the fact that rays that hit lights are terminated and their remaining bounces are set to 0, which is currently the criterion by which rays get discarded. "Lit" rays thus never actually make it to FINAL_GATHER. RayCompaction could be tweaked to do a full partition of the array of all rays, hopefully a trivial task 😰. This would probably also solve the striated visual bug above, which I believe is related to errors in the final buffer of arrays passed to FINAL_GATHER as a result of compaction.
👎 One other way to leverage stream compaction is to reduce the number of threads we dispatch for kernComputeIntersections, kernShadeMaterials, and RayCompaction. As far as I can tell, this would require more synchronization between CPU and GPU, and multiple commandEncoders per iteration, since the number of compacted rays can't be known ahead of time. I don't know if this has any significant drawbacks.
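
For reference, a minimal CPU sketch of the scan-and-scatter compaction described above. It is only an illustration, not the Pollux kernels: the `(ray_id, bounces_remaining)` tuple layout and the function names `exclusive_scan` and `compact` are assumptions, and a serial loop stands in for what the GPU does in parallel.

```python
from itertools import accumulate


def exclusive_scan(flags):
    """Exclusive prefix sum: out[i] is the sum of flags[0..i-1]."""
    if not flags:
        return []
    inclusive = list(accumulate(flags))
    return [0] + inclusive[:-1]


def compact(rays):
    """Keep only rays that still have bounces left, preserving their order.

    rays: list of (ray_id, bounces_remaining) tuples (hypothetical layout).
    """
    # 1 marks a live ray, 0 marks a terminated one.
    flags = [1 if bounces > 0 else 0 for _, bounces in rays]
    # The scan gives each live ray its destination index in the output.
    offsets = exclusive_scan(flags)
    out = [None] * sum(flags)
    # Scatter: every live ray writes itself to its scanned offset.
    for i, ray in enumerate(rays):
        if flags[i]:
            out[offsets[i]] = ray
    return out


rays = [(0, 3), (1, 0), (2, 2), (3, 0), (4, 1)]
print(compact(rays))
```

Note that the ray with 0 bounces left simply disappears from the output here, which is the same behaviour that keeps "lit" rays from reaching FINAL_GATHER.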

TL;DR
Steps to make Stream Compaction fast enough and work correctly

~~Eliminate kernEvaluateRays kernel and refactor to include in first prefix sum scan~~ (doesn't save much time overall, would increase code complexity unnecessarily)
Ping-Pong Ray buffers between ray bounce calculations.
??? Second prefix-sum-scan on terminated rays so that final compacted Ray buffer is partitioned into unterminated and terminated rays. (I'm pretty sure this is the silver bullet)
??? FINAL_GATHER at every ray bounce. This might be too expensive, and I actually think the above solution would be better.
??? Find a way to dynamically change the number of threadGroups dispatched for ray bounce computations after eliminating terminated Rays
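
The "second prefix-sum scan on terminated rays" idea could look roughly like the CPU sketch below. It is an assumption about how the partition might be structured, not the actual RayCompaction code: the names `partition_rays` and `exclusive_scan` and the `(ray_id, bounces_remaining)` layout are made up for illustration.

```python
def exclusive_scan(values):
    """Return (exclusive prefix sums, total) for a list of ints."""
    out = []
    total = 0
    for v in values:
        out.append(total)
        total += v
    return out, total


def partition_rays(rays):
    """Stable partition: live rays first, terminated rays after them.

    rays: list of (ray_id, bounces_remaining) tuples (hypothetical layout).
    Returns the partitioned list and the number of live rays.
    """
    live = [1 if bounces > 0 else 0 for _, bounces in rays]
    dead = [1 - flag for flag in live]
    live_offsets, live_count = exclusive_scan(live)
    # Second scan, over the terminated flags this time.
    dead_offsets, _ = exclusive_scan(dead)
    out = [None] * len(rays)
    for i, ray in enumerate(rays):
        if live[i]:
            out[live_offsets[i]] = ray
        else:
            # Terminated rays are packed right behind the live block.
            out[live_count + dead_offsets[i]] = ray
    return out, live_count


rays = [(0, 3), (1, 0), (2, 2), (3, 0), (4, 1)]
print(partition_rays(rays))
```

Because every ray is still present in the output, a final gather over the whole buffer would still see the rays that terminated on a light, while the bounce kernels would only need to touch the first `live_count` entries.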

WilliamKHo · 2017-11-25T07:57:37Z

@YoussefV see this update. I'm pretty sure I can get this working the way it's supposed to tomorrow, but I should sleep tonight. Feel free to look at the branch and make comments/criticisms.

WilliamKHo · 2017-11-25T22:48:20Z

It works, but it's too slow. I need to pass a buffer between shaders containing information about the rays culled at each iteration, both to leverage better and earlier thread termination and so that kernScatterRays doesn't re-partition the entire array each and every time.
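
A rough CPU sketch of that idea, under several assumptions: rays are reduced to a bare bounces-remaining counter, every ray starts with at least one bounce, the serial loop stands in for the scan-and-scatter kernels, and the names `trace` and `threadgroups_needed` are hypothetical. It only shows how a live-ray count carried between bounces shrinks both the dispatch size and the range that gets compacted, with the two buffers swapped instead of copied back. Handling of terminated rays for the final gather is deliberately left out.

```python
def threadgroups_needed(live_count, threads_per_group):
    """Ceiling division: threadgroups required to cover live_count rays."""
    return (live_count + threads_per_group - 1) // threads_per_group


def trace(bounces_remaining, threads_per_group=4):
    """Run bounces until no ray is live; return threadgroups used per bounce."""
    src = list(bounces_remaining)
    dst = [0] * len(src)
    live_count = len(src)  # assumption: all rays start live
    dispatches = []
    while live_count > 0:
        dispatches.append(threadgroups_needed(live_count, threads_per_group))
        next_live = 0
        # Only the live prefix is processed and compacted, never the full array.
        for i in range(live_count):
            remaining = src[i] - 1  # stand-in for intersect + shade
            if remaining > 0:
                dst[next_live] = remaining
                next_live += 1
        # Ping-pong: swap the buffer roles instead of copying back.
        src, dst = dst, src
        live_count = next_live
    return dispatches


print(trace([3, 1, 2, 1, 3, 1, 1, 2, 1]))
```

On the GPU the live count would presumably have to live in a buffer that the next dispatch can read, which is where the extra CPU/GPU synchronization mentioned earlier comes in.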

WilliamKHo · 2017-11-25T22:53:11Z

No ugly visual bugs though, so that's good.

WilliamKHo self-assigned this Nov 25, 2017

WilliamKHo changed the title ~~Stream Compaction Issues~~ Stream Compaction Status Nov 25, 2017

YVin3D self-assigned this Nov 26, 2017

YVin3D added the enhancement label Nov 26, 2017
