QoL improvement for displaying GPU profiling information #59
Comments
I think this would be a great improvement! Possibly behind a feature toggle? If you're going to give this a shot and run into problems, feel free to reach out and we can look at solutions together.
One solution we discussed was to display a total duration per "per thread" stream, i.e. right after …

For recombining a CPU recording with the GPU timestamps, we envision some small refactors to be able to take out the …

Finally, for mapping the GPU timeline to the CPU timeline, we currently perform some improper tricks such as calling …
We currently do the same trick with …

I'm not too familiar with the code base yet, but what you're suggesting makes sense to me. @emilk Do you have any opinions around this?
I'd implement the syscalls that feed …

Next we provide the …

I can PR this if everyone agrees on this solution :)
If you want separate time-lines, then it sounds like you should just report CPU times and GPU times to two different places! Use …
The problem
Currently, if you want to display GPU timestamps in puffin, you are forced to open a stream and manually convert all your measured timings from the GPU timeline to the associated CPU timeline (see the sketch after the cases below). Since GPU work is always executed an arbitrary time after a submit call, we would either need to:
A: Wait on a fence before proceeding to the next frame to associate GPU timings with the current measured frame
B: Read back timings from n frames back and display them in the current puffin frame
C: Modify a previously written frame with the measured GPU timings
In case A, puffin displays the total CPU frame plus the time elapsed from CPU work submission to GPU work execution, along with the total GPU frame. While technically correct, this results in a total frame time that is significantly higher than the actual frame time of the app, which is not very useful for profiling the total frame time since in reality the work overlaps. It is also not an option in general, since it forces single buffering.
In case B, puffin displays the frame measured n frames ago on the timeline, before puffin's measured CPU frame times. This is also technically correct, but it suffers from the same issue as case A (the total frame time is not very useful). However, it allows double/triple buffering without too much trouble.
Case C is not something we use puffin for, since we use it for real-time visualization of CPU and GPU workloads. It could, however, be worked around by allowing a frame visualization delay, but that would be a different issue altogether.
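For reference, here is a minimal sketch of the manual conversion described above, assuming Vulkan-style timestamp queries. All names (`GpuClockCalibration`, `gpu_ticks_to_cpu_ns`, the calibration fields) are illustrative assumptions, not puffin or Vulkan API; the actual write into a puffin stream is left as a comment because it depends on the puffin version in use.

```rust
// Illustrative sketch only: mapping GPU timestamp-query results (in ticks)
// onto the CPU clock that puffin uses. All names here are assumptions.

/// Calibration captured once per frame, as close to submit time as possible.
struct GpuClockCalibration {
    /// CPU time (e.g. `puffin::now_ns()`) sampled next to the GPU reference timestamp.
    cpu_reference_ns: i64,
    /// GPU timestamp (in ticks) taken at roughly the same instant,
    /// e.g. a timestamp written at the top of the command buffer.
    gpu_reference_ticks: u64,
    /// Nanoseconds per GPU tick (Vulkan: `VkPhysicalDeviceLimits::timestampPeriod`).
    ns_per_tick: f64,
}

impl GpuClockCalibration {
    /// Map a raw GPU timestamp onto the CPU timeline.
    fn gpu_ticks_to_cpu_ns(&self, gpu_ticks: u64) -> i64 {
        let delta_ticks = gpu_ticks as i64 - self.gpu_reference_ticks as i64;
        self.cpu_reference_ns + (delta_ticks as f64 * self.ns_per_tick) as i64
    }
}

/// `scopes` holds (name, begin_ticks, end_ticks) for each measured GPU scope.
fn report_gpu_scopes(cal: &GpuClockCalibration, scopes: &[(&str, u64, u64)]) {
    for (name, begin_ticks, end_ticks) in scopes {
        let begin_ns = cal.gpu_ticks_to_cpu_ns(*begin_ticks);
        let end_ns = cal.gpu_ticks_to_cpu_ns(*end_ticks);
        // Here the converted scope would be written into a manually opened
        // puffin stream; the exact call depends on the puffin version in use.
        println!("{name}: {begin_ns}..{end_ns} ns (CPU timeline)");
    }
}
```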
Suggestion
I believe puffin would greatly benefit from the ability to separate timelines. Currently data is separated per thread, but every thread still shares the same timeline, and thus the same total execution time. For GPU timings it makes more sense to be able to define a separate timeline that has its own "frame time", while potentially still being able to associate CPU and GPU work.
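Purely to make the suggestion concrete, a separate timeline could look something like the hypothetical sketch below. None of these types or methods exist in puffin today; they are assumptions for illustration only.

```rust
// Hypothetical sketch of a "separate timeline" API; none of these types or
// methods exist in puffin, they only make the suggestion concrete.

/// A timeline with its own clock domain and its own per-frame duration,
/// drawn as its own lane in the viewer instead of sharing the CPU timeline.
struct Timeline {
    name: &'static str,
    /// (scope name, begin_ns, end_ns), in this timeline's own clock.
    scopes: Vec<(String, i64, i64)>,
}

impl Timeline {
    fn new(name: &'static str) -> Self {
        Self { name, scopes: Vec::new() }
    }

    /// Report a scope using timestamps from this timeline's own clock;
    /// no conversion to the CPU clock is required.
    fn report_scope(&mut self, scope_name: &str, begin_ns: i64, end_ns: i64) {
        self.scopes.push((scope_name.to_owned(), begin_ns, end_ns));
    }

    /// Per-timeline frame time: the span of the scopes in this timeline only.
    fn frame_time_ns(&self) -> i64 {
        let begin = self.scopes.iter().map(|s| s.1).min().unwrap_or(0);
        let end = self.scopes.iter().map(|s| s.2).max().unwrap_or(0);
        end - begin
    }
}
```

In such a model a GPU pass could be reported with raw GPU timestamps, and the viewer could show the GPU lane with its own frame time next to the CPU lane, optionally linking a GPU scope back to the CPU scope that submitted it.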
Alternatives
We currently have a small local workaround that shifts the measured GPU timings to the start of the current puffin frame, essentially sidestepping the inflated total frame time when needed. However, this workaround is not ideal and does not give correct context as to where the GPU workload was fired off.
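For context, the workaround is roughly equivalent to the following sketch (assumed names, not the actual local code):

```rust
// Rough sketch of the local workaround described above: shift GPU scopes so
// the earliest GPU timestamp lines up with the start of the current puffin
// frame. All names are assumptions.

struct GpuScope {
    name: String,
    begin_ns: i64,
    end_ns: i64,
}

/// `frame_start_ns` would typically be the CPU time captured at the start of
/// the current puffin frame (e.g. `puffin::now_ns()` right after `new_frame`).
fn shift_to_frame_start(scopes: &mut [GpuScope], frame_start_ns: i64) {
    let Some(earliest) = scopes.iter().map(|s| s.begin_ns).min() else {
        return;
    };
    let offset = frame_start_ns - earliest;
    for scope in scopes.iter_mut() {
        // Relative GPU timings stay intact, but the real relationship to the
        // CPU work that submitted them is lost.
        scope.begin_ns += offset;
        scope.end_ns += offset;
    }
}
```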