Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Profiling support #743

Merged
merged 11 commits into from
Nov 19, 2024
Merged

Profiling support #743

merged 11 commits into from
Nov 19, 2024

Conversation

Vrixyz
Copy link
Contributor

@Vrixyz Vrixyz commented Sep 27, 2024

This PR uses https://crates.io/crates/profiling so we can choose any supported profiling options. This is also helpful for users to get an understanding of performances.

using tracy on all_examples3 (cargo run --bin all_examples3 --features profiling/profile-with-tracy)

Screenshot from 2024-09-27 12-01-10

Should we remove counters from rapier ? I imagine it would be great to be able to output the same kind of csv we're doing currently, but using this profiling option.

.vscode/settings.json Outdated Show resolved Hide resolved
@@ -21,6 +27,7 @@ anyhow = "1"
bitflags = "2"
# NOTE: we are not using the (more recent) urdf-rs crate because of https://github.com/openrr/urdf-rs/issues/94
xurdf = "0.2"
profiling = "1.0"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the profiling dependency there but I think I didn´t mark any functions

@@ -1552,6 +1552,7 @@ fn update_testbed(
if state.running == RunMode::Step {
state.running = RunMode::Stop;
}
profiling::finish_frame!();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

profiling::finish_frame!(); belongs here (in user code), as rapier isn't strictly the owner of the loop.

@Vrixyz Vrixyz requested a review from sebcrozet September 27, 2024 10:10
Copy link
Member

@sebcrozet sebcrozet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great, thank you! A few remarks:

  • Does #[profiling::function] generate code or slow down compilation when no backend is specified? If that’s the case, then profiling should be behind a feature.
  • The flamegraph is expanding vertically a lot with empty space. Can the vertical size be constrained?
  • The table is also quite big vertically, it would be nice if it was smaller.
  • On the testbed, we should remove from the flamegraph and table anything that isn’t related to rapier (like all the wgpu stuffs).

Let’s keep the rapier counters for now, until I get more familiar with this new profiling tool.

.vscode/settings.json Outdated Show resolved Hide resolved
@Vrixyz
Copy link
Contributor Author

Vrixyz commented Nov 4, 2024

  • Does #[profiling::function] generate code or slow down compilation when no backend is specified? If that’s the case, then profiling should be behind a feature.

No runtime impact should appear: from their docs :

  • "If the end-user of your library doesn't use profiling, the macros in this crate will emit no code at all.

Concerning compilation time, I didn't notice impact.

See attached compiling timings differences from this branch (without profiling enabled) with master:

timings_profiling_vs_no_profiling.zip

@Vrixyz
Copy link
Contributor Author

Vrixyz commented Nov 4, 2024

  • The flamegraph is expanding vertically a lot with empty space. Can the vertical size be constrained?
  • The table is also quite big vertically, it would be nice if it was smaller.
  • On the testbed, we should remove from the flamegraph and table anything that isn’t related to rapier (like all the wgpu stuffs).

I imagine the best solution would be to filter performance information, unfortunately it's complicated to make on egui_puffin:

I can imagine a solution by providing an automatic zoom into our relevant time span, by finding harness::step_with_graphics scope id, then retrieving its scope data from its stream, like it's done for the built-in UI.

Figuring this out is not trivial and it would not support selecting multiple frames.

Strictly removing data from wgpu etc doesn't seem supported by profiling nor egui_puffin.

As an ideal solution would require potentially controversial PR to puffin_egui, I settled for a simple approach, with a default filter to puffin_egui, which highlights our rapier specific profiling data. This needs user manual zoom.

I removed some function profiling markers to help with information overload.

adapting the egui render of the puffing profiler is not trivial either, so I shoved it into its own window, which is collapsed by default:

image
image

@sebcrozet
Copy link
Member

Not sure if I’m missing something but I’m not seeing any profiling now:
image

Note that I modified the cargo patches to use the git dependency to your branch expose_ui_options for puffin_egui.

Copy link
Member

@sebcrozet sebcrozet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
I don’t know when puffin will merge your changes. So I think that in the mean time we move the changes needing the cargo patches behind a feature (#[unstable-puffin-pr-235) so we can use it locally without blocking our capability to publish the testbed.

Cargo.toml Outdated
Comment on lines 42 to 44
# puffin_egui = { optional = true, path = "../puffin/puffin_egui" }
# puffin = { optional = true, path = "../puffin/puffin" }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# puffin_egui = { optional = true, path = "../puffin/puffin_egui" }
# puffin = { optional = true, path = "../puffin/puffin" }

@Vrixyz
Copy link
Contributor Author

Vrixyz commented Nov 15, 2024

outside the scope of this PR, but running the testbed in AutoNoVsync mode is interesting to get less visual noise about wgpu : image

But this would probably need some adaptation to the loop to avoid being a strict fixed timestep each draw calls: currently: higher FPS means faster simulation speed.

@@ -34,9 +34,11 @@ default = ["dim2"]
dim2 = []
parallel = ["rapier/parallel", "num_cpus"]
other-backends = ["wrapped2d"]
profiling = ["dep:puffin_egui", "profiling/profile-with-puffin"]
unstable-puffin-pr-235 = []
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add comment to explain the patch needed

@Vrixyz Vrixyz merged commit 510237c into dimforge:master Nov 19, 2024
7 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants