Very high CPU usage on MacOS #10261
Comments
This is a long-standing bug, but it's nice to see it seemingly isolated to Mac. Could we see a tracing log? https://github.com/bevyengine/bevy/blob/main/docs/profiling.md I'd also like to try stripping out even more things from the |
I've created a tracing log using Is there any way I can send it to you, via Perfetto or otherwise? |
Maybe a GDrive link? Feel free to delete it after a couple days. @superdump it looks like we're getting hit by excessively long prepare systems again 🤔 |
@alice-i-cecile - here's a Google Drive link for the trace, I hope it works for you: https://drive.google.com/file/d/1DxPcWasjVDRJDlL1_PSQh-5qhpxEeA3p/view?usp=sharing |
Trace downloaded and open locally, thank you very much! Pulling this up in Perfetto, I'm still just seeing extremely long Prepare systems. I wonder if the way that wgpu polls is different on Mac? Perhaps it's spinning a thread to try and vsync automatically? |
Not sure how helpful this is, but I did try a couple of the more simple But please keep in mind that I'm new to this entire ecosystem (including Rust itself), so YMMV... |
That's very helpful feedback. This might be related to #9964. |
If you reduce frame rate significantly, e.g. with |
Reduce frame rate to what? |
Like 1-10 FPS... |
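For anyone who wants to try the low-frame-rate experiment, here is a rough, stdlib-only sketch of the arithmetic behind a frame cap. It is not a Bevy or winit API, and it is not the setting mentioned above (that part of the comment did not survive). The function names `frame_interval` and `idle_time` and the 2 ms work figure are made up for illustration.

```rust
use std::time::Duration;

/// Time available per frame at the given target frame rate.
fn frame_interval(target_fps: f64) -> Duration {
    Duration::from_secs_f64(1.0 / target_fps)
}

/// How long a frame loop could sleep after doing `work` for this frame.
/// Saturates at zero when the work already exceeds the interval.
fn idle_time(target_fps: f64, work: Duration) -> Duration {
    frame_interval(target_fps).saturating_sub(work)
}

fn main() {
    // Hypothetical per-frame work, purely for illustration.
    let work = Duration::from_millis(2);
    for fps in [1.0, 10.0, 60.0] {
        let idle = idle_time(fps, work);
        println!("{fps} FPS: about {idle:?} of idle time per frame");
    }
}
```

If the process still burns a large share of a core at 1-10 FPS, where nearly the whole interval should be idle, that would suggest the load comes from waiting rather than from per-frame work.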
I'm trying to reduce the amount of code necessary to reproduce this over on hmans/bevy-mrp-macos-performance#1. After looking into it more, aren't we hitting the warning from the profiling docs (https://github.com/bevyengine/bevy/blob/main/docs/profiling.md#runtime):
? If that's the case, I'd love a dedicated issue for that. |
Seems related to #5713 |
I noticed that my MacBook was running hot, and I found this issue. Upon running the starting example with |
So I dug into this about 5 months ago and found the same problem somewhere in the render pipeline. The discussion and profiling results are in the Discord thread below. It looked like it may have been a contention issue in wgpu. That was about where I stopped, because I needed to make a wgpu-only PoC to see if it was a wgpu-on-Mac issue. Also, this research was done on an Intel Mac. https://discord.com/channels/691052431525675048/743663924229963868/1119076766405886004 From the Discord:
|
I've just checked the current |
We are interested in this improving! But this needs further investigation to pinpoint: it's still not at all clear where in the stack this is occurring. Does this occur with |
I just tried exactly this today (current |
Great :) I would double-check at https://github.com/gfx-rs/wgpu/releases/tag/v0.19.4 which is the version Bevy is currently on, but I would suspect that there seems to be something wrong with Bevy's implementation then. It looks like the various Prepare steps are at fault. I'll ask the rendering experts if they might have insight there. I suspect we're spin locking while waiting to coordinate between the various steps somewhere, which is manifesting as high CPU usage without actually doing useful work. Are you able to achieve reasonable results (more than 50k entities) on the Bevymark example? |
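To make the spin-lock hypothesis concrete, here is a small stdlib-only sketch contrasting a busy-wait with a blocking wait. It is not taken from Bevy or wgpu, and nothing in this thread confirms that either of them waits this way. `spin_wait` and `blocking_wait` are hypothetical names, and the 50 ms delay is arbitrary.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::Duration;

/// Busy-waits on a flag and returns how many times the flag was polled.
/// The thread stays runnable the whole time, so it shows up as CPU load.
fn spin_wait(ready: &AtomicBool) -> u64 {
    let mut polls = 0u64;
    while !ready.load(Ordering::Acquire) {
        polls += 1;
        std::hint::spin_loop();
    }
    polls
}

/// Blocks on a channel; the thread is parked until a message arrives,
/// so it uses essentially no CPU while waiting.
fn blocking_wait(rx: &mpsc::Receiver<()>) -> bool {
    rx.recv().is_ok()
}

fn main() {
    let delay = Duration::from_millis(50); // arbitrary stand-in for "the other step"

    // Variant 1: spin until another thread sets the flag.
    let ready = Arc::new(AtomicBool::new(false));
    let setter = Arc::clone(&ready);
    let t = thread::spawn(move || {
        thread::sleep(delay);
        setter.store(true, Ordering::Release);
    });
    let polls = spin_wait(&ready);
    t.join().unwrap();
    println!("spin wait polled the flag {polls} times");

    // Variant 2: block until another thread sends a message.
    let (tx, rx) = mpsc::channel();
    let t = thread::spawn(move || {
        thread::sleep(delay);
        tx.send(()).unwrap();
    });
    let got = blocking_wait(&rx);
    t.join().unwrap();
    println!("blocking wait received a message: {got}");
}
```

Both variants wait the same wall-clock time, but only the first keeps a core busy, which is the "high CPU usage without useful work" pattern described above.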
@hmans could you try forcing a scale_factor of 1.0 for the window and see if it makes a difference? For some reason macOS reports a really high scale factor by default. |
I tested with 0.11.3 and 0.14.0-rc.2 and observe the same. Using scale_factor of 1.0 makes no difference. It hovers around 40-50% CPU in Activity Monitor, top, htop. I looked at a GPU frame trace in Xcode and it's only running the no_camera_clear_pass and using microseconds of time. Oddly it said that pass was using 10800 vertices. No idea why. I looked at a CPU profile using Instruments and that shows no significant CPU usage. Running for 20s put CPU cycles spent for frame updates (at 120Hz) at about the same as plugin initialisation. |
bevy-mrp-macos-performance.gputrace.zip |
And just to note that I think tracy traces showing waiting on prepare set systems makes sense as they are blocked by waiting on vsync for a new swapchain texture. |
I've tried 0.14 and the issue is still persisting. Since wgpu itself does not seem to have this issue, from the perspective of a naive outsider who's not familiar with either code bases, my best guess would be that wgpu uses vsync out of the box (idling until it's time to render the next frame) and Bevy doesn't (at least on macOS); however, even the "fps_overlay" example that renders literally nothing uses ~100% CPU while displaying an FPS count of 100 (on my 100Hz screen) when focused, and ~60 FPS when unfocused. So it's probably not vsync vs. unlocked. A couple of folks on Reddit pointed me towards the (new?) |
I know I've taken my good time with doing this (apologies), but I now got around to checking that specific version, and its Bevy 0.14 appears to use wgpu 0.20, which exhibits the same performance. 19% still feels a little too high for what the example is doing, but it's nowhere near the 50%-200% that I'm seeing with simple Bevy examples. As mentioned on Reddit earlier today, I'm (stupidly and naively, and very likely unsuccessfully) trying to track this issue down myself now. Just wanted to post these numbers here for reference. |
Can you try with |
i didn't bother
And the result with just Actually, it looks the same across all settings. Something really weird is going on here. |
What are the chances this issue is related to how bevy is calling winit? Bevy's entire update loop is currently running in the See |
Hey, I was just reading the issue, and that is highly likely to be the cause. It should not, however, appear when the app is compiled in release mode. If anyone can provide a recent MPC example of this problem, I would gladly help pinpoint the exact issue. I've worked closely with the macOS windowing APIs, so if the problem is located in there, it should be easy to spot. |
I've done some basic profiling, and it does look like the CPU usage is entirely on Bevy's side: in release mode, the time spent in the main update loop (in ms) divided by 16.7 roughly equals the CPU usage. The high CPU usage that I've found is more likely due to blocking operations, caused either by multi-core/multi-threading overhead or by async overhead. |
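As a worked version of that rule of thumb, here is a tiny sketch. It assumes the 16.7 is the frame budget in milliseconds at 60 Hz, which is my reading and not stated explicitly above. The 8.0 ms busy time is a made-up input, not a measurement from this thread, and the function names are hypothetical.

```rust
/// Frame budget in milliseconds for a given refresh rate.
fn frame_budget_ms(refresh_hz: f64) -> f64 {
    1000.0 / refresh_hz
}

/// Estimated share of one core kept busy, in percent, if the update loop
/// is busy for `busy_ms_per_frame` out of every frame.
fn estimated_cpu_percent(busy_ms_per_frame: f64, refresh_hz: f64) -> f64 {
    100.0 * busy_ms_per_frame / frame_budget_ms(refresh_hz)
}

fn main() {
    let budget = frame_budget_ms(60.0);
    println!("frame budget at 60 Hz: {budget:.1} ms");

    // Hypothetical figure: 8.0 ms of busy time per frame.
    let cpu = estimated_cpu_percent(8.0, 60.0);
    println!("8.0 ms busy per frame is roughly {cpu:.0}% of one core");
}
```

Under this reading, the ~50% figure from the original report would correspond to roughly 8 ms of busy time per 60 Hz frame, whether that time is real work or just waiting.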
If this still helps - I'm here with a M1 chip (Macbook Pro 2020 M1 16GB RAM).
One interesting thing I noticed is that the CPU jumps considerably when you start moving the mouse over the window while it's not focused. Here's a short example: CleanShot.2024-12-07.at.20.39.31.mp4 (not sure if it's relevant, just thought it might be) Note: A very comparable result happens with several examples from the crate that I tried. Note 2:
Does not make a difference compared to just lmk if I can help reproducing anything, or trying out a fix 🙏🏻 |
I ran into a similar issue on iOS (not Mac though). A simple example like https://bevyengine.org/examples/3d-rendering/3d-scene/ running on iOS (with Bevy 0.14.2) has 100%+ CPU usage. I have not tried the --release flag yet. |
Summary
On MacOS, even simple Bevy apps cause very high CPU load; for example, even a blank app that just loads DefaultPlugins to render a blank window sits at ~50% CPU usage on an Apple Silicon M1. Add a couple of rendered meshes and postprocessing and it'll quickly go up to ~90% CPU.
On Windows and Linux, the same app will comfortably sit below 1% (even with some light rendering).
Additional notes:
opt-level = 3 etc. trick from the docs.
--release build doesn't make a difference.
Bevy version
0.11.3 (but I also tried 66f72dd, the newest commit on main at the time.)
Relevant system information
What you did
What went wrong
Expected behavior: the app should cause CPU load in the single-digit percentage range, like it does on Windows (<1% CPU, on a Ryzen 7.)
Actual behavior: the app caused ~50% CPU load.