-
Notifications
You must be signed in to change notification settings - Fork 341
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance of time panel #8224
Conversation
(this change makes a surprisingly large difference! gets `add_data` from 2.0 to 0.425)
gets me from 5.6 to 2.9ms in a heavy scenario with 95 calls per frame. Still not amazing
Web viewer built successfully. If applicable, you should also test it:
Note: This comment is updated whenever you push a commit. |
If you are using the latest |
I did, that was in the before state and was actually the first thing I committed :). Didn't bother though to restart my profiling for the PR description at an arbitrary commit and rather used latest main (which moved on quite a bit since I first removed those overhead inducing profile scopes). |
impl PathRecursiveChunksPerTimelineStoreSubscriber { | ||
/// Accesses the global store subscriber. | ||
/// | ||
/// Lazily registers the subscriber if it hasn't been registered yet. | ||
pub fn subscription_handle() -> ChunkStoreSubscriberHandle { | ||
static SUBSCRIPTION: OnceCell<ChunkStoreSubscriberHandle> = OnceCell::new(); | ||
*SUBSCRIPTION.get_or_init(|| ChunkStore::register_subscriber(Box::<Self>::default())) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason you made this a fully global scope subscriber as opposed to a per-recording thing directly in EntityDb?
If not, we should probably migrate this to EntityDb to be consistent with the others (and our general move towards less globally scoped things).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as is this thing is too specific to the timepanel, so I wouldn't want it to live in the entitydb directly. Having a subscription mechanism there instead would be great though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does late subscribers get all existing chunks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not today, no. Which is why I have to poke the subscriber in TimePanel::default
to ensure it's up and running before any data comes in
Related
What
Make the timepanel faster. Still plenty of room for improvement though.
Achieved by...
num_events_cumulative_per_unique_time_unsorted
DensityGraph::add_range
(this one was surprisingly significant)Testing
TL;DR: Minimized timepanel takes less than half the time it use. Other cases are better as well, but more noisy.
All testing done on the 2h airtraffic dataset (pre-saved rrd) with all panels hidden and views removed (unless noted otherwise) on my M1 Max. Note that the timeline perf is very dependent on amount of screen real estate - these tests were done maximized on the 14'' Mac's screen.
(Did some throw-away testing on other configs, but these are the ones we're comparing here!)
This round of optimization focused mostly on the "idle" case of having the time panel minimized. There are also some gains for the expanded case, but it's less significant - as illustrated by the profiler screenshots this is largely dominated
num_events_cumulative_per_unique_time
which I hope we can solve with chunk processors (and some untracked overhead I haven't looked into yet).Before:
Frame cpu time without profiler attached hovers around 4.2ms with some outliers.
Attaching the profiler doesn't tell us much since the profiler overhead drowns out everything else:
Restarting without profiler and expanding the time panel and making it as high as possible gives us about 12ms with frequent spikes beyond 13ms
Profiler overhead is ironically not as significant. Averaging a few frames tells us the time panel is at 11.5ms
After
Frame cpu time without profiler attached hovers between 1.5ms and 2.8ms, rather unsteady
Averaging a bunch of frames tells us that the data_density_graph takes now 0.55ms (imho still pretty bad for that it is)
Restarting without profiler and expanding the time panel and making it as high as possible gives us around 11ms
(important: this picture hasn't changed otherwise!)
The time panel now takes 9.4ms (that is definitely still very bad!), profiler overhead is still significant but it's manageable: