Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace String with CompactString #620

Merged
merged 12 commits into from
Oct 23, 2024
Merged

Conversation

rben01
Copy link
Contributor

@rben01 rben01 commented Oct 12, 2024

Numbat stores lots and lots of strings. Most of these are pretty short: unit names, variable names, function names... names generally shorter than the 24-byte cutoff at which compact_str::CompactString moves its data from the stack to the heap. This means we can often get away with zero heap allocations for the strings Numbat stores.

This results in a massive improvement over master (far larger than my previous owned-string-to-borrowed-string changes).

$ git checkout compact-str && cargo bench
...
     Running benches/prelude.rs (target/release/deps/prelude-8391b515e47498e0)
Gnuplot not found, using plotters backend
Import prelude          time:   [23.292 ms 23.324 ms 23.358 ms]
                        change: [-19.229% -19.074% -18.924%] (p = 0.00 < 0.05)
                        Performance has improved.
...
$ git checkout master && cargo bench
...
     Running benches/prelude.rs (target/release/deps/prelude-bad416b45fab4d8e)
Gnuplot not found, using plotters backend
Import prelude          time:   [28.629 ms 28.690 ms 28.768 ms]
                        change: [+22.659% +23.007% +23.375%] (p = 0.00 < 0.05)
                        Performance has regressed.
...

@sharkdp
Copy link
Owner

sharkdp commented Oct 12, 2024

Very nice! I can confirm these results with a full-numbat-startup benchmark. master is 14% slower.

hyperfine -w10 './numbat-master -e ""' './numbat-compact-str -e ""' --export-markdown -
Command Mean [ms] Min [ms] Max [ms] Relative
./numbat-master -e "" 38.5 ± 1.1 36.3 43.1 1.14 ± 0.05
./numbat-compact-str -e "" 33.7 ± 1.0 31.8 37.0 1.00

One thing I would like to understand is the following: do we need to go "all in" on this? Or is there maybe one module/component which accounts for >90% of the speedup? The code changes are not too bad, but it's still minor a complication. Especially for contributors who are new to the code base.

@rben01
Copy link
Contributor Author

rben01 commented Oct 12, 2024

Right, it probably makes almost no different for e.g., Markup since Markup is already expected to be somewhat slow and isn't part of the core computational stuff.

@rben01
Copy link
Contributor Author

rben01 commented Oct 12, 2024

That said, I think the complexity is fairly low. to_string becomes to_compact_string, except on s: &'static str which becomes CompactString::const_new(s). format! becomes format_compact!. Almost the entire rest of the API is the same, except of course you can't .to_owned() into a CompactString and CompactString::replace returns a String, not another CompactString. So I don't think it's that much complexity. Figuring out where we need CompactString, where we don't, and documenting the difference (so that we don't get // TODO: replace with CompactString?), sounds like more complexity than just leaving it basically everywhere.

@sharkdp sharkdp merged commit 80b3f34 into sharkdp:master Oct 23, 2024
15 checks passed
@sharkdp
Copy link
Owner

sharkdp commented Oct 23, 2024

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants