Sim: Faster type coercion in toID
#10619
Ok, technically, this breaks for boolean and others. But there's no good reason to be passing those in the first place imo. Simple fix is to do '' + text more broadly, but I'll just put this on the backburner. Edit: aaand now I remember the original code doesn't handle bool anyway. Sigh.
The reason why So, the fact that |
I understand that's the intent but based on #10549 (comment) The function's contract is just unnecessarily weak.
That's true.
Yea. Changing that means that each of these functions would need a strict and a lax variant, thus doubling mental overhead from function names. And god forbid refactoring bugs, and having to think about |
To me, the more pressing source of that is how many states the system can be in. I think a simple system comes from predictable data flow. Lazy loading and Anyway, the fact that I can't meaningfully profile this ATM indicates I should be working on other problems. |
For the record, I do admit it was probably a mistake to use this function in hot code. I don't mind a separate |
51f29d3 to f462716
This reverts commit f462716. It would break negatives and decimals.
Re-opened because this is now a measurable improvement.
Thanks!
Derivative of #10606
Stats
npm run full-test got 2.5% faster (4,000,000 calls per row)
About 60M calls total. Before the test perf PR, there were ~100M.
So, this change makes sense because the overwhelming majority of inputs are already strings.
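A minimal sketch of the idea, assuming the change amounts to checking for a string first so that the common case skips coercion entirely. The name toIDSketch and the simplified handling of objects (only an `id` property) are illustrative assumptions, not the PR's actual diff:

```typescript
// Hypothetical stand-in for toID; not the code from this PR.
function normalizeToID(text: string): string {
    return text.toLowerCase().replace(/[^a-z0-9]+/g, '');
}

function toIDSketch(text: unknown): string {
    // Fast path: the overwhelming majority of inputs are already strings.
    if (typeof text === 'string') return normalizeToID(text);

    // Slow path: numbers are stringified, so negatives and decimals keep their digits.
    if (typeof text === 'number') return normalizeToID('' + text);

    // Assumption: objects contribute their `id` property if it is a string.
    if (text !== null && typeof text === 'object') {
        const id = (text as {id?: unknown}).id;
        if (typeof id === 'string') return normalizeToID(id);
    }

    // Booleans, null, undefined and everything else map to the empty ID.
    return '';
}
```

With this shape, a string input costs one typeof check before the lowercase-and-replace step, and the rarer input types pay for the extra branches.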
When I doubled the 'toLowerCase().replace()' expression, runtime increased by 4%, indicating that there's still decent time to be saved here.
Next step would be to find the codepaths that trigger 'same as previous text'.
Up to 8.76M of the 60M calls are redundant. (This count catches "AAABBBCCC" sequences, but not "ABCABCABC".)
It's not practical to remove all the repeated calls, but removing the easy cases is a good start.
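One way the 'same as previous text' count could be collected is sketched below. RepeatCounter is a hypothetical helper, not the profiling code behind the numbers above; it only compares each input with the one immediately before it, which is why it sees "AAABBBCCC" repeats but not "ABCABCABC".

```typescript
// Hypothetical profiling helper: counts calls whose input equals the previous call's input.
class RepeatCounter {
    total = 0;
    repeats = 0;
    private previous: unknown = undefined;
    private seenAny = false;

    record(input: unknown): void {
        // Only the immediately preceding input is remembered.
        if (this.seenAny && input === this.previous) this.repeats++;
        this.total++;
        this.previous = input;
        this.seenAny = true;
    }
}

// Usage: feed it every input that would be passed to toID.
const demoCounter = new RepeatCounter();
for (const ch of ['A', 'A', 'A', 'B', 'B', 'B']) demoCounter.record(ch);
console.log(`${demoCounter.repeats} of ${demoCounter.total} calls repeated the previous input`);
```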
Then, search for the code paths that most frequently pass IDs to .get or toID. (These measurements are probably heavily skewed towards sim code, so the last two measures might not be worthwhile in practice.)
After that, it might help to do a preliminary regex to return early on strings that are already IDs.
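The early return could look roughly like this. The function name and the exact pattern are assumptions for illustration, and whether the extra regex test pays for itself would need to be measured:

```typescript
// Assumption: an ID consists only of lowercase ASCII letters and digits (possibly empty).
const ALREADY_AN_ID = /^[a-z0-9]*$/;

// Hypothetical variant that takes strings only, to keep the sketch short.
function toIDWithEarlyReturn(text: string): string {
    // Strings that are already IDs are returned as-is, with no new string allocated.
    if (ALREADY_AN_ID.test(text)) return text;
    return text.toLowerCase().replace(/[^a-z0-9]+/g, '');
}
```

The trade-off is that inputs which are not already IDs pay for two regex passes instead of one, so the benefit depends on how often callers pass IDs.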
After that, you could add a small cache of name->id, built with data from Dex. Other than speed, this can reduce sim/dex* mem usage by a little since you can re-use the string literal IDs when building the cache, instead of the computed IDs. String literals are interned so they already live forever. Might as well consider having dex* objects hold onto the interned strings instead of the computed strings.
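A rough sketch of such a cache, assuming the data entries expose a display name and a literal ID. buildIDCache, cachedToID and the entry shape are hypothetical names for illustration, not existing Dex APIs:

```typescript
// Hypothetical entry shape; the real data tables may differ.
interface NamedEntry {
    id: string;
    name: string;
}

function computeID(text: string): string {
    return text.toLowerCase().replace(/[^a-z0-9]+/g, '');
}

// Build the name -> id map once, storing the literal IDs from the data
// rather than freshly computed copies.
function buildIDCache(entries: readonly NamedEntry[]): Map<string, string> {
    const cache = new Map<string, string>();
    for (const entry of entries) cache.set(entry.name, entry.id);
    return cache;
}

// Look up known names first; fall back to computing the ID for everything else.
function cachedToID(cache: ReadonlyMap<string, string>, text: string): string {
    const hit = cache.get(text);
    return hit !== undefined ? hit : computeID(text);
}
```

A lookup miss costs one Map access on top of the normal computation, so the cache only helps if known names make up a meaningful share of the inputs.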