add `extend_dictionary` in dictionary builder for improved performance #6875
Hey @tustvold, can you please re-review?
Thank you @rluvaton -- this looks like a nice improvement to me.
I added some small suggestions on how to improve the docstrings, but we could do that as a follow-on PR as well.
Co-authored-by: Andrew Lamb <[email protected]>
applied
Thanks @rluvaton
apache#6875)
* add `extend_dictionary` in dictionary builder for improved performance
* fix extends all nulls
* support null in mapped value
* adding comment
* run `clippy` and `fmt`
* fix ci
* Apply suggestions from code review

Co-authored-by: Andrew Lamb <[email protected]>

---------

Co-authored-by: Andrew Lamb <[email protected]>
Which issue does this PR close?
No issue
Rationale for this change
This improves performance when adding an already-built dictionary to an existing builder, by taking advantage of the fact that we don't need to check the value for each key.
What changes are included in this PR?
Added `extend_dictionary` for `PrimitiveDictionaryBuilder` and for `GenericByteDictionaryBuilder`.
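To illustrate the idea, here is a minimal, std-only sketch of why extending from an existing dictionary can be cheaper than appending value by value. It is not the arrow-rs implementation: `ToyDictBuilder`, `intern`, and `demo` are hypothetical names made up for this example, and the toy ignores null entries in the source values, which the real change handles. The assumed approach is to look up each distinct source value once, build an old-key to new-key table, and then translate the keys with plain indexing.

```rust
use std::collections::HashMap;

/// Toy stand-in for a dictionary builder (hypothetical, NOT the arrow-rs type).
struct ToyDictBuilder {
    values: Vec<String>,            // distinct values, indexed by key
    lookup: HashMap<String, usize>, // value -> key, used for deduplication
    keys: Vec<Option<usize>>,       // one entry per row, None = null
}

impl ToyDictBuilder {
    fn new() -> Self {
        Self { values: Vec::new(), lookup: HashMap::new(), keys: Vec::new() }
    }

    /// Return the key for `v`, inserting it if it is not present yet.
    fn intern(&mut self, v: &str) -> usize {
        if let Some(&k) = self.lookup.get(v) {
            return k;
        }
        let k = self.values.len();
        self.values.push(v.to_string());
        self.lookup.insert(v.to_string(), k);
        k
    }

    /// Per-row append: one hash lookup for every row.
    fn append(&mut self, v: &str) {
        let k = self.intern(v);
        self.keys.push(Some(k));
    }

    /// Append an already-built dictionary (keys + distinct values).
    /// Each distinct source value is interned once; the rows are then
    /// remapped by indexing, with no per-row hashing.
    fn extend_dictionary(&mut self, src_keys: &[Option<usize>], src_values: &[&str]) {
        // Old key (index into src_values) -> key in this builder.
        let mapping: Vec<usize> = src_values.iter().map(|v| self.intern(v)).collect();
        // Nulls stay null; valid keys are translated through the table.
        self.keys.extend(src_keys.iter().map(|k| k.map(|k| mapping[k])));
    }
}

/// Small usage example: returns the builder's values and keys.
fn demo() -> (Vec<String>, Vec<Option<usize>>) {
    let mut builder = ToyDictBuilder::new();
    builder.append("a");
    builder.append("b");
    // Source dictionary: values ["b", "c"], rows "c", null, "b", "c".
    builder.extend_dictionary(&[Some(1), None, Some(0), Some(1)], &["b", "c"]);
    (builder.values, builder.keys)
}
```

In this sketch the hashing cost is proportional to the number of distinct values in the source dictionary rather than the number of rows, which is presumably where the speedup described above comes from.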
Are there any user-facing changes?
Yes, these are public methods.