Enable `get_token_stream` to include `LineEnd` tokens with optional parameter. #15605
Signed-off-by: Suraj Aralihalli <[email protected]>
This adds another step: we would need to remove these `LineEnd` tokens before the tree algorithms run.
Do we need `LineEnd` tokens? If this is for finding the row number of tokens, it's possible to calculate that using `StructBegin` and `StructEnd`.
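To make the two options in this comment concrete, here is a minimal host-side sketch. It is not cudf code: the `Token` enum and both helper names are hypothetical, the real token stream lives on the device, and the row computation assumes every row is exactly one top-level struct (which may not hold for all JSON Lines inputs).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical token kinds, loosely modeled on the names used in this thread.
// These are NOT cudf's actual definitions.
enum class Token : std::uint8_t {
  StructBegin, StructEnd, FieldName, ValueBegin, ValueEnd, LineEnd
};

// Option 1: drop LineEnd tokens so the tree algorithms see the stream they expect.
std::vector<Token> strip_line_end(std::vector<Token> const& tokens)
{
  std::vector<Token> out;
  out.reserve(tokens.size());
  for (Token t : tokens) {
    if (t != Token::LineEnd) { out.push_back(t); }
  }
  return out;
}

// Option 2: derive a row index per token without LineEnd.
// Assumption: each row is one top-level struct, so a row ends whenever the
// nesting depth returns to zero.
std::vector<int> row_indices(std::vector<Token> const& tokens)
{
  std::vector<int> rows;
  rows.reserve(tokens.size());
  int depth = 0;
  int row   = 0;
  for (Token t : tokens) {
    rows.push_back(row);
    if (t == Token::StructBegin) {
      ++depth;
    } else if (t == Token::StructEnd) {
      assert(depth > 0 && "unbalanced StructEnd");
      if (--depth == 0) { ++row; }
    }
  }
  return rows;
}
```

On the device, the same idea would presumably become a stream compaction for option 1 and a scan over depth changes for option 2, but that mapping is an assumption and is not taken from this PR.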
One high-level comment -
@shrshi and I discussed a profile of @revans2's prototype https://github.com/revans2/spark-rapids-jni/pull/new/get_json_obj_experiment.CUDF A few outcomes of our meeting:
@SurajAralihalli @shrshi @karthikeyann is this PR something we still eventually want to get in once it's suitably updated, or is it a prototype that can be closed now?
I think this PR can be dropped.
Description
This PR adds the parameter `LineEndTokenOption` to the `get_token_stream` and `process_token_stream` functions, enabling `LineEnd` tokens in the output. It also retains the original declaration of `get_token_stream` to maintain backward compatibility.
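The backward-compatibility approach described here can be sketched as an overload pair. Everything below is a simplified stand-in: the enumerator names `Discard` and `Keep`, the string-based token type, and the toy tokenizer are assumptions for illustration. The real functions operate on device data and take more arguments.

```cpp
#include <string>
#include <vector>

// Hypothetical enumerators; the PR names the type but the values here are assumed.
enum class LineEndTokenOption { Discard, Keep };

// New overload: the caller chooses whether LineEnd tokens appear in the output.
// Toy tokenizer that only recognizes braces and newlines.
std::vector<std::string> get_token_stream(std::string const& json_lines,
                                          LineEndTokenOption line_end_option)
{
  std::vector<std::string> tokens;
  for (char c : json_lines) {
    switch (c) {
      case '{': tokens.push_back("StructBegin"); break;
      case '}': tokens.push_back("StructEnd"); break;
      case '\n':
        if (line_end_option == LineEndTokenOption::Keep) { tokens.push_back("LineEnd"); }
        break;
      default: break;  // everything else is ignored in this sketch
    }
  }
  return tokens;
}

// Original declaration retained: existing callers compile unchanged and get
// the old behavior (no LineEnd tokens).
std::vector<std::string> get_token_stream(std::string const& json_lines)
{
  return get_token_stream(json_lines, LineEndTokenOption::Discard);
}
```

Keeping a separate overload rather than adding a defaulted parameter leaves the original symbol in place, which matters if other code links against or forward-declares the old signature.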