Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for BYTES / string tensor types #8

Closed
njhill opened this issue Dec 3, 2021 · 10 comments
Closed

Support for BYTES / string tensor types #8

njhill opened this issue Dec 3, 2021 · 10 comments
Assignees
Labels
enhancement New feature or request

Comments

@njhill
Copy link
Member

njhill commented Dec 3, 2021

For BYTES I think the spec is currently ambiguous (to me at least). For example if strings are used then how should the encoding be determined. Also it's not clear how the byte lengths should be encoded when flattening. One option is to say that they must all be the same length and add a dimension to the shape for this. If their lengths can vary however then something extra is needed, like a corresponding integer tensor to hold them.

We can look at MLServer's BYTES string tests and base64 tests for reference.

@adriangonz
Copy link

Hey @njhill ,

This was quite confusing for me as well when doing it for MLServer. That was until I noticed that, in protobuf, the repeated bytes type (used for BYTES contents) actually is just an array where each element can be a variable-length bytes array (so each individual element of the array, is itself a variable-length bytes array). Therefore, it's alright to consider each bytes array as an individual element, so the following input is valid:

input = {
	"datatype": "BYTES",
	"data": [b"foo", b"asdasdasd"],
	"shape": [2]
}

Not sure if that may make it more confusing though - happy to clarify if you've got any questions!

@njhill
Copy link
Member Author

njhill commented Oct 11, 2022

Thanks @adriangonz! yes I'd also noticed that more recently. Along with content_type parameters this will be enough to implement.

I think the problem still applies to the raw_input_contents / raw_output_contents fields though - it doesn't seem that there's a way to use those currently with BYTES tensors. I guess that optimization is more valuable for primitive types anyhow.

@adriangonz
Copy link

Good point @njhill.

We've recently added support for those in MLServer. What we found is that Triton packs BYTES tensors into a single bytes array, where each string is preceded by its length. That way you can accommodate variable-length strings. For example, the BYTES tensor [b"foo", b"asdasd"] would get packed as 3foo6asdasd.

You can see this implemented in MLServer in the mlserver.raw module (which is pretty similar to Triton's own implementation - also linked there in a comment):

https://github.com/SeldonIO/MLServer/blob/f0748c7a14e4476f75b6154099247dc2c9b6d43b/mlserver/raw.py#L39-L55
https://github.com/SeldonIO/MLServer/blob/f0748c7a14e4476f75b6154099247dc2c9b6d43b/mlserver/raw.py#L58-L70

@njhill
Copy link
Member Author

njhill commented Oct 12, 2022

Thanks @adriangonz! I wasn't aware of that.

@njhill njhill self-assigned this Oct 18, 2022
njhill added a commit that referenced this issue Nov 23, 2022
Addresses #8

Current limitations:
- Does not obey request-level content_type request parameters (has to be set at tensor level)
- Always encodes response tensor bytes as base64

Signed-off-by: Nick Hill <[email protected]>
njhill added a commit that referenced this issue Nov 23, 2022
Addresses #8

Current limitations:
- Does not obey request-level content_type request parameters (has to be set at tensor level)
- Always encodes response tensor bytes as base64

Signed-off-by: Nick Hill <[email protected]>
njhill added a commit that referenced this issue Dec 8, 2022
Addresses #8

Current limitations:
- Does not obey request-level content_type request parameters (has to be set at tensor level)
- Always encodes response tensor bytes as base64

Signed-off-by: Nick Hill <[email protected]>
@njhill
Copy link
Member Author

njhill commented Dec 8, 2022

Now addressed via #17 with the following limitations:

  • Does not obey request-level content_type request parameters (has to be set at tensor level)
  • Always encodes response tensor bytes as base64

@lizzzcai
Copy link
Member

lizzzcai commented Jan 10, 2023

Hi @njhill , this is something we are confusing when comparing with the REST API in MLServer and the v2 spec, thanks for the fix.

want to confirm that with the above fix, the input accepts raw string but the output response will still be base64 encoded?

for example:

curl -i -X POST -H "Content-Type: application/json" "http://localhost:8008/v2/models/${MODEL_NAME}/infer" -d '{"inputs": [{ "name": "text", "shape": [2], "datatype": "BYTES", "data": ["I loved this food, it was very good", "I don't loved this food, it was not good"] }] }'

{"model_name":"custom-predictor__ksp-89efd40320","model_version":"v0.1.0","outputs":[{"name":"predictions","datatype":"BYTES","shape":[2],"parameters":{"content_type":{"ParameterChoice":{"StringParam":"str"}}},"data":["Y29tcGxpbWVudA==","Y29tcGxhaW50"]}]}%  # rather than ["compliment","complaint"]

and when will a new version of rest-proxy be released? thanks.

@njhill
Copy link
Member Author

njhill commented Jan 11, 2023

@lizzzcai yes that's exactly right. Just a small thing in your example though - the value of the returned content_type parameter will be "base64" rather than "str" ... reflecting the fact that the returned strings are b64 encoded.

@lizzzcai
Copy link
Member

@lizzzcai yes that's exactly right. Just a small thing in your example though - the value of the returned content_type parameter will be "base64" rather than "str" ... reflecting the fact that the returned strings are b64 encoded.

Thanks @njhill for your reply. We have tested the latest release and it works as expected. However, for the output response, will str be supported in the near future?

@njhill
Copy link
Member Author

njhill commented Jan 12, 2023

@lizzzcai we can open a new issue for it to replace this one :)

A complication is knowing when to encode as utf8 vs base64 because you can't encode arbitrary bytes as utf8. We could detect whether the bytes are valid utf8 but this would incur a performance overhead (espec. for non-text data cases).

and when will a new version of rest-proxy be released? thanks.

We are in the process of doing a 0.10.0 release of all the KServe components right now.

@njhill
Copy link
Member Author

njhill commented Jan 19, 2023

This is complete apart from the issues mentioned above and @lizzzcai has now opened #23 to track those.

@njhill njhill closed this as completed Jan 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants