[shortfin] Add C++ tokenizer wrapper library. #610
Conversation
stellaraccident
commented
Nov 26, 2024
•
edited
- This is gated by SHORTFIN_ENABLE_TOKENIZERS (presently off).
- I'd like to either take over the wrapper or get mlc-ai/tokenizers-cpp#50 ("Abort is not a great error handling strategy") resolved before putting much weight on this.
- There is no great C++ option for this component, so we go to the trouble of integrating a Rust component. We will need to do a bit of prep on our CI systems to enable this by default.
- Python API will be added in a subsequent commit. This should be more efficient than the tokenizers Python API since we will allow direct access to the tokens vs doing a lot of conversions.
- Size analysis: Prior to this patch, libshortfin was 1.8MB, which gave us an entire GPU and CPU runtime stack. After this patch (stripped) it is 8.4MB. Given how important the use case is, I'm willing to tolerate this for the moment. It seems like there is room for something better here, which is why I did not expose the underlying vendor'd API directly (edit: by switching to a nightly rust and activating a bunch of options from https://github.com/johnthagen/min-sized-rust, I was able to produce a binary that is 4.2MB, which is more reasonable).
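For reference, the size reduction mentioned above is in line with the techniques catalogued in the linked min-sized-rust guide. The following profile settings are illustrative of that guide's recommendations, not the exact flags used in this PR (the nightly-only `-Z build-std` / `panic_immediate_abort` options mentioned there go further still):

```toml
# Illustrative Cargo.toml release profile for minimizing binary size,
# per https://github.com/johnthagen/min-sized-rust (not this PR's config).
[profile.release]
opt-level = "z"     # optimize for size rather than speed
lto = true          # whole-program link-time optimization
codegen-units = 1   # trade parallel codegen for better optimization
panic = "abort"     # drop unwinding machinery
strip = true        # strip symbols from the final binary
```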
Generally LGTM. Thanks for filing those upstream issues to improve the binary size and error handling.
I think Scott has already pointed out everything I might have said. No additional comments.
PTAL
Progress on #130.

The manylinux2014 image includes gcc 10.2.1 by default while manylinux_2_28 includes gcc 12.2.1. At one point we had warnings/errors building on the newer gcc version, but that is no longer the case.

With the new Rust dependency coming from #610, we will likely want to revive https://github.com/nod-ai/base-docker-images/blob/main/dockerfiles/manylinux_x86_64.Dockerfile, add more dependencies there, then switch from the upstream `quay.io/...` image to that `ghcr.io/nod-ai/...` image.

Tested locally with `OUTPUT_DIR="/tmp/wheelhouse" sudo -E ./build_tools/build_linux_package.sh`. If the nightly package build fails for some reason, we can easily revert this.
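A custom image along those lines would presumably just layer a Rust toolchain onto the manylinux_2_28 base. The sketch below is hypothetical: the base image tag is the standard upstream one, but the toolchain choice and paths are assumptions, not taken from the actual nod-ai Dockerfile:

```dockerfile
# Hypothetical sketch: extend manylinux_2_28 with a Rust toolchain so the
# tokenizers-cpp Rust component can build in CI. Versions are illustrative.
FROM quay.io/pypa/manylinux_2_28_x86_64

# Install rustup non-interactively; a nightly toolchain would be needed if
# the nightly-only min-sized-rust options are used.
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs \
    | sh -s -- -y --default-toolchain nightly
ENV PATH="/root/.cargo/bin:${PATH}"
```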