
Faster Embeddings #284

Open · srv1n opened this issue May 3, 2024 · 1 comment
Labels: 🪄 enhancement (additions to the software) · examples (touches the examples) · 🚀 performance (there is something that should be faster)

Comments

@srv1n commented May 3, 2024

Great crate!

I was able to speed up embeddings by making the following changes (sketched after the details below):

  1. expose n_ubatch
  2. set n_ubatch and n_batch to 2048
  3. initialize the LlamaBatch with n_tokens = 2048
  4. update line 65 to check against n_batch instead of n_ctx (details below)

Line 65 currently reads:

if (batch.n_tokens() as usize + tokens.len()) > n_ctx {

This should compare against n_batch, not n_ctx; compare the original llama.cpp embedding example, https://github.com/ggerganov/llama.cpp/blob/master/examples/embedding/embedding.cpp (line 164):

if (batch.n_tokens + n_toks > n_batch) {
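For concreteness, here is a minimal sketch of all four changes together. The builder names and module paths are assumptions based on llama-cpp-2's public API, and with_n_ubatch is hypothetical, since exposing n_ubatch is exactly what this issue requests:

```rust
use llama_cpp_2::context::params::LlamaContextParams;
use llama_cpp_2::llama_batch::LlamaBatch;

const N_BATCH: usize = 2048;

// (1) + (2): set both batch sizes to 2048.
// `with_n_ubatch` is the hypothetical setter this issue asks to expose.
let ctx_params = LlamaContextParams::default()
    .with_n_batch(N_BATCH as u32)
    .with_n_ubatch(N_BATCH as u32);

// (3): size the batch for N_BATCH tokens (second argument is n_seq_max).
let mut batch = LlamaBatch::new(N_BATCH, 1);

// (4): flush when the *batch* is full, not when the context is.
if (batch.n_tokens() as usize + tokens.len()) > N_BATCH {
    // decode the pending batch and clear it before adding more tokens
}
```

The decode-and-clear step inside the flush branch is unchanged from the existing example; only the threshold moves from n_ctx to n_batch.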

@MarcusDunn (Contributor) commented:

Thanks for the issue, would love a PR to this effect.

@MarcusDunn added the 🪄 enhancement (additions to the software), 🚀 performance (there is something that should be faster), and examples (touches the examples) labels on May 8, 2024
@MarcusDunn removed the lib (things that effect the library itself) label on May 8, 2024
No branches or pull requests · 2 participants