LLamaSharp v0.15.0 broke cuda backend #909
Comments
Can you try testing with the current master branch? We've just merged in new binaries, which will become the 0.16.0 release soon.
How can I do that?
Just clone this repo and build an application to run in your server environment (e.g. one of the examples).
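As a rough sketch, cloning and running the examples might look like this (the repository URL and the `LLama.Examples` project name are assumptions based on the SciSharp/LLamaSharp repository layout; adjust if yours differs):

```shell
# Clone the LLamaSharp repository and run the bundled examples project.
git clone https://github.com/SciSharp/LLamaSharp.git
cd LLamaSharp

# Run the examples (requires the .NET SDK; project name assumed)
dotnet run --project LLama.Examples -c Release
```

Running this on the server itself exercises the same native binaries your app would load, so it is a quick way to check whether the new binaries work in that environment.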
So just run an example? Is it preconfigured with CUDA?
By default the examples have
OK, now the GPU is working, but only at about 25%. How can I now test the master branch in my app?
@martindevans Otherwise, I'll stay on 0.13.0 and KM 0.62.240605.1 :)
Why are you tagging him here for another issue?
It's an open source project, issues will get fixed when someone who wants them fixed puts in the work!
Easiest way is probably to remove the nuget reference from your main project, and add a reference to your cloned copy of LLamaSharp.
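With the `dotnet` CLI, swapping the package for a project reference might look like this (the project file names and relative paths here are assumptions; `LLamaSharp` and `LLamaSharp.Backend.Cuda12` are the package IDs on nuget.org, but adjust the paths to your own layout):

```shell
# Remove the nuget packages from your app project (MyApp.csproj is a placeholder)
dotnet remove MyApp.csproj package LLamaSharp
dotnet remove MyApp.csproj package LLamaSharp.Backend.Cuda12

# Reference the cloned LLamaSharp source instead
# (path to the main project file assumed; check the repo for the exact name)
dotnet add MyApp.csproj reference ../LLamaSharp/LLama/LLamaSharp.csproj
```

Note that a source reference bypasses the backend packages, so the native libraries from the cloned repo need to end up next to your app's output.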
I'm sorry if I broke the rules. |
When will v0.16.0 be released?
Hopefully this weekend. I'm going to be busy for the rest of September so I want to get it released before then if possible. |
Hmm, it is running now on 0.16.0, but it's not working in Docker. It works fine without Docker, though. Are the libraries maybe not copied correctly? Edit:
I don't personally know much about docker, but I know some people have reported issues before with the binaries not loading in certain docker environments. In those cases I think it was due to missing dependencies. Try cloning llama.cpp inside the container and compiling it, then using those binaries (make sure you use exactly the right version; see the bottom of the readme).
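A sketch of that build step, run inside the container (the commit placeholder is intentional, since the exact version LLamaSharp targets is listed at the bottom of the readme; the CMake flag name has changed across llama.cpp versions, so treat it as an assumption to verify):

```shell
# Build llama.cpp with CUDA support inside the container
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp

# Check out exactly the commit LLamaSharp targets (see bottom of the readme)
git checkout <commit-from-readme>

# Newer versions use -DGGML_CUDA=ON; older tags used -DLLAMA_CUBLAS=ON
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release

# Then replace the native library shipped with the backend package
# with the freshly built one (output path varies by llama.cpp version)
```

Building against the container's own CUDA runtime also flushes out missing dependencies, since the compile will fail loudly instead of the library silently failing to load.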
Description
I have a Linux server with a Quadro RTX 4000 and have installed the drivers. My app runs in a Docker container; as base image I used:
FROM nvidia/cuda:12.5.0-runtime-ubuntu22.04 AS base
In v0.13.0 this worked with GGUF models and GPU support, but now I wanted to run LLama 3.1, so I needed to upgrade to v0.15.0. After upgrading, it can't load the library anymore. If I install only the CPU backend it works, but my server has a GPU for a reason.
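For context, a minimal Dockerfile for this kind of setup might look like the sketch below (the publish path, app name, and native-library path are all assumptions; the key point is that the CUDA backend's native libraries must be copied into the image alongside the app):

```dockerfile
# Sketch of the setup described above; names and paths are placeholders
FROM nvidia/cuda:12.5.0-runtime-ubuntu22.04 AS base
WORKDIR /app

# Publish output must include the native backend libraries, e.g.
# runtimes/linux-x64/native/cuda12/libllama.so
COPY ./publish .

ENTRYPOINT ["dotnet", "MyApp.dll"]
```

The container must also be started with GPU access (e.g. `docker run --gpus all ...`), or the CUDA backend cannot initialize even when the libraries are present.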
Edit: full error