[BUG] BLAS=0 and no GPU usage when built with Llama-cpp with GPU support #2107

Open
8 of 9 tasks
KianRK opened this issue Oct 18, 2024 · 1 comment
Labels
bug Something isn't working

KianRK commented Oct 18, 2024

Pre-check

  • I have searched the existing issues and none cover this bug.

Description

I built llama-cpp as described for GPU support and also ran this (slightly modified) command: "MAKE_ARGS='-DGGML_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python==0.2.89 numpy==1.26.0 MarkupSafe==2.0.1"
The server starts but still shows BLAS=0, and nvidia-smi shows no GPU usage.

I have a freshly installed CUDA 12.6 with the correct driver, and nvcc reports no problems.

I installed all the packages in a dedicated Python 3.11 venv, but it still does not work.

I am starting the server with "PGPT_PROFILES=local make run", but I'm not sure this is correct, since the documentation seems a bit ambiguous.

Steps to Reproduce

  1. Build llama-cpp with CUDA support.
  2. Run MAKE_ARGS='-DGGML_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python==0.2.89 numpy==1.26.0 MarkupSafe==2.0.1 in a virtual environment with Python 3.11.

Expected Behavior

After the server starts, it should print BLAS=1 and the GPU should be used.

Actual Behavior

The server prints BLAS=0 and there is no GPU usage.
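
To make the expected-versus-actual check less error-prone, the BLAS flag can be read from the startup line with a small helper. This is a minimal sketch: it assumes the line is a pipe-separated list of `NAME = VALUE` pairs, and the function names and the sample string are made up for illustration (the sample is not a real log from this setup).

```python
import re


def parse_system_info(line: str) -> dict[str, int]:
    """Turn 'NAME = 0 | OTHER = 1 |' style text into {'NAME': 0, 'OTHER': 1}."""
    flags = {}
    # Accept both 'BLAS = 0' and 'BLAS=0' spellings.
    for name, value in re.findall(r"([A-Za-z0-9_]+)\s*=\s*(\d+)", line):
        flags[name] = int(value)
    return flags


def blas_enabled(line: str) -> bool:
    """Return True only if the line reports BLAS as 1."""
    return parse_system_info(line).get("BLAS", 0) == 1


# Illustrative sample only, assumed format.
sample = "AVX = 1 | AVX2 = 1 | BLAS = 0 |"
print(blas_enabled(sample))
```

A missing BLAS entry is treated the same as BLAS=0, so the helper errs on the side of reporting that the GPU build is not active.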

Environment

Ubuntu 20.04 with a Python 3.11 venv

Additional Information

No response

Version

0.6.2

Setup Checklist

  • Confirm that you have followed the installation instructions in the project’s documentation.
  • Check that you are using the latest version of the project.
  • Verify disk space availability for model storage and data processing.
  • Ensure that you have the necessary permissions to run the project.

NVIDIA GPU Setup Checklist

  • Check that all CUDA dependencies are installed and are compatible with your GPU (refer to CUDA's documentation)
  • Ensure an NVIDIA GPU is installed and recognized by the system (run nvidia-smi to verify).
  • Ensure proper permissions are set for accessing GPU resources.
  • Docker users - Verify that the NVIDIA Container Toolkit is configured correctly (e.g. run sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi)
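
The first two checklist items can be partly scripted. The sketch below only checks whether `nvidia-smi` and `nvcc` are reachable on `PATH` from inside the active venv; it does not verify driver or toolkit compatibility. The function name `find_cuda_tools` is hypothetical.

```python
import shutil


def find_cuda_tools(tools=("nvidia-smi", "nvcc")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


for tool, path in find_cuda_tools().items():
    print(f"{tool}: {path or 'not found on PATH'}")
```

If `nvcc` resolves in a normal shell but not here, the build step inside the venv may not see the CUDA toolkit either.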
@Gavin-oleary

This is what ended up working for me:
"$env:cmake_ARGS = '-DGGML_CUDA=on'
$env:PGPT_PROFILES = 'local'"
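
The two lines above use PowerShell syntax. A rough, platform-neutral equivalent in Python is sketched below. It assumes the uppercase variable name `CMAKE_ARGS`, which is the spelling the llama-cpp-python README documents (environment variable names are case-sensitive on Linux). Note that this differs from the command in the original report in both the variable name (`MAKE_ARGS`) and the flag (`-DGGML_CUBLAS=on`). The names `gpu_build_env` and `REINSTALL_CMD` are hypothetical.

```python
import os


def gpu_build_env(base=None):
    """Copy an environment and add the two variables from the comment above."""
    env = dict(os.environ if base is None else base)
    env["CMAKE_ARGS"] = "-DGGML_CUDA=on"  # assumed uppercase spelling
    env["PGPT_PROFILES"] = "local"
    return env


# Reinstall command from the issue, limited to the llama-cpp-python package.
REINSTALL_CMD = [
    "poetry", "run", "pip", "install",
    "--force-reinstall", "--no-cache-dir",
    "llama-cpp-python==0.2.89",
]
```

To actually rebuild, pass both to `subprocess.run(REINSTALL_CMD, env=gpu_build_env(), check=True)`. The sketch deliberately does not run anything on import.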
