Use multiple GPUs to process queue #1816

theodufort · 2024-11-11T17:32:09Z

I am trying to use both of my GPUs, which are passed through to my Docker container.

services:
  faster-whisper-server-cuda:
    image: fedirz/faster-whisper-server:latest-cuda
    build:
      dockerfile: Dockerfile.cuda
      context: .
      platforms:
        - linux/amd64
        - linux/arm64
    restart: unless-stopped
    ports:
      - 8162:8000
    environment:
      - WHISPER__MODEL=deepdml/faster-whisper-large-v3-turbo-ct2
      - WHISPER__INFERENCE_DEVICE=cuda
      - WHISPER__COMPUTE_TYPE=int8
      - WHISPER__NUM_WORKERS=4
      - WHISPER__CPU_THREADS=4
      - WHISPER_DEVICE=cuda
      - DEFAULT_LANGUAGE=en
      - PRELOAD_MODELS=["deepdml/faster-whisper-large-v3-turbo-ct2"]
    volumes:
      - hugging_face_cache:/root/.cache/huggingface
    privileged: true
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
volumes:
  hugging_face_cache:

I tried everything but it won't use more than 1 GPU even if:

minhthuc2502 · 2024-11-19T09:56:12Z

Consider adding device_index=[0,1] when setting up your Dockerfile.
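
For reference, here is a rough sketch of what that suggestion looks like when calling the faster-whisper Python library directly. `WhisperModel` accepts a list for `device_index`, which loads the model on each listed GPU; transcriptions then run in parallel only when `transcribe()` is called from several threads. The helper names `build_model_kwargs` and `transcribe_files` are made up for illustration, and the GPU IDs 0 and 1 are assumed. Whether the server image exposes this setting through an environment variable is not confirmed in this thread.

```python
from concurrent.futures import ThreadPoolExecutor


def build_model_kwargs(gpu_ids, compute_type="int8"):
    """Keyword arguments for faster_whisper.WhisperModel spanning several GPUs."""
    return {
        "device": "cuda",
        "device_index": list(gpu_ids),  # e.g. [0, 1] loads a replica on each GPU
        "compute_type": compute_type,
    }


def transcribe_files(paths, gpu_ids=(0, 1),
                     model_name="deepdml/faster-whisper-large-v3-turbo-ct2"):
    # Imported lazily so the helper above works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_name, **build_model_kwargs(gpu_ids))

    def run(path):
        segments, _info = model.transcribe(path)
        # segments is a lazy generator; joining consumes it and runs the decode.
        return "".join(segment.text for segment in segments)

    # One thread per GPU so both replicas can be busy at the same time.
    with ThreadPoolExecutor(max_workers=len(gpu_ids)) as pool:
        return list(pool.map(run, paths))
```

With a single caller thread only one GPU will be busy at a time, so the thread pool (or concurrent requests hitting the server) matters as much as the `device_index` list.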
