gunicorn, torch, and cuda #243

Open
keighrim opened this issue Sep 25, 2024 · 0 comments
Labels
🐛B Something isn't working

Comments

keighrim commented Sep 25, 2024

Bug Description

As found in clamsproject/app-whisper-wrapper#24 (comment), when a CLAMS app runs in HTTP + production mode (`app.py --production`) with CUDA device support, it runs over the gunicorn WSGI server with multiple workers. It seems that some torch-based CLAMS apps running under this scenario spawn multiple Python processes and load multiple copies of the torch model into memory, resulting in OOM errors at some point.
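
For context on the memory math: gunicorn runs each worker as a separate OS process, so any model held by the app is duplicated once per worker, and loading inside the request handler multiplies that further. A minimal sketch of the problematic pattern (the handler, route, and checkpoint path are placeholders, not the actual whisper-wrapper code):

```python
# Hypothetical sketch of the failure mode, not the actual whisper-wrapper code.
# gunicorn forks one OS process per worker, so with e.g. `--workers 4` this
# module is imported four times and each worker holds its own model in VRAM.
import torch
from flask import Flask

app = Flask(__name__)

@app.route("/", methods=["POST"])
def annotate():
    # Loading inside the handler adds yet another copy per in-flight request
    # on top of the per-worker duplication, and nothing is explicitly freed.
    model = torch.load("checkpoint.pt", map_location="cuda")  # placeholder path
    # ... run inference with `model` on the request body ...
    return "done"
```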

Reproduction steps

  1. Pick a computer with a CUDA device (NVIDIA GPU).
  2. Run whisper-wrapper v10 (https://apps.clams.ai/whisper-wrapper/v10/) in production mode.
  3. Send multiple POST (annotate) requests simultaneously or with short gaps between them (see the sketch after this list for one way to do this).
  4. Watch VRAM saturation via e.g. nvidia-smi or a similar monitoring utility.
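
For step 3, a small client script can produce the simultaneous requests while nvidia-smi runs in another terminal; a rough sketch, assuming the app listens on port 5000 and takes a MMIF file as the POST body (both are assumptions, not taken from the issue):

```python
# Fire several annotate requests concurrently. The URL, payload file, and
# content type are assumptions, not taken from the whisper-wrapper docs.
import concurrent.futures
import requests

URL = "http://localhost:5000/"             # assumed app address
PAYLOAD = open("input.mmif", "rb").read()  # assumed MMIF input

def post_once(i):
    resp = requests.post(URL, data=PAYLOAD, headers={"Content-Type": "application/json"})
    return i, resp.status_code

with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    for i, status in pool.map(post_once, range(8)):
        print(f"request {i}: HTTP {status}")
```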

Expected behavior

The app should reuse the already-loaded checkpoint/model in memory. Instead, the app loads the model for each request and does not release it after the request completes.
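
One common way to get this behavior is to cache the model at module/process level, so a worker loads it at most once and reuses it across requests; a minimal sketch, with hypothetical names and a placeholder checkpoint path:

```python
# Minimal per-process caching sketch (names are hypothetical, not the app's code).
# The checkpoint is loaded lazily on a worker's first request and reused for
# later requests; with N gunicorn workers there are still N copies in VRAM.
import threading
import torch

_model = None
_model_lock = threading.Lock()

def get_model():
    global _model
    with _model_lock:  # guard the first load when a worker uses threads
        if _model is None:
            _model = torch.load("checkpoint.pt", map_location="cuda")  # placeholder
            _model.eval()
    return _model
```

Note that gunicorn's `--preload` option loads the application once in the master before forking, but a CUDA context initialized before the fork generally cannot be reused in the child processes (torch typically errors out about re-initializing CUDA in a forked subprocess), so per-worker loading with a capped worker count is usually the more practical route for GPU-backed apps.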

Log output

No response

Screenshots

No response

Additional context

Also, it is very likely that this issue shares the same root cause as clamsproject/app-doctr-wrapper#6.

@keighrim keighrim added the 🐛B Something isn't working label Sep 25, 2024