
GPU scaling (GPU扩容) #2226

Closed · 1 of 3 tasks
mandone opened this issue Sep 4, 2024 · 6 comments
mandone commented Sep 4, 2024

System Info

V100 * 8

Running Xinference with Docker?

  • docker
  • pip install
  • installation from source

Version info

13.3

The command used to start Xinference

xinference-local --host 0.0.0.0 --port 8080

Reproduction

We currently have GLM4 deployed on this machine using two V100 cards, and the GPU memory is no longer enough. How can we seamlessly add two more GPUs?

Expected behavior

Same as the reproduction section: add two more GPUs to the running two-V100 GLM4 deployment without interrupting service.

@XprobeBot XprobeBot added the gpu label Sep 4, 2024
@XprobeBot XprobeBot added this to the v0.15 milestone Sep 4, 2024
qinxuye commented Sep 4, 2024

Use `n-gpu` to specify the number of GPUs.
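For reference, a minimal sketch of launching a model with the `--n-gpu` option that qinxuye mentions. The model name `glm4-chat` and the endpoint are assumptions based on the start command given above, not confirmed in this thread; depending on the Xinference version, additional options (model engine, size, format) may also be required:

```shell
# Assumes a running xinference-local instance on port 8080 (started as above).
# Relaunch GLM4 across 4 GPUs instead of 2; model name is an assumption.
xinference launch \
  --endpoint http://0.0.0.0:8080 \
  --model-name glm4-chat \
  --n-gpu 4
```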

@qinxuye qinxuye closed this as completed Sep 4, 2024
mandone commented Sep 4, 2024

Does that mean the backend service has to be restarted? @qinxuye

qinxuye commented Sep 4, 2024

No. Just stop the model and launch it again.
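The stop-then-relaunch flow can also be scripted against the Xinference Python client, keeping the unavailable window close to the model's load time. A sketch assuming the `xinference` client package; the model UID and name here are placeholders, and newer versions may require extra `launch_model` arguments such as the model engine:

```python
from xinference.client import Client

# Connect to the running Xinference instance (endpoint from the start command above).
client = Client("http://0.0.0.0:8080")

# Terminate the current 2-GPU deployment.
# Replace "glm4-chat" with the actual model UID shown by `xinference list`.
client.terminate_model(model_uid="glm4-chat")

# Relaunch the same model across 4 GPUs; the model is unavailable in between.
client.launch_model(model_name="glm4-chat", n_gpu=4)
```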

mandone commented Sep 4, 2024

So the model is briefly unavailable during that window, right?

xiaoyesoso commented
Is there a way to scale up directly, without stopping the model service?


qinxuye commented Nov 25, 2024

The open-source version does not support this feature.

4 participants