Error when calling /v1/chat/completions on a glm-4v-9b service started with vLLM #630
Comments
Please share the script you use to call the API.
After startup I call it from Postman; it can also be reproduced with this curl command:

curl --location --request POST 'http://136.1.5.93:10085/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data-raw '{
    "model": "glm-4v-9b",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "How do I read a file in Python?"
        }
    ]
}'
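For readers who prefer Python, the same request body can be built and inspected before sending; this is a minimal sketch assuming the host, port, and model name from the curl command above:

```python
import json

# Endpoint taken from the curl command above; adjust host/port for your setup.
url = "http://136.1.5.93:10085/v1/chat/completions"

# The same payload curl sends via --data-raw.
payload = {
    "model": "glm-4v-9b",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "How do I read a file in Python?"},
    ],
}

# Serialize to the JSON string that goes on the wire.
body = json.dumps(payload, ensure_ascii=False)
print(body)
```

Posting `body` to `url` with any HTTP client (requests, httpx, Postman) reproduces the reported error when the model lacks a chat template.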
@sixsixcoder It seems a chat template is required. Does the GLM-4v-9b model ship with one?
@sixsixcoder I submitted a PR on ModelScope that adds a chat_template and fixes this issue.
Thank you for your contribution.
Right, vLLM always applies a chat template, so this is a good suggestion, thank you very much. We will verify that the template also works with transformers and merge everything together.
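To illustrate what a chat template produces, here is a rough sketch of rendering a message list into GLM-4-style prompt text. The special-token names (`[gMASK]<sop>`, `<|system|>`, `<|user|>`, `<|assistant|>`) follow GLM-4's commonly published conversation format but are assumptions here; verify them against the `chat_template` actually added in the PR:

```python
def render_glm4_chat(messages):
    """Render a message list into GLM-4 style prompt text.

    NOTE: the special tokens below are assumed from GLM-4's published
    conversation format; check the model's tokenizer_config.json.
    """
    out = "[gMASK]<sop>"
    for m in messages:
        out += f"<|{m['role']}|>\n{m['content']}"
    # Trailing assistant tag signals the model to start generating.
    out += "<|assistant|>"
    return out

prompt = render_glm4_chat([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How do I read a file in Python?"},
])
print(prompt)
```

The real chat_template expresses the same transformation as a Jinja template so that both transformers' `apply_chat_template` and vLLM can use it.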
System Info
vllm 0.6.3.post1
transformers 4.46.1
glm-4v-9b
Who can help?
No response
Information
Reproduction
1. Launch script: CUDA_VISIBLE_DEVICES=5,7 python -m vllm.entrypoints.openai.api_server --model=/beeb/ap/iaf/models/modelscope/hub/glm-4v-9b --served-model-name=glm-4v-9b --device=cuda --port=10085 --host=0.0.0.0 --tensor-parallel-size=2 --dtype=auto --trust-remote-code
2. Call the API: /v1/chat/completions
3. The following error appears:
As of transformers v4.44, default chat template is no longer allowed, so you must provide a chat template if the tokenizer does not define one.
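Until a chat_template ships with the model files, vLLM's OpenAI-compatible server can be pointed at a template file explicitly via its `--chat-template` flag. This is a sketch of the original launch command with that flag added; the template path is hypothetical and must point to a real Jinja template for GLM-4v:

```shell
CUDA_VISIBLE_DEVICES=5,7 python -m vllm.entrypoints.openai.api_server \
  --model=/beeb/ap/iaf/models/modelscope/hub/glm-4v-9b \
  --served-model-name=glm-4v-9b \
  --device=cuda --port=10085 --host=0.0.0.0 \
  --tensor-parallel-size=2 --dtype=auto --trust-remote-code \
  --chat-template=/path/to/glm4v_chat_template.jinja  # hypothetical path
```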
Expected behavior
The endpoint should return a proper answer to the request.