Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

框架vllm输出截断,但是官方vllm启动和transformers运行模型都不 #314

Open
2 tasks done
TLL1213 opened this issue Sep 28, 2024 · 2 comments
Open
2 tasks done

Comments

@TLL1213
Copy link

TLL1213 commented Sep 28, 2024

提交前必须检查以下项目 | The following items must be checked before submission

  • 请确保使用的是仓库最新代码(git pull),一些问题已被解决和修复。 | Make sure you are using the latest code from the repository (git pull), some issues have already been addressed and fixed.
  • 我已阅读项目文档FAQ章节并且已在Issue中对问题进行了搜索,没有找到相似问题和解决方案 | I have searched the existing issues / discussions

问题类型 | Type of problem

模型推理和部署 | Model inference and deployment

操作系统 | Operating system

Linux

详细描述问题 | Detailed description of the problem

PORT=6006

model related

MODEL_NAME=qwen2
MODEL_PATH=qwen2-7B-Instruct
PROMPT_NAME=qwen2

own

MAX_NUM_SEQS=4096
CONTEXT_LEN = 4096

rag related

EMBEDDING_NAME=
RERANK_NAME=

api related

API_PREFIX=/v1

vllm related

ENGINE=vllm
TRUST_REMOTE_CODE=true
TOKENIZE_MODE=auto
TENSOR_PARALLEL_SIZE=1
DTYPE=auto

TASKS=llm

TASKS=llm,rag

上面是运行的配置文件,我尝试过使用transformers运行,也尝试过命令python -m vllm.entrypoints.openai.api_server --model qwen2-7B-Instruct --port 8080 --served-model-name qwen2运行,后面二者都不会产生截断问题,当使用该项目启动时,便会存在截断问题,大概生成六百字左右就开始截断,模型是我微调过的模型,主要任务是生成长文本。

Dependencies

vllm 0.4.3

运行日志或截图 | Runtime logs or screenshots

甲乙双方各持一份,具有

我截取了最后截断的不烦,“具有”两个字之后输出突然戛然而止

@xusenlinzy
Copy link
Owner

xusenlinzy commented Sep 29, 2024

CONTEXT_LEN=4096的意思是输入+输出一共4096个token,如果任务是生成长文本可以调大一点,比如CONTEXT_LEN=8192或者更大,另外调用的时候也需要指定max_tokens为你希望生成的最大token数

@TLL1213
Copy link
Author

TLL1213 commented Sep 29, 2024

CONTEXT_LEN=4096的意思是输入+输出一共4096个token,如果任务是生成长文本可以调大一点,比如CONTEXT_LEN=8192或者更大,另外调用的时候也需要指定max_tokens为你希望生成的最大token数

感谢您的回答

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants