Update on the development branch #1955
kaiyux announced in Announcements
Hi,
The TensorRT-LLM team is pleased to announce that we pushed an update to the development branch (and the Triton backend) on July 16, 2024.
This update includes:
- `docs/source/speculative_decoding.md`.
- `max_output_len` is removed from the `trtllm-build` command. If you want to limit the sequence length at the engine build stage, specify `max_seq_len` instead.
- The `cpp/include/tensorrt_llm/executor/version.h` file is going to be generated.

Thanks,
The TensorRT-LLM Engineering Team
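
For build scripts that still pass `--max_output_len`, the flag change above could be handled with a small argument rewrite. The following is only a sketch, not an official migration tool: the helper name `migrate_build_args` is hypothetical, and it assumes that `max_seq_len` should cover prompt plus generated tokens, so it is derived as `max_input_len + max_output_len`. Please verify that assumption against your own workload before relying on it.

```python
from typing import List, Optional


def migrate_build_args(args: List[str]) -> List[str]:
    """Hypothetical helper: drop --max_output_len from a trtllm-build
    argument list and add --max_seq_len in its place.

    Assumption (not stated in the announcement): max_seq_len is the total
    of input and output tokens, i.e. max_input_len + max_output_len.
    """
    out: List[str] = []
    max_input_len: Optional[int] = None
    max_output_len: Optional[int] = None
    has_max_seq_len = False

    i = 0
    while i < len(args):
        arg = args[i]
        # Accept both "--flag value" and "--flag=value" spellings.
        name, sep, value = arg.partition("=")

        if name == "--max_output_len":
            if not sep:
                i += 1
                value = args[i]
            max_output_len = int(value)
            i += 1
            continue  # the removed flag is not copied to the output

        if name == "--max_input_len":
            max_input_len = int(value) if sep else int(args[i + 1])
        elif name == "--max_seq_len":
            has_max_seq_len = True

        out.append(arg)
        i += 1

    # Only add --max_seq_len when the old flag was present and the caller
    # has not already chosen a value explicitly.
    if max_output_len is not None and not has_max_seq_len:
        if max_input_len is None:
            raise ValueError(
                "cannot derive --max_seq_len without --max_input_len"
            )
        out += ["--max_seq_len", str(max_input_len + max_output_len)]

    return out


if __name__ == "__main__":
    old = ["--checkpoint_dir", "ckpt",
           "--max_input_len", "1024",
           "--max_output_len", "512"]
    print(" ".join(["trtllm-build"] + migrate_build_args(old)))
```

An explicit `--max_seq_len` already present in the arguments is left untouched, so the rewrite never overrides a value you chose yourself.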