Using GEMM files in fastertransformer_backend. #48
SnoozingSimian started this conversation in General

Replies: 1 comment · 1 reply
-

SnoozingSimian:

While loading both GPT-J and GPT-NeoX models, I get the message:

[WARNING] gemm_config.in is not found; using default GEMM algo

This suggests to me that there is a way to supply GEMM algorithms when loading these models. I have generated the gemm_config.in for GPT-NeoX using the FasterTransformer binaries, but I don't know where to place this file so that the backend can find it. Is there currently any way to use it?
-

Reply:

You can refer to this document to generate the gemm_config.in file: https://github.com/NVIDIA/FasterTransformer/blob/main/docs/gptneox_guide.md#run-gpt-neox
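
A minimal sketch of the full flow, assuming the gpt_gemm binary name and argument order described in the FasterTransformer GPT-NeoX guide; the paths and all numeric values below are illustrative placeholders, not tuned settings. The relevant behavior is that FasterTransformer tries to open gemm_config.in relative to the server process's current working directory, which is why the backend reports the warning when tritonserver is started elsewhere:

```sh
# Generate the GEMM profile. gpt_gemm writes gemm_config.in into the
# current working directory. Argument order per the guide:
#   batch_size beam_width max_input_len head_num size_per_head
#   inter_size vocab_size data_type tensor_para_size
# (the model dimensions here are placeholders; use your model's values)
cd /workspace
/workspace/FasterTransformer/build/bin/gpt_gemm 8 1 32 64 96 24576 50432 1 1

ls gemm_config.in   # the generated profile should now be in this directory

# Launch Triton from the same directory, so the backend can find
# gemm_config.in in its working directory instead of falling back to
# the default GEMM algo.
tritonserver --model-repository=/workspace/model_repo
```

In other words, the fix for the warning is simply to start tritonserver from the directory where gemm_config.in was generated (or copy the file into whatever directory you launch the server from), since the lookup is by relative path rather than by a configurable location.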