LLaMA 13B model trained with trlx PPOTrainer and saved in Hugging Face format has strange keys at inference, and the result is missing with no error #584

ldh127 · 2023-12-03T12:27:26Z

trainer = trlx.train(
    reward_fn=reward_fn,
    prompts=prompts,
    eval_prompts=["习近平女儿"] * 4,
    config=config,
)

trainer.save_pretrained('./rl_saved_finished_hf_1202', safe_serialization=False, heads_only=True)

The saved model does not run inference correctly: no error is raised, but no result is shown either, and the script exits with code 0.

promiseve · 2024-02-24T08:23:23Z

Hey @ldh127, did you manage to get around this? I am having a similar issue at the moment.

Regards,
Promise.

ldh127 added the "bug" label on Dec 3, 2023
