
Regarding the issue of training templates in Qwen2VLDataCollator #57

Closed
Asunatan opened this issue Nov 28, 2024 · 6 comments

Comments

@Asunatan

Hello, I am a beginner in the field of VLMs and have a question about the training template. In the `Qwen2VLDataCollator` you provide, I noticed that some extra fields end up in the serialized text.

[screenshot: collator output]

This differs from directly calling

`apply_chat_template_text = self.processor.apply_chat_template(cur_text, tokenize=False, add_generation_prompt=True)`

which produces a somewhat different result. Could this lead to discrepancies at prediction time? Below is the result obtained directly from `apply_chat_template`:

[screenshot: apply_chat_template output]

The gpt_response from `apply_chat_template` does not contain the `[{"type": "text", "text":` wrapper. The source of the difference seems to be here:

[screenshot: the relevant collator code]

I am curious whether the difference between the two could introduce a training bias.
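To make the difference concrete, here is a minimal sketch (plain Python, no transformers dependency; `render_assistant_turn` is a simplified stand-in for the real chat template, and the sample text is invented) of how the two `content` shapes render differently when a template inserts `content` verbatim:

```python
def render_assistant_turn(content):
    """Simplified stand-in for a chat template that inserts `content` verbatim."""
    return f"<|im_start|>assistant\n{content}<|im_end|>"

text = "The image shows a cat."

# Plain-string content renders cleanly:
clean = render_assistant_turn(text)

# List-of-dict content leaks its Python repr into the rendered string,
# so the [{"type": "text", ...}] wrapper becomes part of the training target:
leaky = render_assistant_turn([{"type": "text", "text": text}])

print(clean)
print(leaky)
```

This is exactly the kind of train/inference mismatch the question raises: if training targets contain the leaked wrapper but inference prompts do not, the model is being fit to a format it will never see at prediction time.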

@Elenore1997

I ran into the same problem you mention, so I edited the code directly:

```python
cur_text.append({
    "role": "assistant",
    # "content": [
    #     {"type": "text", "text": text},
    # ]
    "content": text
})
```
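If you would rather keep the collator tolerant of both shapes, a small normalization helper is another option (a sketch only; `flatten_content` is a hypothetical name, not part of this repository):

```python
def flatten_content(content):
    """Collapse list-of-parts content into a plain string, keeping text parts only."""
    if isinstance(content, str):
        return content
    return "".join(part["text"] for part in content if part.get("type") == "text")

text = "The image shows a cat."
assert flatten_content(text) == text
assert flatten_content([{"type": "text", "text": text}]) == text
```

Either way, the assistant turn should reach `apply_chat_template` as a plain string so that train-time and inference-time prompts match.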

@Asunatan
Author

Asunatan commented Dec 3, 2024

> I found the same problem as you mention. So I directly edit the code: `cur_text.append({"role": "assistant", "content": text})`

Yes, I have adopted the same strategy as you, but is it correct to do so?

@Elenore1997

I think it is OK to edit it like this. I trained a model with this repository (after editing those few lines) and ran inference with the official Qwen2-VL code (using `apply_chat_template`), and got correct results.

@Asunatan
Author

Asunatan commented Dec 3, 2024

> I think it is ok to edit like this, I have trained model using this repository (edit this few lines of code) and inference with qwen2vl official code (using apply_chat_template) and get the correct result.

Thank you, this is very helpful to me.

@zjysteven
Owner

@Asunatan @Elenore1997 Yes, I think the proposed fix looks good. Sorry for not responding earlier; I was moving over the past few days. Tagging @linyueqian to be aware of this.

@linyueqian
Collaborator

Yes, as mentioned in #56, we should use the text directly as the `content` value. Just updated in the latest commit.
