
[Training detail] About training order #3

Open
Inch-Z opened this issue Nov 8, 2023 · 5 comments

Comments


Inch-Z commented Nov 8, 2023

For the legal move task, I can't reproduce the experimental results. I suspect the issue might be related to the training order, so I would like to ask about the training details of the ChessGPT model, specifically the order of the tasks.

ziyan-wang98 (Collaborator) commented

Hi! I'm reaching out to better understand your concerns. First, when you mention the "legal move" task, are you referring to the results listed under BIG-bench State Tracking in Chess in Table 1 of our paper? Could you specify which exact task results do not align with your reproduction, and by what margin they differ? Second, it would be very helpful to know more about how you performed the reproduction: did you evaluate the ChessGPT-v1 model provided on HuggingFace, or did you train a new model on the dataset following the procedures described in our paper? This information will be vital in assisting you further.

Inch-Z (Author) commented Nov 8, 2023

I am training on Hugging Face's data, specifically with OpenLLaMA-3B as the base model. I'm curious about the sequence in which you use the data, that is, the order of the training tasks. Thank you very much for your response.

waterhorse1 (Owner) commented Nov 8, 2023

Hi @Inch-Z,

Our training order is:
(1) We first continue pretraining on https://huggingface.co/datasets/Waterhorse/chess_data/tree/main/chessgpt_data to get chessgpt-base. Note that it is normal if you cannot completely reproduce our results, because the data we share is not exactly the same as our pretraining data: due to legal issues, we cannot share some of it, including blog, book, and some of our annotated book data.
(2) We then conduct SFT tuning on https://huggingface.co/datasets/Waterhorse/chess_data/tree/main/chessgpt_sft_data; we release all of our SFT data.
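The two-stage order above can be sketched as a minimal pipeline. This is an illustration of the ordering only, assuming the two HuggingFace dataset repos linked in this thread; `train_stage` and `build_chessgpt` are hypothetical stand-ins, not the authors' actual training code.

```python
# Hypothetical sketch of the two-stage ChessGPT training order.
# Dataset identifiers come from the links in this thread.
PRETRAIN_DATA = "Waterhorse/chess_data/chessgpt_data"      # stage 1: base corpus
SFT_DATA = "Waterhorse/chess_data/chessgpt_sft_data"       # stage 2: SFT data

def train_stage(model_name: str, data: str, stage: str) -> str:
    """Stand-in for a training run: record which stage was applied to which model."""
    print(f"{stage}: {model_name} + {data}")
    return f"{model_name}->{stage}"

def build_chessgpt(base: str = "openllama-3b") -> str:
    # Stage 1: continued pretraining over the base corpus -> chessgpt-base
    model = train_stage(base, PRETRAIN_DATA, "continued_pretraining")
    # Stage 2: SFT over the instruction data -> final chat model
    model = train_stage(model, SFT_DATA, "sft")
    return model
```

The key point is simply that SFT is applied after, and on top of, the continued-pretraining checkpoint, never the other way around.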

If you are focused on the legal move prediction task, I would especially recommend looking at https://huggingface.co/datasets/Waterhorse/chess_data/tree/main/chessgpt_data/chess_modeling, where we create several chess modeling tasks (described in detail in our paper's appendix), including the legal move prediction task.
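For intuition, a legal-move-prediction training example pairs a game prefix with the set of legal moves. The template below is purely hypothetical; the actual format is defined by the chess_modeling data itself, so inspect the dataset for the real prompt/target shape.

```python
# Hypothetical illustration of a legal-move-prediction (prompt, target) pair.
# The real template lives in the chess_modeling dataset; this is not it.
def format_legal_move_example(moves_so_far, legal_moves):
    """Build a (prompt, target) pair for legal move prediction."""
    prompt = (
        "Given the chess game so far: "
        + " ".join(moves_so_far)
        + "\nList the legal moves for the side to move:"
    )
    target = ", ".join(sorted(legal_moves))
    return prompt, target

prompt, target = format_legal_move_example(
    ["1.e4", "e5", "2.Nf3"], ["Nc6", "Nf6", "d6"]
)
```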

Best,
Xidong

Inch-Z (Author) commented Jan 8, 2024

Hello, we have run into a new question: how is the Elo score in the paper calculated? We are also curious whether the model can play against humans and output the next move. If so, we would also like to know the prompt format.

Thanks

waterhorse1 (Owner) commented

Hi, in fact we do not directly calculate the Elo rating, because we found there might be some issues with the policy training (see the description in Section 5.3 of our paper). Currently, we are refining the game data and doing a round of retraining on the new dataset with a bigger, more advanced model. We will also include an Elo rating calculation this time. Please stay tuned; we should have some results by the end of this month.
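For reference, an Elo rating calculation over model-vs-engine games would presumably use the standard Elo update rule; the sketch below shows that rule in isolation, not the authors' actual evaluation code.

```python
# Standard Elo rating formulas (a generic sketch, not ChessGPT's evaluation code).
def elo_expected(r_a: float, r_b: float) -> float:
    """Expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0) -> float:
    """New rating for A after one game; score_a is 1 (win), 0.5 (draw), or 0 (loss)."""
    return r_a + k * (score_a - elo_expected(r_a, r_b))
```

For example, two equally rated 1500 players have an expected score of 0.5 each, so with K = 32 a win moves the winner to 1516.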
