
[Training detail] About training order #3

Open
Inch-Z opened this issue Nov 8, 2023 · 5 comments

Comments


Inch-Z commented Nov 8, 2023

For the legal move task, I can't reproduce the experimental results. I suspect the issue might be related to the training order, so I would like to ask about the training details of the ChessGPT model, specifically the order of the tasks.

ziyan-wang98 (Collaborator) commented

Hi! I'm reaching out to better understand your concerns. First, when you mention the "legal move" task, are you referring to the results listed under BIG-bench State Tracking in Chess in Table 1 of our paper? Could you specify which exact task results do not align with your reproduction, and by what margin they differ? Second, it would be very helpful to know more about how you performed the reproduction: did you evaluate the ChessGPT-v1 model provided on HuggingFace, or did you train a new model on the dataset following the procedures described in our paper? This information will be vital in assisting you further.

Inch-Z (Author) commented Nov 8, 2023

I am training on Hugging Face's data, specifically with OpenLLaMA-3B as the base model. I'm curious about the sequence in which you use the data, that is, the order of the training tasks. Thank you very much for your response.

waterhorse1 (Owner) commented Nov 8, 2023

Hi @Inch-Z,

Our training order is:
(1) We first continue pretraining on https://huggingface.co/datasets/Waterhorse/chess_data/tree/main/chessgpt_data to get chessgpt-base. Note that it is normal if you cannot completely reproduce our results, because the data we share is not exactly the same as our pretraining data: due to legal issues, we cannot share some of it, including blog, book, and some of our annotated book data.
(2) We then conduct SFT tuning on https://huggingface.co/datasets/Waterhorse/chess_data/tree/main/chessgpt_sft_data; we release all of our SFT data.
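The two-stage order above can be sketched as a minimal pipeline. This is an illustration of the ordering only, assuming the two HuggingFace dataset repos linked in this thread; `train_stage` and `build_chessgpt` are hypothetical stand-ins, not the authors' actual training code.

```python
# Hypothetical sketch of the two-stage ChessGPT training order.
# Dataset identifiers come from the links in this thread.
PRETRAIN_DATA = "Waterhorse/chess_data/chessgpt_data"      # stage 1: base corpus
SFT_DATA = "Waterhorse/chess_data/chessgpt_sft_data"       # stage 2: SFT data

def train_stage(model_name: str, data: str, stage: str) -> str:
    """Stand-in for a training run: record which stage was applied to which model."""
    print(f"{stage}: {model_name} + {data}")
    return f"{model_name}->{stage}"

def build_chessgpt(base: str = "openllama-3b") -> str:
    # Stage 1: continued pretraining over the base corpus -> chessgpt-base
    model = train_stage(base, PRETRAIN_DATA, "continued_pretraining")
    # Stage 2: SFT over the instruction data -> final chat model
    model = train_stage(model, SFT_DATA, "sft")
    return model
```

The key point is simply that SFT is applied after, and on top of, the continued-pretraining checkpoint, never the other way around.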

If you are focused on the legal move prediction task, I would especially recommend looking at https://huggingface.co/datasets/Waterhorse/chess_data/tree/main/chessgpt_data/chess_modeling, where we create several chess modeling tasks (described in detail in our paper's appendix), including the legal move prediction task.
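For intuition, a legal-move-prediction training example pairs a game prefix with the set of legal moves. The template below is purely hypothetical; the actual format is defined by the chess_modeling data itself, so inspect the dataset for the real prompt/target shape.

```python
# Hypothetical illustration of a legal-move-prediction (prompt, target) pair.
# The real template lives in the chess_modeling dataset; this is not it.
def format_legal_move_example(moves_so_far, legal_moves):
    """Build a (prompt, target) pair for legal move prediction."""
    prompt = (
        "Given the chess game so far: "
        + " ".join(moves_so_far)
        + "\nList the legal moves for the side to move:"
    )
    target = ", ".join(sorted(legal_moves))
    return prompt, target

prompt, target = format_legal_move_example(
    ["1.e4", "e5", "2.Nf3"], ["Nc6", "Nf6", "d6"]
)
```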

Best,
Xidong

Inch-Z (Author) commented Jan 8, 2024

Hello, we have run into a new question: how is the Elo score in the paper calculated? We are also curious whether the model can play against humans and output the next move. If so, we would also like to know the prompt format.

Thanks

waterhorse1 (Owner) commented

Hi, in fact we do not directly calculate the Elo rating, because we found there might be some issues with the policy training (see the description in Section 5.3 of our paper). Currently, we are refining the game data and doing a round of retraining on the new dataset with a bigger, more advanced model. We will also include an Elo rating calculation this time. Please stay tuned; we should have some results by the end of this month.
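For reference, an Elo rating calculation over model-vs-engine games would presumably use the standard Elo update rule; the sketch below shows that rule in isolation, not the authors' actual evaluation code.

```python
# Standard Elo rating formulas (a generic sketch, not ChessGPT's evaluation code).
def elo_expected(r_a: float, r_b: float) -> float:
    """Expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0) -> float:
    """New rating for A after one game; score_a is 1 (win), 0.5 (draw), or 0 (loss)."""
    return r_a + k * (score_a - elo_expected(r_a, r_b))
```

For example, two equally rated 1500 players have an expected score of 0.5 each, so with K = 32 a win moves the winner to 1516.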
