[Training detail] About training order #3
Comments
Hi! I'd like to better understand the issue you are seeing. First, when you mention the “legal move” task, are you referring to the results listed under BIG-bench State Tracking in Chess in Table 1 of our paper? Could you specify exactly which task results do not match your reproduction, and by what margin they differ? Second, it would help to know how you ran the reproduction. Did you evaluate the ChessGPT-v1 model provided on HuggingFace, or did you train a new model on the dataset following the procedure described in our paper? This information will help us assist you further.
I am training on the data from Hugging Face, specifically with OpenLLaMA 3B. I would like to know the sequence in which you use the data, that is, the order of the training tasks. Thank you very much for your response.
Hi @Inch-Z,

Our training order is

If you are mainly interested in the legal move prediction task, I recommend looking closely at https://huggingface.co/datasets/Waterhorse/chess_data/tree/main/chessgpt_data/chess_modeling, where we created several chess modeling tasks (described in detail in the appendix of our paper), including the legal move prediction task.

Best,
Hello, we have a new question: how is the Elo score in the paper calculated? We would also like to know whether the model can play against humans and output the next move. If it can, we are especially interested in the prompt format. Thanks
Hi, in fact we do not calculate the Elo rating directly, because we found that there might be some issues with the policy training (see the description in Section 5.3 of our paper). We are currently refining the game data and retraining on the new dataset with a bigger and more advanced model. This time we will also include an Elo rating calculation. Please stay tuned; we expect to have some results by the end of this month.
For the Legal Move task, I can't reproduce the experimental results. I believe the issue might be related to the training order, so I would like to ask about the training details of the ChessGPT model, specifically the order of the tasks.