Added mlflow #74
nit: can we keep this file as it is and move all MLflow-related examples to the mlflow folder?
It seems that the new notebook has better training results, so it is worth keeping the new one :)
example/rlhf/mlflow/demo_rl.py
Outdated
nit: is this file supposed to be here? I do not see any MLflow-related logic in it.
nit: same for this file. Are you planning to add MLflow to it in the future, and made a copy of the current training script first?
from pykoi.chat import QuestionAnswerDatabase
from pykoi.rlhf import RLHFConfig
from pykoi.rlhf import SupervisedFinetuning
import mlflow
All corrected. But please hold on and don't approve and merge until we meet. There are some issues. We can discuss them in person.
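For reference while reviewing, here is a minimal sketch of what "MLflow-related logic" in a training script could look like. It is not the code in this PR: `log_training_run` and its arguments are hypothetical names chosen for illustration, and the only MLflow calls it relies on are `mlflow.start_run`, `mlflow.log_params`, and `mlflow.log_metric`. The tracker is passed in as an argument so the function can be exercised without MLflow installed.

```python
from typing import Any, Dict, Iterable, Optional


def log_training_run(
    params: Dict[str, Any],
    losses: Iterable[float],
    tracker: Optional[Any] = None,
) -> int:
    """Log hyperparameters and a per-step loss to an MLflow-style tracker.

    `tracker` defaults to the real `mlflow` module. Any object that offers
    `start_run`, `log_params`, and `log_metric` can be passed instead,
    which keeps the function testable without a tracking server.
    Returns the number of steps that were logged.
    """
    if tracker is None:
        import mlflow  # imported lazily so the sketch loads without MLflow

        tracker = mlflow

    steps = 0
    with tracker.start_run():
        # Hyperparameters are logged once per run.
        tracker.log_params(params)
        # Metrics are logged once per training step.
        for step, loss in enumerate(losses):
            tracker.log_metric("train_loss", loss, step=step)
            steps += 1
    return steps
```

In a real script the `losses` iterable would come from the training loop; here it is only a placeholder for whatever the trainer reports.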
This is the final version. All the added scripts have been tested.
@@ -172,13 +172,21 @@ cython_debug/

# data
*.csv
+!example/rlhf/mlflow/input_rw/ranking.csv
+!example/rlhf/ranking.csv
Maybe these two CSV files can be kept for future users?
They are kept for the demo.
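To make the effect of the `.gitignore` change concrete, the sketch below models the "last matching pattern wins" rule for the patterns in this hunk. It is a deliberately simplified stand-in, not Git's matcher: `is_ignored` is a hypothetical helper that only handles a bare glob such as `*.csv` and a literal path containing `/`. Real Git has further rules, for example a file cannot be re-included when its parent directory is itself excluded.

```python
from fnmatch import fnmatchcase
from typing import List


def is_ignored(path: str, patterns: List[str]) -> bool:
    """Simplified gitignore check: the last matching pattern wins.

    Covers only two pattern shapes: a bare glob (matched against the
    file name at any depth) and a pattern containing "/" (matched
    against the whole path relative to the .gitignore file).
    """
    ignored = False
    for raw in patterns:
        pattern = raw.strip()
        if not pattern or pattern.startswith("#"):
            continue  # blank lines and comments never match
        negated = pattern.startswith("!")
        if negated:
            pattern = pattern[1:]
        # Patterns without a slash apply to the base name only.
        target = path if "/" in pattern else path.rsplit("/", 1)[-1]
        if fnmatchcase(target, pattern):
            ignored = not negated
    return ignored


PATTERNS = [
    "# data",
    "*.csv",
    "!example/rlhf/mlflow/input_rw/ranking.csv",
    "!example/rlhf/ranking.csv",
]
```

Under this model, `is_ignored("example/rlhf/ranking.csv", PATTERNS)` is false because the negated pattern comes after `*.csv`, while any other CSV path stays ignored.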
example/rlhf/demo_rl.ipynb
Outdated
AttributeError                            Traceback (most recent call last)
/home/ubuntu/pykoi/example/rlhf/demo_rl.ipynb Cell 7 line 1
      1 from accelerate import notebook_launcher
      3 config = RLHFConfig(base_model_path="elinas/llama-7b-hf-transformers-4.29", # "elinas/llama-7b-hf-transformers-4.29",
      4                     dataset_type="local_db",
      5                     reward_model_path="goldmermaid/rlhf_reward_model",
   (...)
      9
     10                     )
---> 11 rlhf_step3_rl = RL(config)
     12 rlhf_step3_rl.train("./models/rlhf_step3_rl", num_processes=1)
/home/ubuntu/pykoi/example/rlhf/demo_rl.ipynb Cell 7 line 9
      5 self.num_proc = self._rlhf_config.num_workers if not self._rlhf_config.streaming else None
      6 set_seed(rlhf_config.seed) ## TODO: how to set seed properly in __init__?
      8 self.ppo_config=PPOConfig(
----> 9     steps=self._rlhf_config.total_ppo_epochs,
     10     model_name=self._rlhf_config.base_model_path,
     11     learning_rate=self._rlhf_config.learning_rate,
     12     batch_size=self._rlhf_config.ppo_batch_size,
     13     mini_batch_size=self._rlhf_config.mini_batch_size,
     14     gradient_accumulation_steps=self._rlhf_config.gradient_accumulation_steps,
     15     optimize_cuda_cache=True,
     16     early_stopping=self._rlhf_config.early_stopping,
     17     target_kl=self._rlhf_config.target_kl,
     18     ppo_epochs=self._rlhf_config.ppo_epochs,
     19     seed=self._rlhf_config.seed,
     20     init_kl_coef=self._rlhf_config.init_kl_coef,
     21     adap_kl_ctrl=self._rlhf_config.adap_kl_ctrl,
     22     # accelerator_kwargs=self._rlhf_config.accelerator_kwargs,
     23 )
     25 ## Load the base model and tokenizer and define the PPO Trainer for RL
     26 self.base_tokenizer = self.create_tokenizer(rlhf_config.base_model_path)
AttributeError: 'RLHFConfig' object has no attribute 'total_ppo_epochs'
It seems that there is still an error here. Can you clean up the output and write a to-do note?
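The traceback above fails because the trainer reads `total_ppo_epochs` from a config object that does not define it. One way to surface this kind of mismatch before training starts is to check the config against the attributes the trainer expects. The sketch below uses a hypothetical `DemoConfig` dataclass as a stand-in; it is not pykoi's `RLHFConfig`, and the default values are illustrative only.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class DemoConfig:
    """Hypothetical stand-in for a training config; not pykoi's RLHFConfig."""

    base_model_path: str = "elinas/llama-7b-hf-transformers-4.29"
    learning_rate: float = 1e-5  # illustrative value
    ppo_epochs: int = 4  # illustrative value


def missing_fields(config: object, required: Iterable[str]) -> List[str]:
    """Return the names in `required` that `config` does not define."""
    return [name for name in required if not hasattr(config, name)]


# Attribute names the trainer in the traceback reads from its config.
REQUIRED = ["base_model_path", "learning_rate", "ppo_epochs", "total_ppo_epochs"]
```

Calling `missing_fields(DemoConfig(), REQUIRED)` reports `total_ppo_epochs` as the one absent attribute, which points at either adding the field to the config or renaming the attribute the trainer reads.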
Alternatively, can we remove this step 3 rl notebook example? For step 3, we use the py file, not the notebook file, as the example.
In the mlflow folder, I removed the notebook for step 3 rl.
Error commented. Added a to-do comment in that block.
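As an alternative to commenting out the error by hand, stored outputs can be cleared from the notebook file itself. The sketch below assumes the notebook follows the nbformat 4 JSON layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`); `clear_outputs` is a hypothetical helper name, not part of this PR.

```python
import copy
from typing import Any, Dict


def clear_outputs(notebook: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of an nbformat-4 notebook dict with code outputs removed."""
    cleaned = copy.deepcopy(notebook)  # leave the caller's dict untouched
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []  # drops stored tracebacks and results
            cell["execution_count"] = None
    return cleaned
```

The notebook can be loaded with `json.load`, passed through `clear_outputs`, and written back with `json.dump`, so the committed file no longer carries the stale traceback.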
Love the MLflow example! @larryyin Please make the minor changes above.
Added draft example code to train with MLflow. It ran into a GPU issue. We can discuss it when we meet, maybe on Friday.
This is not the final version, just a point for review.