ml2_trl_simp

Project Machine Learning 2 - Group 10

This repository contain notebooks to implement reinforcement learning on text simplification.

Notebook: ml2_batchsize_15.ipynb produces (TRL-simp) models in the report.
Folder trl-simp-batch15 contains the configuration and pre-trained TRL-simp model exported after finish training.
Notebook: evaluation matrix.ipynb is for evaluation different models

At the time of this experiment, the Transformer Reinforcement learning (TRL) API in HuggingFace was not updated with method AutoModelForSeq2SeqLMWithValueHead for class model. Thus, we installed the library using the Github souce code (pip install trl-0.2.1.tar.gz).

Configuration of the PPO Trainer:

Batch-size: 15
Learning rate: 5e-4
Add customization tuning for memory efficient: Learning rate scheduler, pass SGD optimizer (https://github.com/lvwerra/trl/blob/main/docs/source/customization.mdx)

Run logs for the TRL-simp model are available at this link: https://wandb.ai/ml2_g10/trl/runs/v0dcp47v?workspace=user-anhtth

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
trl-simp-batch15		trl-simp-batch15
README.md		README.md
evaluation.ipynb		evaluation.ipynb
ml2_batchsize_15.ipynb		ml2_batchsize_15.ipynb
ml2_batchsize_15.md		ml2_batchsize_15.md
trl-0.2.1.tar.gz		trl-0.2.1.tar.gz

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ml2_trl_simp

About

Releases

Packages

Languages

anhtth16/ml2_trl_simp

Folders and files

Latest commit

History

Repository files navigation

ml2_trl_simp

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages