Code for the paper Handling Long-Term Safety and Uncertainty in Safe Reinforcement Learning.
We use the uv package manager. If you don't want to use uv, we provide a requirements.txt for manual installation.

```shell
git clone https://github.com/cube1324/d-atacom.git
cd d-atacom/cremini_rl
```
To train D-ATACOM on the Planar Air Hockey environment, run:

```shell
uv run run.py
```

To use different environments or algorithms, modify the run.py file.
The package cremini_rl is based on the mushroom_rl framework and contains the implementation of D-ATACOM as well as several Safe RL baselines.
Currently, D-ATACOM, LagSAC, WCSAC, SafeLayerTD3, CBF-SAC, ATACOM, and IQN-ATACOM are implemented.
To run the algorithms on a new environment, add it to the build_mdp function in cremini_rl/experiment.py. The environment should be a subclass of mushroom_rl.core.Environment. The file cremini_rl/envs/goal_navigation_env.py is an example of an environment wrapper for Safety Gymnasium.
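As a rough illustration of the interface such an environment exposes, here is a numpy-only sketch of a 1-D point mass with a safety cost. This is not the repository's actual wrapper: a real environment must subclass mushroom_rl.core.Environment and provide an MDPInfo, and the `cost` key in the info dict is an assumption about how the safety signal is conveyed.

```python
import numpy as np

class Point1DEnv:
    """Illustrative 1-D point-mass environment sketching the reset/step
    interface a safe-RL environment typically exposes. A real environment
    for this repo must subclass mushroom_rl.core.Environment instead."""

    def __init__(self, goal=1.0, unsafe_region=0.5, horizon=100):
        self.goal = goal
        self.unsafe_region = unsafe_region  # positions beyond this incur cost
        self.horizon = horizon
        self._state = None
        self._t = 0

    def reset(self, state=None):
        self._state = np.zeros(1) if state is None else np.asarray(state, float)
        self._t = 0
        return self._state.copy()

    def step(self, action):
        # simple integrator dynamics: x' = x + dt * u
        self._state = self._state + 0.1 * np.clip(action, -1.0, 1.0)
        self._t += 1
        reward = -abs(self.goal - self._state[0])
        cost = float(self._state[0] > self.unsafe_region)  # safety signal
        absorbing = self._t >= self.horizon
        return self._state.copy(), reward, absorbing, {"cost": cost}

env = Point1DEnv()
s = env.reset()
s, r, done, info = env.step(np.array([1.0]))
```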
For D-ATACOM, IQN-ATACOM, and CBF-SAC, the dynamics of the agent are also required. They should be a subclass of cremini_rl.dynamics.dynamics.ControlAffineSystem.
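A control-affine system has the form dx/dt = f(x) + G(x)u. The sketch below shows this structure for a 2-D single integrator (f(x) = 0, G(x) = I); the method names `f`, `G`, and `step` are illustrative assumptions and not necessarily the actual API of ControlAffineSystem.

```python
import numpy as np

class SingleIntegratorDynamics:
    """Sketch of a control-affine system dx/dt = f(x) + G(x) u for a 2-D
    single integrator. A real implementation for this repo would subclass
    cremini_rl.dynamics.dynamics.ControlAffineSystem instead."""

    dim_x = 2  # state dimension
    dim_u = 2  # control dimension

    def f(self, x):
        # drift term: the single integrator has no drift
        return np.zeros(self.dim_x)

    def G(self, x):
        # control matrix: velocity commands act directly on position
        return np.eye(self.dim_x, self.dim_u)

    def step(self, x, u, dt=0.01):
        # Euler integration of the control-affine dynamics
        return x + dt * (self.f(x) + self.G(x) @ u)

dyn = SingleIntegratorDynamics()
x_next = dyn.step(np.zeros(2), np.array([1.0, -1.0]))
```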
If you find this code useful in your research, please consider citing:

```bibtex
@inproceedings{gunster2024handling,
  title={Handling Long-Term Safety and Uncertainty in Safe Reinforcement Learning},
  author={G{\"u}nster, Jonas and Liu, Puze and Peters, Jan and Tateo, Davide},
  booktitle={Conference on Robot Learning},
  year={2024}
}
```