A simple interface to batch-rl. Expert data for all 60 Atari 2600 environments and 3 continuous control environments.
$ pip install git+https://github.com/indrasweb/expert-offline-rl.git
from eorl import OfflineDataset

ds = OfflineDataset(
    env = 'Pong',           # pass a name from `supported environments` below
    dataset_size = 500000,  # [0, 1e6) frames of Atari
    train_split = 0.9,      # 90% training, 10% held out for testing
    obs_only = False,       # only get observations (no actions, rewards, dones)
    framestack = 1,         # number of frames per sample
    shuffle = False,        # chronological samples if False, randomly sampled if True
    stride = 1,             # return every `stride`-th chunk (chunk size == `framestack`)
    verbose = 1             # 0 = silent, >0 for reporting
)

obs, actions, rewards, dones, next_obs = ds.batch(batch_size=128, split='train')
The dataset is loaded into memory, so a large `dataset_size` needs a correspondingly large amount of RAM. Use < 400k on Colab.
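To make the API concrete, here is a minimal behavioural-cloning sketch built on `ds.batch`. It assumes PyTorch, uint8 image observations, and discrete integer actions; the network and hyperparameters are illustrative, not part of eorl.

import torch
import torch.nn as nn

from eorl import OfflineDataset

ds = OfflineDataset(env='Pong', dataset_size=100000, verbose=0)

# Infer shapes from a sample batch rather than hard-coding them.
obs, actions, *_ = ds.batch(batch_size=128, split='train')
obs_dim = torch.as_tensor(obs[0]).numel()
n_actions = int(actions.max()) + 1  # crude: assumes every action id shows up

policy = nn.Sequential(
    nn.Flatten(),
    nn.Linear(obs_dim, 256),
    nn.ReLU(),
    nn.Linear(256, n_actions),
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for step in range(1000):
    obs, actions, *_ = ds.batch(batch_size=128, split='train')
    x = torch.as_tensor(obs, dtype=torch.float32) / 255.0  # assumes uint8 frames
    y = torch.as_tensor(actions, dtype=torch.long)
    loss = loss_fn(policy(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()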
Continuous control environments (Box2D) - 100k expert steps of each:
LunarLanderContinuous-v2
MountainCarContinuous-v0
BipedalWalker-v3
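Loading one of these is the same call; a sketch, assuming the env name is passed exactly as listed above (the naming convention is an assumption):

from eorl import OfflineDataset

ds = OfflineDataset(env='LunarLanderContinuous-v2')  # name as listed (assumed)
obs, actions, rewards, dones, next_obs = ds.batch(batch_size=128, split='train')
# actions here are continuous vectors, not discrete indices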
All 60 Atari 2600 environments (Discrete) - 1 million steps of each. Each dataset was collected from a DQN agent trained on 200 million frames of the following NoFrameskip-v4 gym environments:
AirRaid
Alien
Amidar
Assault
Asterix
Asteroids
Atlantis
BankHeist
BattleZone
BeamRider
Berzerk
Bowling
Boxing
Breakout
Carnival
Centipede
ChopperCommand
CrazyClimber
DemonAttack
DoubleDunk
ElevatorAction
Enduro
FishingDerby
Freeway
Frostbite
Gopher
Gravitar
Hero
IceHockey
Jamesbond
JourneyEscape
Kangaroo
Krull
KungFuMaster
MontezumaRevenge
MsPacman
NameThisGame
Phoenix
Pitfall
Pong
Pooyan
PrivateEye
Qbert
Riverraid
RoadRunner
Robotank
Seaquest
Skiing
Solaris
SpaceInvaders
StarGunner
Tennis
TimePilot
Tutankham
UpNDown
Venture
VideoPinball
WizardOfWor
YarsRevenge
Zaxxon
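DQN-style agents act on stacks of consecutive frames, and the `framestack` option exposes that layout directly; a sketch, with the resulting array shape being an assumption:

from eorl import OfflineDataset

ds = OfflineDataset(env='Breakout', framestack=4)
obs, actions, rewards, dones, next_obs = ds.batch(batch_size=32, split='train')
# each sample should hold 4 consecutive frames; the exact layout
# (e.g. (32, 4, H, W)) is an assumption, so check obs.shape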
Decompressing to disk on Colab takes >10 minutes, which is annoying, hence we decompress into memory instead. A switch could be added to let the user choose, allowing for a larger dataset size (e.g. grab ~50k samples from disk on demand); a hypothetical sketch follows.
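For illustration only, here is what such a disk-backed mode might look like using a numpy memmap. Nothing below exists in eorl today; the file path and function name are invented.

import numpy as np

# Hypothetical: decompress once to an .npy file, then map it instead of loading it.
# np.load with mmap_mode reads pages from disk on demand rather than into RAM.
frames = np.load('/content/pong_frames.npy', mmap_mode='r')

def sample_from_disk(batch_size=128):
    idx = np.random.randint(0, len(frames), size=batch_size)
    return frames[idx]  # only the touched rows are copied into RAM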