Final Experiments To Do
Tonio Weidler edited this page Jan 21, 2019 · 33 revisions
If there are no significant differences in performance, we will use only a DDQN with prioritized memory to reduce the total number of experiments to run.
Task: Tunnel
Representation-learner = Flatten
memory_delay = 0
init_eps = 1.0
memory_eps = 0.8
min_eps = 0.01
eps_decay = 20000
Policy-learner | Score | Training-time (Note: not on same machine) |
---|---|---|
DDQN | ||
PrioDDQN | 199.1 | 63min |
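The exploration settings listed for the Tunnel task (`init_eps`, `min_eps`, `eps_decay`) suggest an annealed epsilon-greedy policy. This page does not state the decay formula, so the following is only a minimal sketch assuming exponential decay over training steps. The function name `epsilon_at` is hypothetical, and `memory_eps` and `memory_delay` are left out because their exact semantics are not given here.

```python
import math


def epsilon_at(step, init_eps=1.0, min_eps=0.01, eps_decay=20000):
    """Return the exploration rate at a given training step.

    Assumption: epsilon decays exponentially from init_eps towards
    min_eps with time constant eps_decay. The project's actual
    schedule may differ (for example, linear decay).
    """
    return min_eps + (init_eps - min_eps) * math.exp(-step / eps_decay)


if __name__ == "__main__":
    # Print the exploration rate at a few illustrative steps.
    for step in (0, 20000, 100000):
        print(step, round(epsilon_at(step), 4))
```

Under this assumption, a larger `eps_decay` (such as the 40000000 used for the long runs) simply stretches the same curve over more steps.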
I also ran experiments on CartPole using all three architectures for 10,000 episodes.
Policy-learner | Score | Training-time |
---|---|---|
DDQN | 200 | 111.63min |
PrioDDQN | 114.6 | 72.93min |
PrioDuelingDQN | 56.9 | 27min (mostly due to the bad performance) |
Policy-learners: DDQN
Representation-learners: ConvolutionalJanusAE, ConvolutionalCerberusAE, ConvolutionalVariationalAE
Tasks: (Tunnel, Evasion, EvasionWalls), Tunnel, (4 Pathing-tasks)
Train for 1,000,000 episodes
n_hidden=32
memory_delay = 100000
init_eps = 1.0
memory_eps = 0.8
min_eps = 0.01
eps_decay = 40000000
ID | Policy-learner | Representation-learner | Task | Running | Done | Score | On Colab | Actively running on Cluster | Progress |
---|---|---|---|---|---|---|---|---|---|
1 | DDQN | ConvJanus | (Tunnel, Evasion, EvasionWalls) | [x] | [] | | Danni(Local) | | 320K as of 18.00 Sun. |
2 | DDQN | ConvJanus | Tunnel | [x] | [] | | Kevin | | 330k as of 13.00 Sun |
3 | DDQN | ConvCerberus | (Tunnel, Evasion, EvasionWalls) | [x] | [] | | Adrian | till ~23.00 Tue. | 230K as of 18.00 Sun. |
4 | DDQN | ConvCerberus | Tunnel | [] | [] | | Tonio | | |
5 | DDQN | ConvVAE | (Tunnel, Evasion, EvasionWalls) | [x] | [] | | Danni(Azure) | | 320K as of 18.00 Sun. |
6 | DDQN | ConvVAE | Tunnel | [x] | [] | | Alessandro | | 220k as of 19.00 Sun. |
7 | DDQN | ConvJanus | (4 Pathing) | [x] | [] | | | till ~ 10.30 Mon. | 23K as of 18.00 Sun. |
8 | DDQN | ConvCerberus | (4 Pathing) | [] | [] | | | queuing | |
9 | DDQN | ConvVAE | (4 Pathing) | [] | [] | | | | |
10 | DDQN | ConvVAE | (Scrollers) only update DDQN | [] | [] | | | | |
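Experiments 1 to 9 in the table are the combinations of the one policy-learner, the three representation-learners and the three task sets listed above. As a rough sketch, that grid could be enumerated as follows. The dictionary layout, the `episodes` key and the function name `build_experiment_grid` are hypothetical and not taken from the project code, and the ordering does not match the table IDs.

```python
from itertools import product

# Values copied from the settings listed on this page.
POLICY_LEARNERS = ["DDQN"]
REPRESENTATION_LEARNERS = [
    "ConvolutionalJanusAE",
    "ConvolutionalCerberusAE",
    "ConvolutionalVariationalAE",
]
TASK_SETS = [
    ("Tunnel", "Evasion", "EvasionWalls"),
    ("Tunnel",),
    ("4 Pathing-tasks",),
]

# Hyperparameters shared by all runs; "episodes" is a hypothetical key name.
SHARED_CONFIG = {
    "episodes": 1_000_000,
    "n_hidden": 32,
    "memory_delay": 100_000,
    "init_eps": 1.0,
    "memory_eps": 0.8,
    "min_eps": 0.01,
    "eps_decay": 40_000_000,
}


def build_experiment_grid():
    """Return one config dict per (policy, representation, tasks) combination."""
    return [
        {
            "policy_learner": policy,
            "representation_learner": representation,
            "tasks": tasks,
            **SHARED_CONFIG,
        }
        for policy, representation, tasks in product(
            POLICY_LEARNERS, REPRESENTATION_LEARNERS, TASK_SETS
        )
    ]


if __name__ == "__main__":
    for config in build_experiment_grid():
        print(config["representation_learner"], config["tasks"])
```

Experiment 10 (Scrollers, updating only the DDQN) falls outside this grid and would have to be added separately.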
- representation module performance on its own
- single task
- multi task
- DDQN performance on its own