Release Deobfuscation of the code base + pep8 and fixes · hill-a/stable-baselines

Fixed tf.Session().__enter__() being used, rather than creating sess = tf.Session() and passing the session to the objects
Fixed uneven scoping of TensorFlow Sessions throughout the code
Fixed rolling vecwrapper to handle observations that are not only grayscale images
Fixed deepq saving the environment when trying to save itself
Fixed "ValueError: Cannot take the length of Shape with unknown rank." raised in acktr when running the run_atari.py script
Fixed graph conflicts when calling baselines sequentially
Fixed mean on empty array warning with deepq
Fixed kfac eigen decomposition not being cast to float64 when the parameter use_float64 is set to True
Fixed Dataset data loader not correctly resetting its id position if shuffling is disabled
Fixed EOFError when reading from connection in the worker in subproc_vec_env.py
Fixed behavior_clone weight loading and saving for GAIL
Avoid taking the square root of a negative number in trpo_mpi.py
Removed some duplicated code (a2cpolicy, trpo_mpi)
Removed unused, undocumented and crashing function reset_task in subproc_vec_env.py
Reformatted code to PEP8 style
Documented the entire codebase
Added atari tests
Added logger tests
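
The EOFError fix in subproc_vec_env.py listed above concerns a worker that reads commands from a connection whose other end may disappear. The actual patch is not shown in these notes; the following is only a minimal sketch of one common way to handle it, assuming a hypothetical worker function and (command, data) message format, using the standard multiprocessing.Pipe API.

```python
from multiprocessing import Pipe


def worker(remote):
    """Hypothetical worker loop: handle commands until 'close' or the pipe ends."""
    handled = []
    while True:
        try:
            cmd, data = remote.recv()
        except EOFError:
            # The other end was closed without sending 'close': stop quietly
            # instead of crashing the worker with a traceback.
            break
        if cmd == "close":
            break
        handled.append((cmd, data))
    remote.close()
    return handled


# Simulate a parent that sends one command and then goes away.
parent_end, worker_end = Pipe()
parent_end.send(("step", 1))
parent_end.close()
handled = worker(worker_end)
```

Here the worker runs in the same process only to keep the sketch short; in a real vectorized environment it would be the target of a separate process.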
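
The Dataset data loader fix listed above is about the read position not being reset when shuffling is disabled. As an illustration only, assuming a hypothetical Dataset class (the names data, shuffle and next_batch are not taken from the code base), the reset can be kept independent of the shuffle flag:

```python
import random


class Dataset:
    """Hypothetical minimal loader that serves batches in epochs."""

    def __init__(self, data, shuffle=True):
        self.data = list(data)
        self.shuffle = shuffle
        self._next_id = 0
        if self.shuffle:
            random.shuffle(self.data)

    def next_batch(self, batch_size):
        if self._next_id >= len(self.data):
            # Always rewind at the end of an epoch; only the reshuffle
            # depends on the shuffle flag.
            self._next_id = 0
            if self.shuffle:
                random.shuffle(self.data)
        start = self._next_id
        self._next_id = min(start + batch_size, len(self.data))
        return self.data[start:self._next_id]


dataset = Dataset(range(5), shuffle=False)
batches = [dataset.next_batch(2) for _ in range(4)]
```

With shuffling disabled, the fourth batch starts again from the first element rather than coming back empty.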
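
For the square root item in trpo_mpi.py listed above, these notes do not say how the negative argument is avoided. One possible guard, shown here purely as an assumption with a hypothetical helper name, is to clamp the value at zero before taking the root, which covers small negative values caused by rounding error:

```python
import math


def safe_sqrt(value):
    """Hypothetical helper: square root that never sees a negative argument."""
    # Clamp at zero so tiny negative values from floating-point error
    # do not raise ValueError (math.sqrt) or produce NaN (array libraries).
    return math.sqrt(max(value, 0.0))


root_of_four = safe_sqrt(4.0)
root_of_noise = safe_sqrt(-1e-12)
```

Whether clamping, taking the absolute value, or skipping the update is appropriate depends on what the quantity represents in the algorithm.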

Missing: tests for acktr continuous (and for HER and GAIL, but those rely on MuJoCo...)
