General errors in Unitree A1 env. #32

Open
Danfoa opened this issue Jun 10, 2024 · 6 comments

@Danfoa

Danfoa commented Jun 10, 2024

Dear @robfiras,

Thank you very much for your efforts in building this library. I wanted to point out several issues I found with the Unitree A1 environment, which might also be present in other environments.

  1. There is a mismatch between dataset observations and environment observations. Specifically, in the Unitree A1 environment, the heading orientation is returned with a -π/2 bias, which the dataset values do not have. This makes imitation learning impossible.
  2. The desired velocity is set at initialization to the mean velocity of the recorded trajectory. However, the observations returned by the environment do not reflect this average value.
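
A minimal sketch of how a constant heading offset like the one in point 1 could be measured, assuming two hypothetical arrays of heading angles (one sampled from the dataset, one read from the environment after resetting to the same states). The names `wrap_to_pi`, `heading_offset`, `dataset_heading`, and `env_heading` are illustrative and not part of the library; the data here is synthetic.

```python
import numpy as np


def wrap_to_pi(angle):
    """Wrap angles (radians) into the interval [-pi, pi)."""
    return (np.asarray(angle, dtype=float) + np.pi) % (2.0 * np.pi) - np.pi


def heading_offset(env_heading, dataset_heading):
    """Circular mean of the wrapped difference (environment - dataset)."""
    diff = wrap_to_pi(np.asarray(env_heading) - np.asarray(dataset_heading))
    # Averaging on the unit circle avoids artifacts near the +/-pi wrap.
    return float(np.arctan2(np.sin(diff).mean(), np.cos(diff).mean()))


# Synthetic stand-ins for the real arrays (assumption, not library output):
dataset_heading = np.linspace(-3.0, 3.0, 50)
env_heading = wrap_to_pi(dataset_heading - np.pi / 2)  # inject a -pi/2 bias

offset = heading_offset(env_heading, dataset_heading)
print(f"estimated heading offset: {offset:.4f} rad")
```

An estimated offset close to zero would indicate that both sources use the same convention; a value near -π/2 would match the bias described above.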

Additionally, I found it confusing for new users that the observation spec does not match the actual observation space. In other libraries, the observation spec typically serves as a data class to understand the dimensionality and semantic information of each dimension of the MDP state. In your custom use, it appears to be a placeholder for all physical observables from the system. Without documentation of this custom use, it is challenging to follow the codebase. I suggest simplifying or reducing the amount of pre-processing and post-processing of observations to prevent the issues mentioned above.

Thank you for your attention to these matters.

@robfiras
Owner

Hi @Danfoa,

thanks a lot for the valuable feedback! I agree that the A1 environment did not get much love compared to the humanoids ...

Here are some comments:

  1. The _modify_observation_callback is called when creating the observation, but also when creating the dataset, so both should be rotated. Training with the imitation learning scripts also worked for us.
  2. That's a good point. As of now, the goal speed is set to 0.5 by default (which is roughly the mean velocity of the trajectory). I will update this in the next release.

I can understand the confusion about the observation spec. The spec is mainly used to access information in the MuJoCo data structure, but not all the information you want in the observation is in that data structure. Often you want to add custom information such as the goal or some custom foot forces. I tried to make this clearer in the documentation; did you find that helpful? In any case, there will be a major release soon, where I will try to make the observation space clearer in the code as well.
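
To illustrate the distinction, here is a rough sketch under stated assumptions, not the library's actual implementation: a hypothetical spec lists entries that are read from a simulator data dictionary, and a custom goal value is appended afterwards, so the final observation is larger than the spec suggests. All names (`obs_spec`, `sim_data`, `read_from_spec`, `build_observation`) and values are made up for the example.

```python
import numpy as np

# Hypothetical spec: (entry name, dimension). Not the library's real format.
obs_spec = [("q_trunk_yaw", 1), ("dq_trunk_tx", 1)]

# Hypothetical stand-in for values read from the simulator state.
sim_data = {
    "q_trunk_yaw": np.array([0.1]),
    "dq_trunk_tx": np.array([0.3]),
}


def read_from_spec(spec, data):
    """Collect only the entries that the spec describes."""
    return np.concatenate([data[name] for name, _ in spec])


def build_observation(spec, data, goal_speed):
    """Append custom information (here a goal speed) to the spec entries."""
    base = read_from_spec(spec, data)
    return np.concatenate([base, [goal_speed]])


spec_dim = sum(dim for _, dim in obs_spec)
obs = build_observation(obs_spec, sim_data, goal_speed=0.5)
print(spec_dim, obs.shape[0])  # the two sizes differ by the appended goal
```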

@Danfoa
Author

Danfoa commented Jun 10, 2024

@robfiras

I am running the tests now, and I can confirm that the error in the angle is still present. I raised the issue because I can see the difference between the state sampled from the dataset and the state returned by the environment after a reset (to the dataset's initial state).

Also, the velocity of the quadruped never exceeds 0.2 m/s, so the average is always below 0.2 m/s.
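
A small sketch of the kind of check behind this observation, assuming a hypothetical array of planar base positions sampled at a fixed time step. The helper `speed_stats`, the time step, and the synthetic 0.15 m/s trajectory are illustrative assumptions, not measurements from the environment.

```python
import numpy as np


def speed_stats(xy_positions, dt):
    """Return (mean, max) planar speed from finite differences of positions."""
    xy = np.asarray(xy_positions, dtype=float)
    velocity = np.diff(xy, axis=0) / dt          # shape (T - 1, 2)
    speed = np.linalg.norm(velocity, axis=1)     # shape (T - 1,)
    return float(speed.mean()), float(speed.max())


# Synthetic trajectory moving at a constant 0.15 m/s along x (assumption).
dt = 0.01
t = np.arange(0.0, 5.0, dt)
xy = np.stack([0.15 * t, np.zeros_like(t)], axis=1)

mean_speed, max_speed = speed_stats(xy, dt)
goal_speed = 0.5  # default goal speed mentioned earlier in the thread
print(f"mean={mean_speed:.3f} m/s, max={max_speed:.3f} m/s, goal={goal_speed} m/s")
```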

@robfiras
Owner

Alright, I will take a closer look at it. Which environment are you running, "simple" or "hard"?

@Danfoa
Author

Danfoa commented Jun 10, 2024

@robfiras

Hard. Commenting out this -π/2 bias solves the issue with the angle.

For the target velocity, it is unclear to me why you use the mean velocity, instead of the velocity error, or the actual target speed.
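
The three options can be contrasted in a short sketch. The function `goal_features` and the numbers are hypothetical and only meant to show what each candidate goal signal would contain; this is not how the environment currently computes its observation.

```python
import numpy as np


def goal_features(measured_speed, target_speed, trajectory_speeds):
    """Compute three candidate goal signals for the observation."""
    return {
        # Constant for a given dataset, independent of the current state.
        "trajectory_mean": float(np.mean(trajectory_speeds)),
        # The commanded speed itself.
        "target_speed": float(target_speed),
        # Signed tracking error; zero when the robot moves at the target.
        "velocity_error": float(target_speed - measured_speed),
    }


features = goal_features(
    measured_speed=0.2,
    target_speed=0.5,
    trajectory_speeds=[0.4, 0.5, 0.6],
)
print(features)
```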

@Danfoa
Author

Danfoa commented Jun 10, 2024

Hi @robfiras

I was wondering if the expert policy can be made public, so that users can re-create a dataset and evaluate the performance of the policies under mildly different environment conditions.

@robfiras
Owner

Alright, I will check what's going on with that bias.

Yeah, publishing the policies as well (instead of just providing the training script) is on my to-do list. It might take a while to collect all the policies, though.
