[Question] Is it possible to use custom LSTM and transformer models with MlpPolicy ActorCriticPolicy? #1407

Closed · 4 tasks done
HaakonFlaaronning opened this issue Mar 23, 2023 · 2 comments
Labels: duplicate, question

Comments

@HaakonFlaaronning

❓ Question

I want to use PPO and A2C with a custom LSTM and transformer network. PPO only has native support for "MlpPolicy", "CnnPolicy" and "MultiInputPolicy". Can I still use "MlpPolicy" but specify a custom LSTM or transformer network, or can it only be used with networks that have linear layers? Does it mess up the training if I specify an LSTM network?

Checklist

  • I have checked that there is no similar issue in the repo
  • I have read the documentation
  • If code there is, it is minimal and working
  • If code there is, it is formatted using the markdown code blocks for both code and stack traces.
@HaakonFlaaronning HaakonFlaaronning added the question Further information is requested label Mar 23, 2023
@araffin araffin added the duplicate This issue or pull request already exists label Mar 23, 2023
@araffin
Member

araffin commented Mar 24, 2023

"I have checked that there is no similar issue in the repo"

Duplicate of #1387, #1077, #177 and Stable-Baselines-Team/stable-baselines3-contrib#165

For RecurrentPPO, please take a look at SB3 contrib (as written in the doc): https://sb3-contrib.readthedocs.io/en/master/modules/ppo_recurrent.html
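As a minimal usage sketch of what the linked contrib docs describe (the environment and timestep budget below are placeholders, not taken from this thread):

```python
# Minimal sketch: RecurrentPPO from sb3_contrib with its built-in LSTM policy.
# "CartPole-v1" and total_timesteps are illustrative placeholders.
from sb3_contrib import RecurrentPPO

model = RecurrentPPO("MlpLstmPolicy", "CartPole-v1", verbose=1)
model.learn(total_timesteps=10_000)
```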

@Pythoniasm

I can recommend using a custom features extractor that does everything you want. However, you might want to adjust the MLP heads of the critic and actor accordingly, while still using the MlpPolicy (see the sketch below).
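A minimal sketch of that suggestion, assuming a flat Box observation space and a recent SB3 version; the AttentionExtractor class, its sizes, and "CartPole-v1" are illustrative choices, not from this thread. The custom extractor is plugged in via policy_kwargs, and net_arch adjusts the actor/critic MLP heads on top of it:

```python
# Sketch only: a transformer-style features extractor used with the standard MlpPolicy.
# AttentionExtractor, features_dim=64 and "CartPole-v1" are illustrative assumptions.
import torch
import torch.nn as nn

from stable_baselines3 import PPO
from stable_baselines3.common.torch_layers import BaseFeaturesExtractor


class AttentionExtractor(BaseFeaturesExtractor):
    """Illustrative transformer-style features extractor for flat observations."""

    def __init__(self, observation_space, features_dim: int = 64):
        super().__init__(observation_space, features_dim)
        obs_dim = int(observation_space.shape[0])
        self.embed = nn.Linear(obs_dim, features_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=features_dim, nhead=4, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=1)

    def forward(self, observations: torch.Tensor) -> torch.Tensor:
        # Treat each observation as a length-1 sequence; a real model would
        # build a proper sequence (e.g. from stacked past observations) instead.
        x = self.embed(observations).unsqueeze(1)
        return self.encoder(x).squeeze(1)


policy_kwargs = dict(
    features_extractor_class=AttentionExtractor,
    features_extractor_kwargs=dict(features_dim=64),
    # Adjust the actor/critic MLP heads that sit on top of the extractor.
    net_arch=dict(pi=[64], vf=[64]),
)
model = PPO("MlpPolicy", "CartPole-v1", policy_kwargs=policy_kwargs, verbose=1)
model.learn(total_timesteps=10_000)
```

Note that with plain PPO the extractor only sees the current observation, so an LSTM placed here would not carry hidden state across environment steps; keeping and propagating that state is what RecurrentPPO (linked above) handles.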

araffin closed this as not planned on Mar 31, 2023