Another implementation of Randomized #232
@wasowski |
@mohsen-ghaffari1992 |
It should only be created in top-level executable files (which in this project means Spec files)
concrete.simplebandit.Bandit is an Agent.agent.observe (step (s) (a)._1) ∈ ObservableState *** FAILED *** (44 milliseconds)
[info] ArrayIndexOutOfBoundsException was thrown during property evaluation.
Gaussian Simple Bandit seems to fail because of a bug in spire (issue #233). Do we need Gaussian for the paper anywhere, @mohsen-ghaffari1992? |
No, we do not need the Gaussian.
|
OK. Then we fix this later. Now cartpole was failing because of an assertion, but as far as I can see the assertion itself was wrong: the last requirement for the state invariant was require (pv <= PvMin). I switched this to PvMax and now it seems to work, although it is very strange how this mistake survived so long. Cartpole still uses memory intensively: even with 10K episodes I am running out of memory.
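For readers following along, here is a minimal sketch of the kind of invariant mistake described above. Only the expression require (pv <= PvMin) and the switch to PvMax come from the comment; the object name, the two state classes and the numeric bounds are assumptions made up for illustration, not the actual symsim CartPole code.

```scala
import scala.util.Try

// Hypothetical bounds and state classes, for illustration only;
// the real CartPole state in symsim has more fields and other values.
object InvariantSketch {
  val PvMin: Double = -3.0
  val PvMax: Double = 3.0

  // Buggy variant: the upper-bound check compares against PvMin,
  // so every velocity strictly above PvMin is rejected.
  final case class BuggyState(pv: Double) {
    require(pv >= PvMin, s"pv too small: $pv")
    require(pv <= PvMin, s"pv too large: $pv")
  }

  // Fixed variant: the upper bound is PvMax.
  final case class FixedState(pv: Double) {
    require(pv >= PvMin, s"pv too small: $pv")
    require(pv <= PvMax, s"pv too large: $pv")
  }

  // A velocity in the middle of the range fails only in the buggy variant.
  val buggyRejectsZero: Boolean = Try(BuggyState(0.0)).isFailure
  val fixedAcceptsZero: Boolean = Try(FixedState(0.0)).isSuccess
  val fixedRejectsOutOfRange: Boolean = Try(FixedState(5.0)).isFailure
}
```

With the buggy check the only accepted value is pv == PvMin, which may explain why the assertion fired as soon as the state constructor was exercised more broadly.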
|
- otherwise we have OOM errors
So that we can get diagnostic information in more situations, not only when moving (it was previously placed in move)
It was not excluding the right point margin as the spec would indicate
It seems that it was forgetting to initialize at the max edges of the board
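A small sketch of how such an edge can be missed, assuming the sampling function has an exclusive upper bound, as scala.util.Random.between does. Whether Randomized2.between behaves the same way is an assumption here, and BoardWidth = 11 is taken as the largest valid x coordinate purely for illustration.

```scala
import scala.util.Random

object EdgeSketch {
  val BoardWidth = 11 // assumed: largest valid x coordinate

  // Fixed seed so that the draws are reproducible.
  val rng = new Random(42)

  // The upper bound is exclusive, so x == BoardWidth is never drawn.
  val withoutEdge: Set[Int] =
    List.fill(10000)(rng.between(0, BoardWidth)).toSet

  // Adding 1 to the exclusive bound makes the rightmost column reachable.
  val withEdge: Set[Int] =
    List.fill(10000)(rng.between(0, BoardWidth + 1)).toSet
}
```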
Thanks! That is fine.
No problem. We do not evaluate on CartPole.
Best,
Mohsen
From: Andrzej Wąsowski ***@***.***>
Date: Saturday, 13 January 2024 at 14.28
To: itu-square/symsim ***@***.***>
Cc: Mohsen Ghaffari ***@***.***>, Mention ***@***.***>
Subject: Re: [itu-square/symsim] Another implementation of Randomized (PR #232)
OK. Then we fix this later.
Now cartpole was failing because of an assertion, but as far as I can see the assertion was wrong (the last requirement for state invariant was
require (pv <= PvMin)
I switch this to PvMax and now seems to work, but this one still uses memory intensively. Even with 10K episodes I am running out of memory.
1. Do we use cartpole in the paper?
2. With what number of episodes?
3. Did it work before or also timed out?
|
but I think there is something buggy, which makes me question all the results. WindyGrid and CliffWalking are also not able to complete even one episode, and these should work just fine! CliffWalking works very well in ADPRO on an extremely similar design, so something is fishy. |
I checked CliffWalking, and the problem is the initialize function. You check that the state is not final, but sometimes it is outside the board.
I believe WindyGrid has the same issue.
Best,
Mohsen
(Sent by email in reply to Andrzej Wąsowski's comment of Saturday, 13 January 2024 at 15.50.)
|
Hmm... this is weird. The test for validity (CliffWalkingIsAgent) is passing, and we only produce states within the board? Also, I moved the assertions about the states being legal from move to the constructor in CWState, and it never fails, which should indicate that these states are all good? How can you see that this is happening? (I suspected the dreaded function tailRecM in Randomized2, but after looking at it for several hours, I am starting to believe that it is correct). |
I just printed the states that are sent to the step function and noticed that the value is out of bounds. Since the state is created by the initialize function, we should check it there. The point is that we check that the state is not final, but we do not check whether it is valid!
Best,
Mohsen
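The distinction between "not final" and "valid" raised here can be sketched as follows. This is not the symsim code: it uses plain scala.util.Random with rejection sampling instead of Randomized2, and Pos, isValid, isFinal, the board size and the deliberately too-wide sampling range are all assumptions chosen to make the point visible.

```scala
import scala.util.Random

object InitSketch {
  val BoardWidth = 11 // assumed largest valid x
  val BoardHeight = 3 // assumed largest valid y

  final case class Pos(x: Int, y: Int)

  // A state is valid if it lies on the board.
  def isValid(p: Pos): Boolean =
    p.x >= 0 && p.x <= BoardWidth && p.y >= 0 && p.y <= BoardHeight

  // Assumed layout: the bottom row except the start corner is final
  // (cliff cells and goal).
  def isFinal(p: Pos): Boolean = p.y == 0 && p.x > 0

  // Candidates are drawn from a range that is one cell too wide on
  // every side, so filtering on !isFinal alone would let invalid
  // positions through. Filtering on both conditions does not.
  def initialize(rng: Random): Pos =
    Iterator
      .continually(
        Pos(rng.between(-1, BoardWidth + 2), rng.between(-1, BoardHeight + 2)))
      .filter(p => isValid(p) && !isFinal(p))
      .next()

  val samples: List[Pos] = {
    val rng = new Random(7)
    List.fill(1000)(initialize(rng))
  }
}
```

If the sampling range is already exactly the board, the extra isValid filter is redundant, which is consistent with the constructor assertions never firing.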
(Sent by email in reply to Andrzej Wąsowski's comment of Saturday, 13 January 2024 at 20.00.)
|
But maybe I do not understand what valid states are.
// from CliffWalking
def initialize: Randomized2[CWState] = { for
  x <- Randomized2.between (0, BoardWidth + 1)
  y <- Randomized2.between (0, BoardHeight + 1)
  s = CWState (x, y)
yield s }.filter { !this.isFinal (_) }
I do not understand how this can create states outside the board. Also, I now have assertions that check the state when it is created in CWState, so if I try to create an illegal state everything crashes with an exception, but I do not see these exceptions. For example:
scala> CWState(20,20)
java.lang.IllegalArgumentException: requirement failed: Out-Of-Width x: ¬(20 ≤ 11)
  at scala.Predef$.require(Predef.scala:337)
  at symsim.examples.concrete.cliffWalking.CWState.<init>(CliffWalking.scala:18)
  at symsim.examples.concrete.cliffWalking.CWState$.apply(CliffWalking.scala:16)
  ... 42 elided
So I think this is not happening. We already had this problem that everything was hanging during the first episode (or, as it seems after some debugging, possibly during the second episode). But I do not remember what the reason was. |
Oh - some progress. I just discovered that I was misguided. It is the evaluation that is hanging, not learning! I have not been looking at the evaluation code, as I was sure it was learning. I will look into eval. It might be the same problem as with maze, that the early policies are bad and they need a timeout. |
If it is hanging, then the issue is probably the same.
When I was testing, it reported an error that I traced to the require in the move function. That is how I came to suspect initialize. However, the error I am talking about could also be due to moving from one step to another, because we do not check that the state returned by move is valid.
Best,
Mohsen
(Sent by email in reply to Andrzej Wąsowski's comment of Saturday, 13 January 2024 at 20.29.)
|
Now we check it every time (in cliffwalking) when a new state is constructed. So this checks both when moving and when initializing. I will look into eval. |
[like] Mohsen Ghaffari reacted to Andrzej Wąsowski's message of Saturday, January 13, 2024, 7:36:34 PM.
|
This seems to be an old bug! It was not initializing in the rightmost column.
Otherwise it is difficult to write some tests (because class members are private for non-case classes apparently)
OK. I added time horizon to mountaincar, cliffwalking, and windygrid. Now everything seems to behave. I think this is ready for your experiments. I hope this branch will allow all the experiments you need. I definitely hit heap problems still, but I believe we can handle larger cases now. Some tests are still failing (about 10). All of them are due to randomness or the spire bug mentioned above (with Gaussian). I will stop now and move to other things. |
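To illustrate what a time horizon buys during evaluation, here is a minimal sketch on a made-up one-dimensional environment. Everything in it (the environment, step, evaluate and the horizon value) is hypothetical and is not how symsim implements the horizon; it only shows why a bad early policy no longer makes an episode hang.

```scala
import scala.annotation.tailrec

object HorizonSketch {
  // Hypothetical environment: states 0..Goal on a line, Goal is final.
  val Goal = 5
  def isFinal(s: Int): Boolean = s == Goal

  // Returns (next state, reward); actions are -1 or +1 (assumed).
  def step(s: Int, a: Int): (Int, Double) = {
    val next = math.max(0, math.min(Goal, s + a))
    (next, if (isFinal(next)) 0.0 else -1.0)
  }

  // Runs one evaluation episode and gives up after `horizon` steps.
  // Returns the accumulated reward and whether a final state was reached.
  def evaluate(policy: Int => Int, start: Int, horizon: Int): (Double, Boolean) = {
    @tailrec
    def loop(s: Int, steps: Int, total: Double): (Double, Boolean) =
      if (isFinal(s)) (total, true)
      else if (steps >= horizon) (total, false)
      else {
        val (next, r) = step(s, policy(s))
        loop(next, steps + 1, total + r)
      }
    loop(start, 0, 0.0)
  }

  // A policy that always moves right reaches the goal in five steps.
  val good: (Double, Boolean) = evaluate(_ => 1, 0, 100)
  // A policy that always moves left never terminates on its own;
  // the horizon cuts the episode off after 100 steps.
  val bad: (Double, Boolean) = evaluate(_ => -1, 0, 100)
}
```

Without the steps >= horizon branch, the second call would loop forever, which matches the symptom of evaluation hanging on poor early policies.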
Thanks Andrzej!
You put real effort into this; I appreciate it.
Best,
Mohsen
(Sent by email in reply to Andrzej Wąsowski's comment of Saturday, 13 January 2024 at 22.50.)
|
One test fails, but this is the same bug in spire as others
@mohsen-ghaffari1992 in a push to resolve bug #59, I have proposed replacing Randomized with Randomized2, which is based on probula. This has actually uncovered a design flaw in Randomized (it was not properly abstracted, so its design details were leaking into symsim code). As a result, this required a massive amount of changes.
I had no time to try the entire test suite, so I do not even know if this is correct (and it may have a positive or negative impact on convergence).
But simple maze seems to work, and I was able to run a million episodes. So try using this branch for your experiments.