Looking only at the docstrings of the relevant functions, I think I noticed a discrepancy with the paper. I am writing this without checking the math in the code, so I may be wrong.
`V`, returned by `RbfController.compute_action()` in `controllers.py`, corresponds to Cov[x,u].
Tracing back to `MGPR.predict_given_factorizations()` in `models/mgpr.py`, I think the docstrings indicate that:
`V = cov[x,x]^{-1} @ cov[x,pi] @ cov[pi,u]`
where pi denotes the action before squashing.
Section 5.5 of the 2015 paper instead gives:
`V = cov[x,pi] @ cov[pi,pi]^{-1} @ cov[pi,u]`
Are these expressions equivalent, or have I misread something? Thanks!
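For what it's worth, the two expressions exactly as I have written them above do not agree for arbitrary covariances. Here is a minimal NumPy sketch using made-up toy values (x in R^2, pi and u in R^1). Nothing in it is taken from the repository or the paper; the variable names and numbers are my own assumptions, and it only compares the two formulas as quoted, not what the code actually computes.

```python
import numpy as np

# Hypothetical toy covariance blocks, chosen only for illustration.
S_xx = np.array([[2.0, 0.0],
                 [0.0, 4.0]])      # cov[x, x]
S_xpi = np.array([[1.0],
                  [1.0]])          # cov[x, pi]
S_pipi = np.array([[4.0]])         # cov[pi, pi]
S_piu = np.array([[1.0]])          # cov[pi, u]

# Expression as I read it from the docstrings:
#   cov[x,x]^{-1} @ cov[x,pi] @ cov[pi,u]
v_docstring = np.linalg.solve(S_xx, S_xpi) @ S_piu

# Expression as I read it from Section 5.5 of the paper:
#   cov[x,pi] @ cov[pi,pi]^{-1} @ cov[pi,u]
v_paper = S_xpi @ np.linalg.solve(S_pipi, S_piu)

print(v_docstring.ravel())   # entries 0.5 and 0.25
print(v_paper.ravel())       # entries 0.25 and 0.25
print(np.allclose(v_docstring, v_paper))   # False for these values
```

So unless the code and the paper use different conventions for what `V` denotes at each step, the two formulas as quoted are not interchangeable.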