Computation of cross-covariance of state and action #35

dvtailor · 2019-10-07T03:04:36Z

From looking only at the docstrings of the relevant functions, I think I noticed a discrepancy with the paper. I am writing this without checking the math in the code, so I may be wrong.

The V returned by RbfController.compute_action() in controllers.py corresponds to Cov[x,u].

From backtracking to MGPR.predict_given_factorizations() in models/mgpr.py, I think the docstrings indicate that:

V = cov[x,x]^{-1} @ cov[x,pi] @ cov[pi,u]

where pi denotes the action before squashing.

Section 5.5 of the 2015 paper says:

V = cov[x,pi] @ cov[pi,pi]^{-1} @ cov[pi,u]

Are these expressions equivalent, or have I misread something? Thanks!
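
As a quick sanity check on whether the two forms can agree, here is a minimal sketch on a toy 1-D linear-Gaussian chain. The chain (x, then pi = a*x + noise, then u = b*pi + noise) and every number in it are assumptions made up for illustration; nothing here is taken from the PILCO code, and scalars stand in for the matrices.

```python
import math

# Hypothetical 1-D chain, chosen only for illustration:
#   x ~ N(0, s_xx),  pi = a * x + e1,  u = b * pi + e2
# with e1, e2 independent zero-mean noise terms.
s_xx = 2.0  # Var[x]
a = 0.5     # gain from x to pi
q = 0.1     # Var[e1]
b = 3.0     # gain from pi to u

cov_x_pi = s_xx * a           # Cov[x, pi]
cov_pi_pi = a * s_xx * a + q  # Var[pi]
cov_pi_u = cov_pi_pi * b      # Cov[pi, u]
cov_x_u = s_xx * a * b        # Cov[x, u], derived directly from the chain

# Form read off the docstrings: cov[x,x]^{-1} @ cov[x,pi] @ cov[pi,u]
v_docstring = (1.0 / s_xx) * cov_x_pi * cov_pi_u

# Form quoted from the paper: cov[x,pi] @ cov[pi,pi]^{-1} @ cov[pi,u]
v_paper = cov_x_pi * (1.0 / cov_pi_pi) * cov_pi_u

print("Cov[x,u] from the chain:", cov_x_u)
print("docstring form:", v_docstring)
print("paper form:", v_paper)
```

In this toy chain only the paper's form reproduces Cov[x,u]; the docstring form gives a different number. So the two expressions are not the same in general, unless a factor is absorbed somewhere else in the code.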
