diff --git a/_posts/2023-08-10-dreamer_v3.md b/_posts/2023-08-10-dreamer_v3.md
index c0f0676..1681959 100644
--- a/_posts/2023-08-10-dreamer_v3.md
+++ b/_posts/2023-08-10-dreamer_v3.md
@@ -39,7 +39,7 @@ $$ \text{symlog}(x) \doteq \text{sign}(x) \ln \left(\lvert x \rvert + 1\right) $
 
 Given the neural network prediction, it is possible to obtain the non-transformed target by appling the inverse transformation (i.e., the *symexp*):
 
-$$ \text{symexp}(x) \doteq \text{sign}(x) \left(\exp \left(\lvert x \rvert \right) + 1\right) $$
+$$ \text{symexp}(x) \doteq \text{sign}(x) \left(\exp \left(\lvert x \rvert \right) - 1\right) $$
 
 The last detail to report is that the symlog prediction is used in the decoder, the reward model and the critic. Moreover, the inputs of the MLP encoder (the one that encodes observations in vector form) are squashed with the *symlog* function ([Figure 2](#fig-symlog){: .fig-link}).
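
A quick sanity check of the sign in the symexp formula, as a minimal sketch in plain Python. The function names (`sign`, `symlog`, `symexp`, `symexp_with_plus`) are illustrative and are not taken from the post or from any Dreamer codebase. The sketch evaluates the two formulas shown in the diff and checks which one inverts symlog.

```python
import math


def sign(x: float) -> float:
    """Return -1, 0 or 1 according to the sign of x."""
    return float((x > 0) - (x < 0))


def symlog(x: float) -> float:
    """symlog(x) = sign(x) * ln(|x| + 1)."""
    return sign(x) * math.log(abs(x) + 1.0)


def symexp(x: float) -> float:
    """symexp(x) = sign(x) * (exp(|x|) - 1), the '-' variant from the diff."""
    return sign(x) * (math.exp(abs(x)) - 1.0)


def symexp_with_plus(x: float) -> float:
    """The '+' variant from the diff, kept only for comparison."""
    return sign(x) * (math.exp(abs(x)) + 1.0)


# The '-' variant round-trips: symexp(symlog(x)) recovers x.
for x in (-100.0, -1.5, 0.0, 0.5, 1000.0):
    assert math.isclose(symexp(symlog(x)), x, rel_tol=1e-9, abs_tol=1e-12)

# The '+' variant does not: for x = 1 it yields exp(ln 2) + 1 = 3.
assert math.isclose(symexp_with_plus(symlog(1.0)), 3.0)
```

Since exp(ln(|x| + 1)) = |x| + 1, subtracting 1 recovers |x|, whereas adding 1 offsets every nonzero input by 2 in magnitude.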