From c05354d3f3f39b57c1c9f6b408b1127147279f03 Mon Sep 17 00:00:00 2001
From: michele-milesi <74559684+michele-milesi@users.noreply.github.com>
Date: Tue, 12 Dec 2023 11:53:09 +0100
Subject: [PATCH] fix: code copy button (#20)

---
 _posts/2023-05-16-functionality-checks.md |  2 ++
 _posts/2023-05-17-welcome.md              |  4 ++--
 _posts/2023-07-06-dreamer_v2.md           |  4 ++++
 _posts/2023-08-10-dreamer_v3.md           |  4 ++++
 assets/js/main.js                         | 11 ++++++++++-
 5 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/_posts/2023-05-16-functionality-checks.md b/_posts/2023-05-16-functionality-checks.md
index 6905b67..069d36b 100644
--- a/_posts/2023-05-16-functionality-checks.md
+++ b/_posts/2023-05-16-functionality-checks.md
@@ -14,6 +14,7 @@ subclass: 'post'
 ---
 
 # Available Functionalities
+<div class="with-new-line">
 
 ```python
@@ -78,6 +79,7 @@ trainer = pl.Trainer(gpus=4, num_nodes=8, precision=16, limit_train_batches=0.5)
 trainer.fit(model, train_loader, val_loader)
 ```
+</div>
 
 # Latex Formulas

diff --git a/_posts/2023-05-17-welcome.md b/_posts/2023-05-17-welcome.md
index 52c583e..61816e9 100644
--- a/_posts/2023-05-17-welcome.md
+++ b/_posts/2023-05-17-welcome.md
@@ -35,8 +35,8 @@ Picture this: Within a mere five minutes, you'll have your first agent trained a
 <div>
-
- +
+ git clone https://github.com/Eclectic-Sheep/sheeprl.git cd sheeprl python3.10 -m venv .venv diff --git a/_posts/2023-07-06-dreamer_v2.md b/_posts/2023-07-06-dreamer_v2.md index 43a6436..9c69513 100644 --- a/_posts/2023-07-06-dreamer_v2.md +++ b/_posts/2023-07-06-dreamer_v2.md @@ -93,6 +93,8 @@ Our PyTorch implementation aims to be a simple, scalable and well-documented rep As an example, the implementation of the *KL balancing* directly follows the equation above: +
+
 
 ```python
 from torch.distributions import Independent, OneHotCategoricalStraightThrough
@@ -108,6 +110,8 @@ rhs = kl_divergence(
 kl_loss = alpha * lhs + (1 - alpha) * rhs
 ```
+</div>
+
 
 Do you want to know more about how we implemented Dreamer-V2? Check out [our implementation](https://github.com/Eclectic-Sheep/sheeprl/tree/main/sheeprl/algos/dreamer_v2){:target="_blank"}.
 
 ### References

diff --git a/_posts/2023-08-10-dreamer_v3.md b/_posts/2023-08-10-dreamer_v3.md
index b450980..da236f7 100644
--- a/_posts/2023-08-10-dreamer_v3.md
+++ b/_posts/2023-08-10-dreamer_v3.md
@@ -68,6 +68,8 @@ $$
 
 #### Uniform Mix
 
 To prevent spikes in the KL loss, the categorical distributions (the one for discrete actions and the one for the posteriors/priors) are parametrized as mixtures of $1\%$ uniform and $99\%$ neural network output. This avoids the distributions becoming nearly deterministic. To implement the *uniform mix*, we applied the *uniform mix* function to the logits returned by the neural networks.
 
+<div class="with-new-line">
+
 
 ```python
 import torch
 from torch import Tensor
@@ -86,6 +88,8 @@ def uniform_mix(self, logits: Tensor, unimix: float = 0.01) -> Tensor:
     return logits
 ```
+</div>
+
 
 #### Return regularizer for the policy
 
 The main difficulty in Dreamer-V2's *actor learning phase* is choosing the entropy regularizer, which heavily depends on the scale and frequency of the rewards. To use a single entropy coefficient, it is necessary to normalize the returns using moving statistics. In particular, the authors found it more convenient to scale down large rewards without scaling up small rewards, to avoid adding noise.

diff --git a/assets/js/main.js b/assets/js/main.js
index 8e6ddce..d7c0e15 100644
--- a/assets/js/main.js
+++ b/assets/js/main.js
@@ -49,7 +49,8 @@ $(document).ready(function () {
   });
 
   // Document Ctrl + C
-  const sources = document.querySelectorAll("code:not(.with-new-line)");
+  const sources = document.querySelectorAll(":not(.with-new-line) code");
+  const sources_new_line = document.querySelectorAll(".with-new-line code");
 
   sources.forEach(source => {
     source.addEventListener("copy", (event) => {
@@ -58,4 +59,12 @@ $(document).ready(function () {
     });
   });
+
+  sources_new_line.forEach(source => {
+    source.addEventListener("copy", (event) => {
+      const selection = document.getSelection();
+      event.clipboardData.setData("text/plain", selection.toString());
+      event.preventDefault();
+    });
+  });
 });
\ No newline at end of file
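The behavioural split this patch introduces can be sketched as a pure function. The `copiedText` helper below is hypothetical, not part of the repo: the patch shows that blocks under `.with-new-line` copy the selection verbatim (`selection.toString()`), while the body of the pre-existing handler for other code blocks is elided from the diff, so the newline-collapsing branch is only an assumption suggested by the class name.

```javascript
// Hypothetical sketch of the two copy behaviours distinguished by the patch.
// The verbatim branch mirrors the handler the patch adds; the collapsing
// branch is an ASSUMPTION about the elided pre-existing handler.
function copiedText(selectionText, insideWithNewLine) {
  if (insideWithNewLine) {
    // Handler added by the patch: keep the selection as-is,
    // so multi-line snippets keep their line breaks when pasted.
    return selectionText;
  }
  // Assumed behaviour of the original handler: collapse line breaks
  // so an inline snippet copies as a single shell-friendly line.
  return selectionText.replace(/\n+/g, " ").trim();
}

const snippet = "git clone https://github.com/Eclectic-Sheep/sheeprl.git\ncd sheeprl";
console.log(copiedText(snippet, true));  // line breaks preserved
console.log(copiedText(snippet, false)); // collapsed to one line
```

Note also the selector change: `code:not(.with-new-line)` required the class on each `<code>` element itself, whereas `:not(.with-new-line) code` keys off an ancestor, which is why the posts wrap their code blocks in a `div` instead of tagging every `<code>`.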