line 244 in meta_neural_network_architectures.py #24
Comments
This is the same as issue #3. My understanding is that this will affect the results reported in the paper. The code as written will always use the batch statistics, not a running average accumulated per step.
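For reference, the per-step running average mentioned here is the per-step batch-norm statistics trick from the MAML++ paper: each inner-loop step keeps its own running buffers, updated with the usual exponential-moving-average rule. A minimal illustration (the names and values below are illustrative, not the repository's actual code):
import torch
num_steps, C, momentum = 5, 16, 0.1  # illustrative values: inner-loop steps, channels, BN momentum
per_step_mean = torch.zeros(num_steps, C)  # one running mean per inner-loop step
per_step_var = torch.ones(num_steps, C)    # one running variance per inner-loop step
# after computing batch_mean / batch_var at inner-loop step `step`:
step, batch_mean, batch_var = 0, torch.randn(C), torch.rand(C) + 0.5
per_step_mean[step] = (1 - momentum) * per_step_mean[step] + momentum * batch_mean
per_step_var[step] = (1 - momentum) * per_step_var[step] + momentum * batch_var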
What you stated was what I thought was the case. However, after doing a few tests I found that what I stated previously was the right way to go. Check for yourself.
Thanks for the quick response! The following is a short script that demonstrates my assertion. If you have tests that show otherwise, it would be great to see them.
I can definitely confirm that it was the case back in 2018. I'll need to reconfirm with the latest versions of PyTorch. Will come back to you soon.
…On Fri, 10 Apr 2020 at 17:43, jfb54 wrote:
import torch
import torch.nn.functional as F
N = 64 # batch size
C = 16 # number of channels
H = 32 # image height
W = 32 # image width
eps = 1e-05
input = 10 * torch.randn(N, C, H, W) # create a random input
running_mean = torch.zeros(C) # set the running mean for all channels to be 0
running_var = torch.ones(C) # set the running variance for all channels to be 1
# Call batch norm with training=False. Expect that the input is normalized with the running mean and running variance
output = F.batch_norm(input, running_mean, running_var, weight=None, bias=None, training=False, momentum=0.1, eps=eps)
# Assert that the output is equal to the input (running mean 0 and running variance 1 leave it essentially unchanged)
assert torch.allclose(input, output)
# Call batch norm with training=True. Expect that the input is normalized with batch statistics of the input.
output_bn = F.batch_norm(input, running_mean, running_var, weight=None, bias=None, training=True, momentum=0.1, eps=eps)
# Normalize the input manually (batch norm normalizes with the biased variance estimate, hence unbiased=False)
batch_mean = torch.mean(input, dim=(0, 2, 3), keepdim=True)
batch_var = torch.var(input, dim=(0, 2, 3), keepdim=True, unbiased=False)
output_manual = (input - batch_mean) / torch.sqrt(batch_var + eps)
# Assert that output_bn equals output_manual
assert torch.allclose(output_bn, output_manual)
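A possible extension of the script above (not part of the original comment) would also check the other half of the behaviour: the training=True call updates running_mean and running_var in place via the momentum rule, yet those running statistics are not what normalized the output:
# The running buffers are no longer their initial values after the training=True call...
assert not torch.allclose(running_mean, torch.zeros(C))
assert not torch.allclose(running_var, torch.ones(C))
# ...but normalizing with them does not reproduce output_bn, which used the batch statistics.
output_running = (input - running_mean.view(1, C, 1, 1)) / torch.sqrt(running_var.view(1, C, 1, 1) + eps)
assert not torch.allclose(output_bn, output_running)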
Original issue:
Thank you for releasing the code.
I notice that the function
def forward(self, input, num_step, params=None, training=False, backup_running_statistics=False)
has a training indicator. However, within the function (line 244), should training always be set to True? Does this affect the results reported in the original paper, since batch norm per step appears to be an important trick for improving MAML in the paper?
Many thanks.
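A rough sketch of what the question seems to be driving at, assuming a per-step batch-norm layer; the class, attribute names, and signature below are illustrative guesses, not the repository's actual code at line 244:
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerStepBatchNorm2d(nn.Module):
    """Hypothetical sketch of a per-step batch-norm layer (not the repository's implementation)."""
    def __init__(self, num_features, num_steps, momentum=0.1, eps=1e-5):
        super().__init__()
        self.momentum, self.eps = momentum, eps
        # one running mean/variance buffer per inner-loop step
        self.register_buffer("running_mean", torch.zeros(num_steps, num_features))
        self.register_buffer("running_var", torch.ones(num_steps, num_features))

    def forward(self, x, num_step, training=False):
        # the point of the issue: pass the caller's training flag through instead of
        # hard-coding training=True, so evaluation uses the stored per-step statistics
        return F.batch_norm(x, self.running_mean[num_step], self.running_var[num_step],
                            weight=None, bias=None, training=training,
                            momentum=self.momentum, eps=self.eps)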