Dear author,
Hello.
I am training DINO with Swin-S as the backbone. My configuration matches yours (4 GPUs), except my batch size is halved to 8, so I also halved the initial learning rate. However, the training results are all 0, and the log shows:
d2.checkpoint.c2_model_loading WARNING: Shape of norm.weight in checkpoint is torch.Size([768]), while shape of necks.norm.weight in model is torch.Size([256])
d2.checkpoint.c2_model_loading WARNING: Shape of norm.weight in checkpoint is torch.Size([768]), while shape of transformer.decoder.norm.weight in model is torch.Size([256])

I downloaded the pretrained weights directly from the website. Could this be the cause?
Any advice would be much appreciated!
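One way to check what the downloaded file actually contains is to load it and print the shapes of the `norm` keys. A minimal sketch is below; the file name is hypothetical, and a tiny dummy state dict stands in for the real checkpoint so the snippet runs on its own. Note that official Swin releases typically nest the weights under a `"model"` key.

```python
import torch

# In practice, load the downloaded file, e.g.:
#   ckpt = torch.load("swin_small_patch4_window7_224.pth", map_location="cpu")  # hypothetical name
#   state = ckpt.get("model", ckpt)  # official Swin releases nest weights under "model"
# A dummy state dict stands in for the real checkpoint here:
state = {
    "norm.weight": torch.empty(768),                     # backbone's final LayerNorm
    "layers.0.blocks.0.norm1.weight": torch.empty(96),   # per-block LayerNorm
}

# Collect the shape of every norm-related key for inspection.
norm_shapes = {k: tuple(v.shape) for k, v in state.items() if "norm" in k}
print(norm_shapes)
```

The backbone's top-level `norm.weight` (size 768) has no counterpart of the same shape in the detector's neck or decoder (size 256), so a shape-mismatch warning during loading usually just means those detector-specific layers are left randomly initialized rather than indicating a corrupt checkpoint.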
Also, I don't think you need to halve the batch size and learning rate: you can enable gradient checkpointing to lower GPU memory usage and keep the batch size the same during training.
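As a sketch of the idea (not the repo's actual config option), gradient checkpointing with `torch.utils.checkpoint` recomputes a block's activations during the backward pass instead of storing them, trading extra compute for lower memory. The `Block`/`Backbone` modules below are illustrative stand-ins, not the real Swin code:

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class Block(nn.Module):
    """Toy transformer-style block standing in for a Swin block."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return self.net(x)

class Backbone(nn.Module):
    def __init__(self, dim=64, depth=4, use_checkpoint=True):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(depth))
        self.use_checkpoint = use_checkpoint

    def forward(self, x):
        for blk in self.blocks:
            if self.use_checkpoint and self.training:
                # Activations inside blk are recomputed in backward
                # instead of being kept in memory.
                x = checkpoint(blk, x, use_reentrant=False)
            else:
                x = blk(x)
        return x

model = Backbone().train()
out = model(torch.randn(2, 64))
out.sum().backward()  # gradients flow through the checkpointed blocks
```

With this enabled, peak activation memory scales with one block instead of the whole depth, which is usually enough to keep the original batch size (and hence the original learning-rate schedule) on smaller GPUs.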