Pretraining DCT #26
Hi,
Thanks!
So you first pretrained the DCT stream as a binary classification problem and then transferred it to train the segmentation model, which is what this repo contains. Is that right? In that case, I understand I should modify the training file to use the network_DCT_cl model.
For the first question, you are right. Just use any binary classification code to train the DCT stream. I used a learning rate of 0.05, decayed by a factor of 0.1 every 10 epochs, for 30 epochs in total. The optimizer was SGD with a momentum of 0.9.
I used this repo for pretraining: https://github.com/HRNet/HRNet-Image-Classification
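For anyone reproducing this, here is a minimal sketch of the learning-rate schedule as I read it from the description above: 0.05 for epochs 0-9, 0.005 for epochs 10-19, and 0.0005 for epochs 20-29. The function name `step_lr` and the zero-based epoch indexing are my own assumptions, not taken from the repo. In PyTorch the same behaviour should correspond to `torch.optim.SGD` with `momentum=0.9` combined with `torch.optim.lr_scheduler.StepLR(step_size=10, gamma=0.1)`.

```python
# Hypothetical helper (not from the repo): step decay of the learning rate.
def step_lr(epoch, base_lr=0.05, step_size=10, gamma=0.1):
    """Return the learning rate for a zero-based epoch index."""
    # The rate is multiplied by gamma once for every completed block of step_size epochs.
    return base_lr * gamma ** (epoch // step_size)


TOTAL_EPOCHS = 30  # training length stated in the thread

# Learning rate used in each of the 30 epochs.
schedule = [step_lr(epoch) for epoch in range(TOTAL_EPOCHS)]

if __name__ == "__main__":
    for epoch in (0, 9, 10, 19, 20, 29):
        print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.4f}")
```

Whether the decay is applied at exactly epochs 10 and 20 depends on the classification code you use, so treat the boundaries as an assumption.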
And are the weights of the classification model pretrained on the Park et al. data (DCT_only_v2.pth.tar) available for download? If not, could you make them public so that I can compare them with my results?
It's pretrained weights / DCT_djpeg.pth.tar
Hi,
Thank you for the code.
I was trying to understand how you do the pretraining step using only the DCT stream.
You say in the paper that you use data from Park et al. containing single and double compressed images. Regarding pretraining, I have a couple of questions that I hope you can help me with:
Looking at the code, I see that in Splicing/data/AbstractDataset.py you return a tensor with the qtables and the gt masks. But for the DJPEG annotations you receive an integer 0 or 1 instead of a mask, so when the tensor is created in the _create_tensor function, it throws an AttributeError: 'int' object has no attribute 'shape' at line 133. Shouldn't this code create a mask full of zeros when the image is single compressed and full of ones when it is double compressed?
If these images are 256x256, doesn't padding them to 512x512 with pixel values of 127.5 affect performance during pretraining? The masks would have 1/4 of their pixels set to 0 or 1 and the remaining 3/4 set to gray values. Maybe I didn't understand it correctly.
Thank you.
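To make the two questions concrete, here is a small sketch of what I had in mind. It is not the repo's code: `label_to_mask` and `labelled_fraction` are hypothetical helpers, and plain nested lists stand in for the arrays the dataset would actually use. The first expands an image-level label into a constant mask, and the second computes how much of a padded canvas is covered by the original image.

```python
# Hypothetical helpers illustrating the two questions; not taken from the repo.
def label_to_mask(label, height, width):
    """Expand an image-level label (0 = single, 1 = double compressed) into a constant mask."""
    if label not in (0, 1):
        raise ValueError("label must be 0 or 1")
    return [[label] * width for _ in range(height)]


def labelled_fraction(content_size, padded_size):
    """Fraction of the padded canvas that is covered by the original image."""
    content_h, content_w = content_size
    padded_h, padded_w = padded_size
    return (content_h * content_w) / (padded_h * padded_w)


# A double compressed 256x256 image would get a mask full of ones.
mask = label_to_mask(1, 256, 256)

# Only a quarter of a 512x512 canvas comes from the 256x256 image.
fraction = labelled_fraction((256, 256), (512, 512))

if __name__ == "__main__":
    print(len(mask), len(mask[0]), fraction)
```

With a mask like this, `_create_tensor` would receive something that has a shape, which is where the integer label currently fails.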