
The reason why constant loss #2

Open
weiweisunWHU opened this issue Jun 14, 2017 · 30 comments

@weiweisunWHU

Hello Wei,
I prepared the data and trained the model without changing anything, but the loss converges to 3.69. I then changed the initial learning rate, but obtained the same converged loss (3.69). Do you know what the problem might be? Also, could you please provide the trained weights? Thanks a lot.

@WeiTang114
Owner

Could you post one of your rendered view images?
A possible problem is that you have a "grey object on white background". We need a "white object on black background".

@weiweisunWHU
Author

Thank you for your reply!
Here is one of the rendered images: airplane_0001_004
Did you successfully train this model? I ask because I noticed that your FC8 layer is followed by a ReLU layer, which is why the model finally outputs a constant value (0).

@WeiTang114
Owner

WeiTang114 commented Jun 16, 2017

Would you try inverting the views so that they have a black background?
A white background can make the activations unstable.
You can also try my rendered views:
https://drive.google.com/open?id=0B4v2jR3WsindMUE3N2xiLVpyLW8

  • Yes, I successfully trained my model to about 90% accuracy with ReLU after FC8.
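The inversion suggested above can be sketched in a few lines. This is an illustrative, dependency-free sketch (a plain list stands in for the image array that input.py would actually load), not the repo's code:

```python
# Hypothetical sketch: inverting an 8-bit greyscale view so a white
# background (pixel value 255) becomes black (0) and the grey object
# stays informative. Real code would operate on the loaded image array.
def invert_view(pixels):
    """Invert 8-bit greyscale pixel values: 255 -> 0, 0 -> 255."""
    return [255 - p for p in pixels]

row = [255, 255, 200, 64, 255]   # white background with a grey object
print(invert_view(row))          # -> [0, 0, 55, 191, 0]
```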

@weiweisunWHU
Author

Thank you very much!
I successfully trained the model too by using mean subtraction. Still, I discarded the last ReLU layer.
Anyway, thanks a lot for your kind help.

@ghost

ghost commented Jun 23, 2017

@WeiTang114 I used the rendered views (from your link) as input and still get a constant loss of approximately 3.69, even after 25 epochs! Could you tell me what the issue might be? Also, is there a way to visualize loss/accuracy vs. epoch?

Thanks!

@WeiTang114
Owner

@Priyam1994

  • What's your learning rate? 0.001 should be good.
  • You can use TensorBoard for visualization: start it with $ tensorboard --logdir tmp/ --port 5000 and view the graphs in your browser.

@ghost

ghost commented Jun 23, 2017

@WeiTang114
Thanks for your reply! I used a learning rate of 0.001 for my training as suggested.

@WeiTang114
Owner

@Priyam1994 Then I'm not sure what the problem may be.
I would check whether the gradients exploded (in the "Distributions" tab in TensorBoard) or whether the weights are poorly initialized.

@weiweisunWHU
Author

In my experience, the learning rate for the deeper (FC) layers should be multiplied by 10. It worked for me, and I recommend it to you, @Priyam1994.

@ghost

ghost commented Jun 30, 2017

@weiweisunWHU Could you tell me how you changed the learning rate for just the FC layers?
Also, did you train the network from scratch or fine-tune the pre-trained model?
Thank you.

@weiweisunWHU
Author

@Priyam1994
For example:
opt1 = tf.train.AdamOptimizer(lr * 10).minimize(loss, var_list=listvar_update)
where var_list should be the list of FC-layer variables.
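The split-learning-rate idea can also be shown without TensorFlow. This is a conceptual sketch (not the repo's actual code): plain gradient-descent steps where one parameter group gets a 10x larger learning rate, the same way a separate optimizer is used for the FC-layer variables above. All the values are illustrative stand-ins:

```python
# Conceptual sketch: two parameter groups with different learning rates,
# mirroring "one optimizer for conv layers, one 10x optimizer for FC layers".
def sgd_step(params, grads, lr):
    """One plain gradient-descent step over a list of parameters."""
    return [p - lr * g for p, g in zip(params, grads)]

base_lr = 0.001
conv_params = [1.0, -2.0]   # stand-ins for conv-layer weights
fc_params = [0.5]           # stand-ins for FC-layer weights
conv_grads = [0.1, 0.2]
fc_grads = [0.1]

conv_params = sgd_step(conv_params, conv_grads, base_lr)       # base lr
fc_params = sgd_step(fc_params, fc_grads, base_lr * 10)        # 10x lr
print(conv_params, fc_params)
```

The FC parameters move ten times further per step, which is what helps them escape the constant-output regime faster.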

@Xmen0123

Xmen0123 commented Jul 7, 2017

@weiweisunWHU
I am having the same problem, and I couldn't solve it with this information alone.
Sorry, but could you explain in a bit more detail, for example the exact source code changes?

@WeiTang114
Owner

@weiweisunWHU @Priyam1994 @Xmen0123
Sorry! I found a typo in the README: "--learning-rate=0.001" should be "--learning_rate=0.001", so that argument didn't take effect...

Also, I found 0.0001 is more reliable for training.

I've updated the code (commit b476e17f11bd540f4f962ae157f20c17067996b2).

@youkaichao
Contributor

I also get the magic number: 3.69.
My rendered views have a white background and a size of 224x224.
airplane_0627_012

With the constant loss of 3.69, I got a poor accuracy of just 2%. Too sad... There must be something wrong...

@WeiTang114
Owner

@youkaichao
It should work with a black background; mine is in the comment above. Otherwise, you can invert the images, either offline or on the fly (invert the image in input.py right after it is read at line 27).

If simply inverting your images works, I'll consider adding an option such as "--white_background=True" 😅

@youkaichao
Contributor

@WeiTang114
It works! After adding the line below in input.py, right after line 27, the loss is no longer stuck at 3.69 and the accuracy is decent now.
im = cv2.bitwise_not(im)

But I'm puzzled. What difference does white vs. black make? If I feed the MVCNN white-background views, it should learn to identify white-background views, shouldn't it?

@WeiTang114
Owner

@youkaichao
Black is 0 and white is 255. My theory is that a zero background passed through the convolution layers (which are essentially matrix multiplications) yields zero outputs, while the greyscale object produces informative outputs after convolution. A black background therefore keeps the layer activations more stable than a white one.
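This argument can be checked with a tiny numerical example: a convolution output at one position is a dot product of an image patch with a kernel, so an all-zero (black) background contributes exactly nothing, while an all-255 (white) background adds a large constant. The kernel values below are illustrative only:

```python
# Tiny check of the black-vs-white background argument: one convolution
# output is a dot product of a pixel patch with a kernel.
def conv_at(patch, kernel):
    return sum(p * k for p, k in zip(patch, kernel))

kernel = [0.5, -0.25, 0.1, 0.2]        # illustrative learned weights

black_bg_patch = [0, 0, 0, 0]          # black background pixels
white_bg_patch = [255, 255, 255, 255]  # white background pixels

print(conv_at(black_bg_patch, kernel))  # 0.0: background contributes nothing
print(conv_at(white_bg_patch, kernel))  # 140.25: large constant offset
```

With a black background, only the object pixels drive the activations; with a white one, every background patch injects a large kernel-dependent offset.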

@youkaichao
Contributor

Now the problem is solved; I got an accuracy of 85%. Not state-of-the-art, but reasonably good. Thank you! I'll go fine-tune now ^_^

@ghost

ghost commented Jul 21, 2017

@WeiTang114 Your suggestion worked and I achieved a test accuracy of 88%. But I still get the constant 3.69 loss whenever I use the Caffe AlexNet model; any thoughts? @youkaichao Did you train from scratch or use the AlexNet model? If you used the pretrained model, did you make any changes?
Thanks!

@youkaichao
Contributor

@Priyam1994 I'm using the pretrained AlexNet model, and it works. I made no changes to it.

@ghost

ghost commented Jul 21, 2017

@youkaichao Thank you for the clarification. Did you change any parameters other than those originally suggested?

@youkaichao
Contributor

@Priyam1994 Nope, I ran MVCNN with the default settings, and I don't know how to replace the AlexNet model...
Is it possible that you forgot to run ./prepare_pretrained_alexnet.sh?

@ghost

ghost commented Sep 10, 2017

@WeiTang114 Is there a specific reason the rendered images (from the link you gave) are 600x600? Can I also feed in images of other dimensions, say 300x300 or 400x400?

Thank you

@WeiTang114
Owner

WeiTang114 commented Sep 10, 2017

@Priyam1994 That size is arbitrary. Once input into the network, the images are resized to 256 and cropped to 227, so I just wanted a size large enough that the images aren't distorted by the resize. Any other size is fine!

@ghost

ghost commented Sep 10, 2017

@WeiTang114 Thank you for your informative reply!

@rlczddl

rlczddl commented Nov 1, 2017

@weiweisunWHU
Hi, can you tell me what changes you made? I set subtract_mean to true and removed the ReLU layer after fc8, but the predictions are always the same (not only 0, but other constant values too).

@rlczddl

rlczddl commented Nov 1, 2017

@youkaichao
Hi, I want to confirm: you just added "cv2.bitwise_not(im)" after line 27 in input.py, with no other changes, and it worked? Why doesn't it work for me?

@491506870

@youkaichao @WeiTang114 Hello, after adding your line "cv2.bitwise_not(im)" after line 27, my accuracy is still only about 2%; for example, sometimes it's "acc=1.953125", sometimes "acc=2.734375", or other values. I haven't changed anything else in the code, and the dataset I'm training on is ModelNet40 with a white background like your airplane image. Do you know what I should do? Thank you so much.

@491506870

@weiweisunWHU Your method really works! I trained successfully to more than 85% accuracy.

@WChen1996

@491506870
Hi, would you please specify the method, e.g. which code you changed? Thanks a lot!

7 participants