This repository has been archived by the owner on Oct 7, 2024. It is now read-only.
First of all, thanks for sharing this great project! I have tried to implement your mobilePydnet network but cannot fully reproduce the results of the pre-trained model. For that reason I have several questions about the model, loss, data, and training itself.
Did you initialize weights and biases with a particular initialization strategy, or did you just use the default initialization of the convolution layers?
Did you use any data augmentation, such as flipping, rotating, random cropping, or blurring?
You mentioned here in the issues section that the range of your input and output images is [0, 255]. Does that mean that during training, when you load the input image and ground truth as float32, you do not normalize them (for example by dividing by 255 to get the range [0, 1])?
The loss is described in the paper as a weighted sum over scales, `L = sum_s lambda_s * L_s`, where `lambda` is fixed to 1 at full resolution and goes through 0.5, 0.25, 0.125 at the lower scales (if I understood correctly you just used 3 different scales). Here is the Python code for calculating the loss, but I'm not sure if I am missing something:
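For reference, a minimal NumPy sketch of a weighted multi-scale L1 loss of this shape (the function name, the per-scale L1 term, and the assumption that each prediction is already resized to the ground-truth resolution are my own, not the repository's actual code):

```python
import numpy as np

def multi_scale_loss(preds, gt, weights=(1.0, 0.5, 0.25)):
    """Weighted sum of per-scale L1 losses.

    preds:   list of depth predictions, one per pyramid scale, here
             assumed already upsampled to the ground-truth resolution.
    gt:      ground-truth depth map.
    weights: per-scale coefficients (1 at full resolution, smaller below).
    """
    return sum(w * np.mean(np.abs(p - gt)) for w, p in zip(weights, preds))
```

An alternative design choice would be to downsample the ground truth to each scale instead of upsampling the predictions; the weighting scheme stays the same either way.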
I found the answer to question 1 in the provided code. The convolutional kernels are initialized with Xavier initialization and the biases with truncated normal initialization (mean 0.0 and std 1.0).
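To make sure I understood those two initializers correctly, here is a NumPy sketch of what they compute (the project presumably uses TensorFlow's built-in initializers; the function names and conv-kernel shape convention `(kh, kw, c_in, c_out)` are my assumptions):

```python
import numpy as np

def xavier_uniform(shape, rng=None):
    """Glorot/Xavier uniform init for a conv kernel (kh, kw, c_in, c_out):
    sample U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)),
    where the fan counts include the receptive-field size."""
    rng = rng or np.random.default_rng(0)
    kh, kw, c_in, c_out = shape
    fan_in, fan_out = kh * kw * c_in, kh * kw * c_out
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=shape)

def truncated_normal(shape, mean=0.0, std=1.0, rng=None):
    """Truncated normal init: resample any draw falling outside
    [mean - 2*std, mean + 2*std], as TensorFlow's truncated normal does."""
    rng = rng or np.random.default_rng(0)
    x = rng.normal(mean, std, size=shape)
    mask = np.abs(x - mean) > 2.0 * std
    while mask.any():
        x[mask] = rng.normal(mean, std, size=int(mask.sum()))
        mask = np.abs(x - mean) > 2.0 * std
    return x
```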
However, I have a new question about the network architecture.
In the original Pydnet, get_disp extracts the depth map by means of a sigmoid operator, but in your network the sigmoids are replaced by convolutions that output a single channel. Does it really work like this?
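If I've read the two architectures correctly, the heads differ only in how the single disparity channel is produced; a minimal NumPy sketch of the distinction (the function names, the 1x1 kernel `w`, and treating the conv head as unbounded are my assumptions):

```python
import numpy as np

def sigmoid_disp_head(features):
    """Original Pydnet-style head (as I understand it): take one channel
    and squash it to (0, 1) with a sigmoid."""
    return 1.0 / (1.0 + np.exp(-features[..., :1]))

def conv_disp_head(features, w, b):
    """mobilePydnet-style head (hypothetical): a 1x1 convolution down to a
    single channel with no activation, so the output range is unbounded.
    A 1x1 conv over NHWC features is just a matmul on the channel axis."""
    return features @ w + b
```

The practical difference would be that the sigmoid head bounds the output to (0, 1), while the plain conv head leaves the network free to predict any range (e.g. [0, 255] directly).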