Can you give me some advice on training the model? #6

Open
BayMaxBHL opened this issue Nov 10, 2022 · 13 comments

Comments

@BayMaxBHL

After setting up the environment, I trained on the NYU dataset, but the results were very strange: the loss converged slowly, RMSE kept increasing, and delta kept decreasing.
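
For anyone sanity-checking these numbers: assuming `rmse` and `delta` here are the usual depth-estimation metrics (root-mean-square error in depth units, and the fraction of valid pixels whose ratio max(pred/gt, gt/pred) falls below 1.25), a rising RMSE together with a falling delta means the predictions are moving away from the ground truth on both measures. A minimal sketch under that assumption; `depth_metrics` is a hypothetical helper, not a function from this repo, and the repo's own masking or depth caps may differ:

```python
import math


def depth_metrics(pred, gt, threshold=1.25, eps=1e-6):
    """Return (rmse, delta) over pixels whose ground-truth depth is positive.

    pred and gt are flat sequences of depth values in the same unit.
    """
    # Keep only pixels with valid ground truth; clamp predictions away from zero
    # so the ratio below is always defined.
    pairs = [(max(p, eps), g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no pixels with valid ground-truth depth")

    # Root-mean-square error: lower is better.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))

    # Threshold accuracy: share of pixels within a factor of `threshold`
    # of the ground truth. Higher is better.
    delta = sum(max(p / g, g / p) < threshold for p, g in pairs) / len(pairs)
    return rmse, delta


gt = [1.0, 2.0, 4.0]
pred = [1.0, 3.0, 4.0]  # the middle pixel is off by a factor of 1.5
rmse, delta = depth_metrics(pred, gt)
print(f"rmse={rmse:.3f} delta={delta:.3f}")
```

Running the same function on a single batch where the prediction is replaced by the ground truth should give an RMSE of 0 and a delta of 1, which is a quick way to rule out a bug in the evaluation code itself.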

@BayMaxBHL
Author

[five images, posted as separate comments]

@BayMaxBHL
Author

The only changes I made to the code were writing a script to generate the CSV and setting the batch size to 8.

@haifengwu205

After setting up the environment, I trained on the NYU dataset, but the results were very strange: the loss converged slowly, RMSE kept increasing, and delta kept decreasing.
Can you share your code? I can't run it with the current

@haifengwu205

@BayMaxBHL Hello, can you run the test code? When I run it, I get a size mismatch, as follows:
size mismatch for net.coords: copying a param with shape torch.Size([8, 480, 640, 2]) from checkpoint, the shape in current model is torch.Size([16, 480, 640, 2]).
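
The only dimension that differs in this message is the first one (8 in the checkpoint, 16 in the current model); the remaining 480, 640, 2 agree. One possible reading, which is a guess and not confirmed anywhere in this thread, is that `net.coords` is allocated per batch, so the checkpoint was saved with a batch size of 8 while the test configuration builds the model with 16. A minimal sketch for listing such mismatches before loading a checkpoint. `find_shape_mismatches` is a hypothetical helper; with PyTorch, the two dicts could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}` from the checkpoint and from `model.state_dict()`:

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return (name, checkpoint_shape, model_shape) for shared keys whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Keys missing from the model are skipped; only real shape conflicts are reported.
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches


# Shapes copied from the error message in this comment.
checkpoint_shapes = {"net.coords": (8, 480, 640, 2)}
model_shapes = {"net.coords": (16, 480, 640, 2)}

for name, ckpt_shape, model_shape in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

If the guess is right, building the model with a batch size of 8 before loading the checkpoint should make the shapes agree.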

@BayMaxBHL
Author

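@haifengwu205 It really doesn't work. That's exactly what the results I posted show: it doesn't converge. The RMSE just keeps climbing, and I'm at a loss.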

@hutingz

hutingz commented Apr 13, 2023

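@haifengwu205 It really doesn't work. That's exactly what the results I posted show: it doesn't converge. The RMSE just keeps climbing, and I'm at a loss.

May I ask which size of the NYU dataset this code uses?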

@macromogic

Hi. I'm also trying to train the model myself, and it does not converge for me either. Has anyone found a solution so far? Thanks in advance!

@zhaorui-tan

I'm in a similar situation. I used the NYU dataset, and the results were badly broken. It seems the model learned something completely wrong.
[three images]

@jianqiaowang-wjq

Why don't I get any output after running the train.sh script?
