Segmentation fault (core dumped) when trying to train or overtrain a model #156

Open
Adodiego opened this issue May 22, 2023 · 1 comment

Comments

@Adodiego

Hello, I'm trying to use train.py to train a RIFE model on my own dataset. Everything goes well until the last epoch, where it fails with "Segmentation fault (core dumped)". I'm using --nproc_per_node=1 and --world_size=1, so maybe that's the issue? No matter how many epochs I use, the error always occurs at the last epoch. Also, when I launch the code like this: sudo -E /usr/bin/python3 -m torch.distributed.launch --nproc_per_node=1 train.py --epoch=1 --world_size=1 the error becomes simply "Segmentation fault", without the "(core dumped)" part. Any idea why this is happening?
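One way to narrow down a crash like this is to ask Python to print a traceback when the process receives a fatal signal. The following is only a minimal sketch using the standard-library faulthandler module. The helper name enable_crash_traceback is hypothetical and is not part of train.py or the RIFE repository; it would need to be called near the top of the training script, before training starts.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=None):
    """Hypothetical helper (not part of the RIFE repo).

    Turns on faulthandler so that a fatal signal such as SIGSEGV
    dumps the Python stack of every thread to `stream`
    (sys.stderr when no stream is given).
    """
    if stream is None:
        stream = sys.stderr
    if not faulthandler.is_enabled():
        faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()


# Assumed usage: call once at the start of the training script,
# before the training loop runs.
# enable_crash_traceback()
```

With this enabled, the traceback printed at the moment of the segmentation fault should show which Python line was executing at the end of the last epoch, which may help tell apart a crash during teardown from one inside the training loop itself.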

@rriicckkee

Have you solved this problem?
