Segmentation fault, core dumped #85

Open
ggggxm opened this issue Oct 22, 2024 · 1 comment
ggggxm commented Oct 22, 2024

My environment:

  • python 3.9
  • pytorch 2.4.0+cu12.1
  • cudatoolkit 12.1
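
For anyone comparing setups, a small script along these lines could collect the same details in a form that is easy to paste into a reply. This is only a sketch: the `collect_env` function name and the dictionary keys are made up for illustration, and it falls back gracefully if PyTorch is not importable. PyTorch also ships its own report via `python -m torch.utils.collect_env`, which is more thorough.

```python
import platform
import sys


def collect_env():
    """Return a dict of version details relevant to a CUDA crash report.

    Hypothetical helper for illustration; the keys are not a standard format.
    """
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter.
        info["torch"] = None
        return info

    info["torch"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds).
    info["torch_cuda_build"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["gpu"] = torch.cuda.get_device_name(0)
        info["cudnn"] = torch.backends.cudnn.version()
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

The driver version reported by `nvidia-smi` is worth including as well, since the crash is inside `libcuda.so.1`, which belongs to the driver rather than to the conda environment.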

I have tried:

  • decreasing the batch size to 1
  • testing with only 1 picture
  • changing the Python version to 3.10 and CUDA to 12.4
  • setting n_worker to 0

But the issue persists. The gdb output looks like this:
#0 0x00007fff4402c640 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#1 0x00007fff43eedfe8 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007fff440544da in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007fff43f2cf56 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007fff43f2d667 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#5 0x00007fff43f30431 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#6 0x00007fff44132370 in ?? () from /lib/x86_64-linux-gnu/libcuda.so.1
#7 0x00007ffff5c1498c in ?? () from /home/baifeng/miniconda/envs/comfy/lib/python3.9/site-packages/torch/lib/../../nvidia/cuda_runtime/lib/libcudart.so.12
#8 0x00007ffff5c6bf5e in cudaLaunchKernel ()......

The crash occurs at a random step (from 20 to 88......).
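
The backtrace above only shows unsymbolized native frames inside `libcuda.so.1`, so it does not reveal which Python line launched the failing kernel. One way to narrow that down, sketched below, is Python's standard `faulthandler` module, which dumps the Python stack of every thread when the process receives SIGSEGV. The helper name `enable_crash_diagnostics` and the log file name are hypothetical. Setting `CUDA_LAUNCH_BLOCKING=1` makes kernel launches synchronous, which may help attribute the crash to the right call; it has to be set before CUDA is initialized, so the helper would need to run at the very top of the training script. This is a diagnostic aid, not a fix.

```python
import faulthandler
import os


def enable_crash_diagnostics(log_path="segfault_traceback.log"):
    """Dump Python stacks of all threads to log_path on a fatal signal.

    Hypothetical helper for illustration. Call it before the first CUDA
    operation so that CUDA_LAUNCH_BLOCKING is picked up.
    """
    # Keep any value the user already exported; otherwise force
    # synchronous kernel launches.
    os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")

    # The file must stay open for as long as faulthandler is enabled,
    # so the handle is returned to the caller instead of being closed here.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

After the next crash, the log file should contain a Python traceback for the crashing thread (`pt_main_thread` workers included), which can be posted alongside the gdb output.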

ggggxm commented Oct 22, 2024

Epoch 101/1000 - steps: 10%|█████▌ | 100/1000 [10:17<1:32:37, 6.18s/it, avr_loss=0.268]
Thread 74 "pt_main_thread" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffef1800640 (LWP 323521)]

I think the problem is due to my CUDA version or environment. Could someone who has trained successfully share their environment?
