
Error in CuDNN: CUDNN_STATUS_INTERNAL_ERROR(lua5.1,1080ti, CUDA8.0,cudnn5.1) #374

Open
zhuaichun opened this issue Jun 19, 2017 · 6 comments

Comments

zhuaichun commented Jun 19, 2017

I am encountering the error messages below. The command line is: th neural_style.lua -gpu 0 -backend cudnn.

/home/aichun/torch/install/bin/luajit: /home/aichun/torch/install/share/lua/5.1/nn/Container.lua:67:
In 2 module of nn.Sequential:
/home/aichun/torch/install/share/lua/5.1/cudnn/init.lua:145: Error in CuDNN: CUDNN_STATUS_INTERNAL_ERROR
stack traceback:
[C]: in function 'error'
/home/aichun/torch/install/share/lua/5.1/cudnn/init.lua:145: in function 'getHandle'
/home/aichun/torch/install/share/lua/5.1/cudnn/init.lua:153: in function 'call'
/home/aichun/torch/install/share/lua/5.1/cudnn/init.lua:159: in function 'errcheck'
/home/aichun/torch/install/share/lua/5.1/cudnn/init.lua:174: in function 'toDescriptor'
...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:37: in function 'resetWeightDescriptors'
...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:96: in function 'checkInputChanged'
...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:120: in function 'createIODescriptors'
...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:188: in function <...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:186>
[C]: in function 'xpcall'
/home/aichun/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
/home/aichun/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
neural_style.lua:162: in function 'main'
neural_style.lua:601: in main chunk
[C]: in function 'dofile'
...chun/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406620

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
/home/aichun/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
/home/aichun/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
neural_style.lua:162: in function 'main'
neural_style.lua:601: in main chunk
[C]: in function 'dofile'
...chun/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406620

Can anyone here help me?

zhuaichun changed the title from Error in CuDNN: CUDNN_STATUS_INTERNAL_ERROR(lua5.1,1080ti) to Error in CuDNN: CUDNN_STATUS_INTERNAL_ERROR(lua5.1,1080ti, CUDA8.0,cudnn5.1) on Jun 19, 2017

dinggd commented Oct 3, 2017

I've run into this too and solved it by simply reducing the batch size; this seems to be a GPU memory issue.
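For neural_style.lua specifically there is no batch dimension to shrink, so the equivalent memory reduction is a smaller working image. A minimal sketch, assuming your copy of the script exposes the usual -image_size option (512 by default in jcjohnson/neural-style):

th neural_style.lua -gpu 0 -backend cudnn -image_size 256

Halving the image size roughly quarters the activation memory of each convolution, which is often enough to get past CUDNN_STATUS_INTERNAL_ERROR when it is really an out-of-memory condition in disguise.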


visonpon commented Dec 7, 2017

@gddingcs @soumith I also encounter this problem, but my batch size is 1 and my GPU memory is sufficient.


visonpon commented Dec 8, 2017

@gddingcs @zhuaichun @soumith
My cudnn version is 5.1 and I have installed the latest cutorch, so I really don't know why this happens. It has confused me for some days now; I hope you guys can help me. Thanks!
(The error message is the same as above.)


caiqingnanhai commented Jan 11, 2018

@visonpon Hi,
You can try sudo rm -rf ~/.nv; it can fix this.
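For context: ~/.nv is NVIDIA's per-user cache directory (the CUDA JIT/compute cache lives under ~/.nv/ComputeCache), and a stale or corrupted cache is one known way for cuDNN handle creation to fail with CUDNN_STATUS_INTERNAL_ERROR. Deleting it is safe because it is rebuilt on the next run; sudo should only be needed if the directory is owned by root. A minimal sketch:

rm -rf ~/.nv          # usually sufficient when ~/.nv belongs to your user
sudo rm -rf ~/.nv     # only if it was created by root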


avcs2080 commented Feb 1, 2018

@caiqingnanhai

"sudo rm -rf ~/.nv" actually worked for me....


ten1er commented Jan 8, 2020

I've run into this too and solved it by changing cudnn 5.0 to cudnn 5.1. When I ran the code again, the screen suddenly went black and the machine rebooted, and after that the error disappeared. (ubuntu18, rtx2070, CUDA9.0, cudnn5.1)
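If you are not sure which cuDNN your build is actually picking up, you can read the version straight from the header; a quick sketch, assuming cuDNN was installed under /usr/local/cuda (adjust the path for your setup):

grep -A 2 'define CUDNN_MAJOR' /usr/local/cuda/include/cudnn.h

This prints CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, which together give the installed version (e.g. 5.1.x).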
