Slow loading time #385

Open
Atcold opened this issue Oct 28, 2017 · 8 comments

Comments

@Atcold

Atcold commented Oct 28, 2017

Any idea why require 'cudnn' may take 45 seconds on my machine?

th> require 'cunn';
                                                                      [0.9818s]
th> require 'cudnn';
                                                                      [44.7415s]

Edit: Oh, maybe this is related.
Edit2: System info is the following.

Distributor ID: CentOS
Description:    CentOS Linux release 7.4.1708 (Core) 
Release:        7.4.1708
Codename:       Core
@Atcold
Author

Atcold commented Oct 29, 2017

Hmm, the other servers here take 10 to 15 seconds, while the one above takes 40 to 45 seconds...
How can I debug this?

@clement-masson

require 'cudnn' initializes some state on every visible GPU. If you're on a machine with many GPUs, that may be the cause of the long loading time.

We've got a machine with 4 GPUs. Setting CUDA_VISIBLE_DEVICES=0 (for instance) reduces the loading time by almost a factor of 4. On our machine it takes less than 10 seconds, though...
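
A minimal sketch of that comparison, assuming the `th` interpreter is on the PATH and accepts `-e` to run a statement. The helper name `time_cmd` is made up for illustration, and the timing has only one-second resolution:

```shell
#!/bin/sh
# time_cmd: hypothetical helper. Runs the given command with its output
# discarded, prints the elapsed wall-clock time in whole seconds, and
# returns the command's exit status.
time_cmd() {
    start=$(date +%s)
    status=0
    "$@" >/dev/null 2>&1 || status=$?
    end=$(date +%s)
    echo "$((end - start))"
    return "$status"
}

# Example usage (needs a working Torch install, so it is left commented out):
#   echo "all GPUs:   $(time_cmd th -e "require 'cudnn'") s"
#   echo "GPU 0 only: $(time_cmd env CUDA_VISIBLE_DEVICES=0 th -e "require 'cudnn'") s"
```

If the second number is much smaller than the first, the load time is dominated by per-GPU initialization rather than by something specific to one device.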

@Atcold
Author

Atcold commented Oct 30, 2017

@clement-masson, right. I just saw that. Still, I believe something must be wrong. I've contacted IT (I don't have sudo here...).

@ajhool

ajhool commented Jan 26, 2019

I'm finding that require 'cudnn' on a Volta GPU takes 10 minutes. @clement-masson, any idea how I can profile the require function to see what exactly is taking so long with the Volta architecture?

@ajhool

ajhool commented Jan 31, 2019

@nagadomi, I'm using your distro with CUDA 9/10 support. Any ideas why the bindings might be struggling with the Volta architecture?

@nagadomi
Contributor

@ajhool
If you are using Docker, it may be caused by JIT Caching.
See nagadomi/waifu2x#138:
https://github.com/nagadomi/waifu2x/pull/138/files#diff-04c6e90faac2675aa89e2176d2eec7d8
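
For context, a sketch of what a JIT-cache setup might look like. It assumes "JIT Caching" here means the CUDA driver's compute cache, which stores code compiled from PTX at load time when the installed binaries carry no native code for the GPU architecture. The linked change is not reproduced here. `CUDA_CACHE_DISABLE`, `CUDA_CACHE_PATH`, and `CUDA_CACHE_MAXSIZE` are documented CUDA environment variables; the function name, the directory, and the 1 GiB size are illustrative choices only:

```shell
#!/bin/sh
# setup_cuda_jit_cache: hypothetical helper. Points the CUDA driver's JIT
# compute cache at a directory and raises its size limit.
# $1: cache directory (defaults to the driver's usual Linux location).
setup_cuda_jit_cache() {
    cache_dir="${1:-$HOME/.nv/ComputeCache}"
    mkdir -p "$cache_dir" || return 1
    export CUDA_CACHE_DISABLE=0            # keep the cache enabled
    export CUDA_CACHE_PATH="$cache_dir"    # where compiled kernels are stored
    export CUDA_CACHE_MAXSIZE=1073741824   # size limit in bytes (1 GiB, illustrative)
}

# Example usage (the path is an assumption; in Docker it would need to be a
# mounted volume so the cache survives container restarts):
#   setup_cuda_jit_cache /cache/cuda
#   th -e "require 'cudnn'"
```

With a setup like this, a slow first load followed by fast later loads would suggest the cache is being used. If every start stays slow inside a container, the cache directory may not be persisted, or the limit may be too small for the compiled kernels.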

@ajhool

ajhool commented Jan 31, 2019

I am using Docker and I'll give that a shot, thanks!

@ajhool

ajhool commented Feb 1, 2019

So far, the JIT Caching fix does not appear to be working, although I'm having a hard time debugging Torch/Lua without a debug environment or print statements. I believe I have the cache and cache path configured correctly and the load time is still about 10 minutes.

The fact that the code executes quickly on K80s but takes so much longer on Voltas makes me suspect there's more to it than just LuaJIT. I'll keep trying to get to the bottom of this.
