BPNet fails with wrong cudnn version, but tensorflow doesn't. #8

Open
mmtrebuchet opened this issue Jan 30, 2020 · 1 comment

@mmtrebuchet

Hoo boy. Got a rough one.
I'm trying to run BPNet on chemical mapping data, and it gets as far as epoch one before crashing. It doesn't even crash cleanly: there's a segfault and control returns to the terminal, but several bpnet processes stay alive without appearing to do anything, and it takes a killall bpnet to stop them. Logs are attached; in them, TensorFlow complains about driver versions.
But it gets worse than just the wrong driver versions, because the simple TensorFlow tutorial succeeds. The attached archive includes testTensorflow.py, which executes correctly. This leads me to think the problem is not in the TensorFlow configuration but is rather an insidious bug in BPNet itself.
problemRun.zip

Good luck! Let me know if you need me to test anything.

@snystrom

I had a similar issue and fixed it by installing tensorflow-gpu==1.8 (assuming you're using a GPU). Conda seems to pull the wrong cuDNN with v1.7.
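
To confirm which TensorFlow build is actually present before and after pinning the version, something like the sketch below may help. It only reads package metadata, so it never loads the GPU libraries and cannot itself segfault. The helper name `installed_versions` is made up for illustration, and the fallback import assumes the `importlib_metadata` backport is available on interpreters older than Python 3.8. cuDNN is shipped as a conda package rather than a Python distribution, so it will not show up here; `conda list cudnn` should show that one.

```python
# Report which TensorFlow distributions are installed in the active environment.
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Assumption: the importlib_metadata backport is installed on older Pythons.
    import importlib_metadata as metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    found = installed_versions(["tensorflow", "tensorflow-gpu"])
    for name, version in found.items():
        print(f"{name}: {version or 'not installed'}")
```

If both `tensorflow` and `tensorflow-gpu` are reported, or the version is not the one you pinned, the environment is probably not the one you think you are running BPNet in.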
