Error building cuda packages #125

Open
cravisjan97 opened this issue Dec 21, 2020 · 8 comments

Comments

@cravisjan97

I tried to build the CUDA packages during the installation phase and got a runtime error. Here are the commands I ran:

cd DAIN/my_package/MinDepthFlowProjection/
rm -rf build *.egg-info dist
python setup.py install

After this, I get a runtime error:

ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/csundaram/anaconda3/envs/dain/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1533, in _run_ninja_build
    subprocess.run(
  File "/home/csundaram/anaconda3/envs/dain/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

File "/home/csundaram/anaconda3/envs/dain/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1555, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

How do I solve this?

Note: I see that the compilation uses -D_GLIBCXX_USE_CXX11_ABI=0, but I know that my machine requires -D_GLIBCXX_USE_CXX11_ABI=1. Changing it might solve the issue, but I don't know how to set this flag in setup.py.
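
A note on this flag: to my understanding, PyTorch's extension builder adds -D_GLIBCXX_USE_CXX11_ABI on its own, using the value the installed PyTorch binary was compiled with, so that the extension stays ABI-compatible with PyTorch. The sketch below checks which value applies on a given install. `abi_flag` and `installed_torch_abi_flag` are hypothetical helper names; `torch.compiled_with_cxx11_abi()` is PyTorch's own query function.

```python
def abi_flag(use_cxx11_abi: bool) -> str:
    """Return the preprocessor define for the given libstdc++ ABI choice."""
    return f"-D_GLIBCXX_USE_CXX11_ABI={int(use_cxx11_abi)}"


def installed_torch_abi_flag():
    """Return the flag matching the installed PyTorch, or None if PyTorch is missing."""
    try:
        import torch  # imported lazily so the helper above works without PyTorch
    except ImportError:
        return None
    return abi_flag(torch.compiled_with_cxx11_abi())


if __name__ == "__main__":
    print(installed_torch_abi_flag())
```

If this prints the `=0` variant, the `=0` seen in the build log is probably expected for that PyTorch build, and forcing `=1` in setup.py would more likely cause link errors than fix the compilation.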

Any help is appreciated. Thanks!

@theaswanson

theaswanson commented Dec 27, 2020

I'm getting the same issue. I'm running CUDA 11 on WSL 2, if it helps. I'm also getting lots of compilation errors and warnings, likely because the build.ninja files use -std=c++11 rather than at least c++14.

Edit: I just saw issue #118 and realized I was using PyTorch 1.7.1, while the latest version DAIN supports is 1.4.0. I will compile PyTorch 1.4.0 from source for CUDA 11 and see whether that fixes anything. The README might need an update to reflect this version requirement.
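
A quick way to spot this mismatch before building is to compare the installed PyTorch version with the supported one. This is only a sketch: the 1.4.0 ceiling is taken from the comment above rather than from DAIN's README, and `parse_version` and `is_newer_than_supported` are hypothetical helpers.

```python
import re

# Assumption: 1.4.0 is the newest supported PyTorch, per the comment above.
SUPPORTED = (1, 4, 0)


def parse_version(text: str) -> tuple:
    """Turn a string such as '1.7.1+cu110' into (1, 7, 1)."""
    core = text.split("+", 1)[0]  # drop a local build suffix like '+cu110'
    numbers = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:  # pad '1.9' to (1, 9, 0)
        numbers.append(0)
    return tuple(numbers)


def is_newer_than_supported(installed: str, supported: tuple = SUPPORTED) -> bool:
    """True when the installed version is newer than the supported one."""
    return parse_version(installed) > supported
```

With PyTorch installed, `is_newer_than_supported(torch.__version__)` returning True would suggest the same situation described here.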

@flobauer

Have you been able to compile PyTorch 1.4.0 with CUDA 11? I have not managed it yet.

@theflanman

I appear to have been able to get it working. Here's what I had to do, based on a few issues I read up on.

  • Ubuntu 20.04 on WSL2
  • CUDA 11.0
    • I needed to upgrade Windows via the Insider Preview program's Dev Channel to get this to work
  • Anaconda
    • Active environment called "pytorch1.0.0"
  • Downgraded PyTorch to 1.4.0
  • Running as root

@YouweiLyu

Manually setting the cxx and nvcc arguments in ./my_package/*/setup.py may help solve this compilation problem. Refer to NVIDIA's website to find the -gencode value that matches your GPU.

cxx_args = ['-std=c++14']
nvcc_args = [
    # '-gencode', 'arch=compute_50,code=sm_50',
    # '-gencode', 'arch=compute_52,code=sm_52',
    # '-gencode', 'arch=compute_60,code=sm_60',
    # '-gencode', 'arch=compute_61,code=sm_61'
    '-gencode', 'arch=compute_86,code=sm_86', # for RTX3090
    # '-gencode', 'arch=compute_70,code=compute_70'
]
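
The -gencode pair can also be derived from the GPU's compute capability instead of being looked up by hand. A minimal sketch, assuming PyTorch can see the GPU; `gencode_args` and `detect_gencode_args` are hypothetical helpers, while `torch.cuda.get_device_capability` is an existing PyTorch call.

```python
def gencode_args(major: int, minor: int) -> list:
    """Build the nvcc -gencode pair for one compute capability, e.g. (8, 6)."""
    cc = f"{major}{minor}"
    return ["-gencode", f"arch=compute_{cc},code=sm_{cc}"]


def detect_gencode_args():
    """Ask PyTorch for GPU 0's compute capability; None if it is unavailable."""
    try:
        import torch  # imported lazily so gencode_args works without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability(0)
    return gencode_args(major, minor)
```

For an RTX 3090, `gencode_args(8, 6)` gives the same pair as the uncommented line above. The installed nvcc must also know the architecture; as far as I know, sm_86 requires CUDA 11.1 or newer.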

I am using

  • Manjaro Linux KDE 20.1
  • PyTorch 1.7.1 & Anaconda env python=3.8
  • CUDA 11.1
  • RTX 3090

Also, remember to change pytorch1.0.0 to your own env name in ./build.sh.
This approach works for me. After this modification, some code also has to be adapted for PyTorch 1.7.1. I hope this helps. @flobauer @cravisjan97 @theflanman

@Khipucamayoc

@YouweiLyu, is it too much to ask to share a walk-through on how to build/install it with CUDA 11.1? I am getting so many errors it would be useless to share them here (maybe)...

Could you tell me which parts of the code need to be adapted for PyTorch 1.7.1?

I am using:

Ubuntu 20.04
PyTorch 1.9 & Anaconda env python=3.8
CUDA 11.1
RTX 3090

I changed pytorch1.0.0 to my env name in ./build.sh as you recommended, but it still does not work...
I suppose the problem is that I don't know which parts of the code need to be adapted for PyTorch 1.9? Since PyTorch is backwards compatible, I assume the problem does not lie in the release itself...

All help is appreciated!

@michaelmaverick

This should work:
compiler_args.txt (attached file)

@Khipucamayoc

Hey, thanks for that, @michaelmaverick! But that is not the problem, as I had already made the changes described in your .txt file, and it still does not work.
@YouweiLyu mentioned that "After the modification, some codes have to be adapted for pytorch 1.7.1". Is that the only other thing one is supposed to change? @michaelmaverick, do you have it running? If you have the time, a walkthrough would be greatly appreciated, as I think the issue must lie somewhere else. Thanks in advance!

@laomao0

laomao0 commented Nov 7, 2021

If you do not want to build the CUDA programs, we provide a CuPy version of those packages.
The CuPy files do not need to be built.
Please refer to:
https://github.com/laomao0/cupy_packages
