Thank for your contribution! #283

Open
limaolin2017 opened this issue Feb 5, 2024 · 13 comments

@limaolin2017

limaolin2017 commented Feb 5, 2024

I ran into the following error the first time I ran this:

Error file:'Traceback (most recent call last):
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/_backend.py", line 53, in <module>
from nerfacc import csrc as _C
ImportError: cannot import name 'csrc' from 'nerfacc' (/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1717, in _run_ninja_build
subprocess.run(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 201, in <module>
app.run(main)
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 140, in main
rendered_chunks = render_rays(nerf_models,
File "/gpfs/home/mli/banmo/nnutils/rendering.py", line 132, in render_rays
ray_indices, t_starts, t_ends = estimator.sampling(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/estimators/occ_grid.py", line 164, in sampling
intervals, samples, _ = traverse_grids(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/grid.py", line 158, in traverse_grids
t_mins, t_maxs, hits = ray_aabb_intersect(rays_o, rays_d, aabbs)
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/grid.py", line 43, in ray_aabb_intersect
t_mins, t_maxs, hits = _C.ray_aabb_intersect(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/__init__.py", line 11, in call_cuda
from ._backend import _C
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/_backend.py", line 61, in <module>
_C = load(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1124, in load
return _jit_compile(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1337, in _jit_compile
_write_ninja_file_and_build_library(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1449, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'nerfacc_cuda': [1/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS
-D__CUDA_NO_BFLOAT16_CONVERSIONS
-D__CUDA_NO_HALF2_OPERATORS
--expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/grid.cu -o grid.cuda.o
FAILED: grid.cuda.o
/home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS
-D__CUDA_NO_BFLOAT16_CONVERSIONS
-D__CUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/grid.cu -o grid.cuda.o
/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/grid.cu:4:10: fatal error: ATen/cuda/CUDAGeneratorImpl.h: No such file or directory
#include <ATen/cuda/CUDAGeneratorImpl.h>
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
[2/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS -D__CUDA_NO_BFLOAT16_CONVERSIONS -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/pdf.cu -o pdf.cuda.o
FAILED: pdf.cuda.o
/home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/pdf.cu -o pdf.cuda.o
/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/pdf.cu:4:10: fatal error: ATen/cuda/CUDAGeneratorImpl.h: No such file or directory
#include <ATen/cuda/CUDAGeneratorImpl.h>
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
[3/6] g++ -MMD -MF nerfacc.o.d -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/nerfacc.cpp -o nerfacc.o
[4/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS
-D__CUDA_NO_BFLOAT16_CONVERSIONS
-D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/camera.cu -o camera.cuda.o
[5/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="gcc" -DPYBIND11_STDLIB="libstdcpp" -DPYBIND11_BUILD_ABI="cxxabi1011" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS -D__CUDA_NO_HALF_CONVERSIONS_ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/scan.cu -o scan.cuda.o
ninja: build stopped: subcommand failed.',

Environment info:
CUDA 11.3
Torch 1.10
nerfacc 0.5.3

@liruilong940607
Collaborator

I see this error: "fatal error: ATen/cuda/CUDAGeneratorImpl.h: No such file or directory". For me, this file lives in:

/home/ruilongli/anaconda3/envs/nerfacc/lib/python3.9/site-packages/torch/include/ATen/cuda/CUDAGeneratorImpl.h

You might want to check whether torch is installed correctly in this conda env.

Alternatively, you could install our prebuilt wheels from here
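
One way to run the check suggested above is to look for the missing header in the include directories that PyTorch hands to extension builds. This is only a sketch: `find_header`, `torch_include_dirs` and `report` are hypothetical helper names, and the only PyTorch call assumed is `torch.utils.cpp_extension.include_paths()`.

```python
from pathlib import Path
from typing import Iterable, List

# Header named in the nvcc error message.
HEADER = "ATen/cuda/CUDAGeneratorImpl.h"


def find_header(include_dirs: Iterable[str], relative_path: str = HEADER) -> List[Path]:
    """Return every location under include_dirs where relative_path exists."""
    hits = []
    for directory in include_dirs:
        candidate = Path(directory) / relative_path
        if candidate.is_file():
            hits.append(candidate)
    return hits


def torch_include_dirs() -> List[str]:
    """Include directories PyTorch uses for extension builds (requires torch)."""
    from torch.utils.cpp_extension import include_paths

    return include_paths()


def report() -> None:
    """Print which torch is active and whether it ships the header."""
    import torch

    print("torch", torch.__version__, "loaded from", torch.__file__)
    hits = find_header(torch_include_dirs())
    if hits:
        for hit in hits:
            print("found:", hit)
    else:
        print("header not found in:", torch_include_dirs())
```

Calling `report()` inside the affected conda env shows which torch install is being picked up. If the header is not found there, the installed torch and the nerfacc sources being compiled probably do not match.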

@limaolin2017
Author

Hi, does the latest version of nerfacc support torch 1.10?

@liruilong940607
Collaborator

Yes

@limaolin2017
Author

limaolin2017 commented Feb 13, 2024

I switched to another conda env.

Environment info:
CUDA 11.3
Torch 1.11

I now get a different error: 'Traceback (most recent call last):
File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 201, in <module>
app.run(main)
File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/absl/app.py", line 308, in run
_run_main(main, args)
File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main
sys.exit(main(argv))
File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 140, in main
rendered_chunks = render_rays(nerf_models,
File "/gpfs/home/mli/banmo/nnutils/rendering.py", line 136, in render_rays
ray_indices, t_starts, t_ends = estimator.sampling(
File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/nerfacc/estimators/occ_grid.py", line 164, in sampling
intervals, samples, _ = traverse_grids(
File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/nerfacc/grid.py", line 165, in traverse_grids
intervals, samples, termination_planes = _C.traverse_grids(
File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/nerfacc/cuda/__init__.py", line 13, in call_cuda
return getattr(_C, name)(*args, **kwargs)
RuntimeError: CUDA error: an illegal memory access was encountered'

@liruilong940607
Collaborator

Could you dump the input of this function here and share it so I can take a look?

@limaolin2017
Author

traverse_grids inputs: {'rays_o': tensor([[-0.2437, -0.0069, 0.0255],
[-0.2437, -0.0069, 0.0255],
[-0.2437, -0.0069, 0.0255],
...,
[-0.2437, -0.0069, 0.0255],
[-0.2437, -0.0069, 0.0255],
[-0.2437, -0.0069, 0.0255]], device='cuda:0'), 'rays_d': tensor([[ 0.9839, 0.1560, -0.2869],
[ 0.9840, 0.1560, -0.2848],
[ 0.9841, 0.1559, -0.2827],
...,
[ 0.9838, 0.1237, -0.3298],
[ 0.9839, 0.1236, -0.3277],
[ 0.9840, 0.1235, -0.3256]], device='cuda:0'), 'binaries': tensor([[[[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]],

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]],

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]],

     ...,

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]],

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]],

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]]]]), 'aabbs': tensor([[0.0000, 0.0000, 0.0000, 0.3000, 0.3000, 0.3000]], device='cuda:0'), 'near_planes': tensor([0.2000, 0.2000, 0.2000,  ..., 0.2000, 0.2000, 0.2000], device='cuda:0'), 'far_planes': tensor([1., 1., 1.,  ..., 1., 1., 1.], device='cuda:0'), 'step_size': 0.001, 'cone_angle': 0.0}

@liruilong940607
Collaborator

I can't use these pasted outputs to examine the code. Could you save them into a file (say .pth or .npz) and upload it here?
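
For reference, a minimal sketch of how the arguments could be saved to a .pth file. The helper names are hypothetical; the only library call assumed is `torch.save`. Tensors are detached and copied to the CPU first so the file can be loaded on a machine with a different GPU setup.

```python
from typing import Any, Dict


def to_cpu(value: Any) -> Any:
    """Detach tensor-like values and move them to the CPU; pass others through."""
    if hasattr(value, "detach") and hasattr(value, "cpu"):
        return value.detach().cpu()
    return value


def collect_inputs(**kwargs: Any) -> Dict[str, Any]:
    """Build a plain dict of CPU copies, keyed by argument name."""
    return {name: to_cpu(value) for name, value in kwargs.items()}


def dump_inputs(path: str, **kwargs: Any) -> None:
    """Write the collected inputs to path (requires torch)."""
    import torch  # imported lazily so the helpers above work without torch

    torch.save(collect_inputs(**kwargs), path)
```

Placed just before the failing call, something like `dump_inputs("inputs.pth", rays_o=rays_o, rays_d=rays_d, binaries=binaries, aabbs=aabbs)` would write the file, and `torch.load("inputs.pth")` reads it back. Note that the CPU copies no longer record which device each tensor was on, so it helps to mention that separately.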

@limaolin2017
Author

Thank you for checking!

inputs.pth.zip

@limaolin2017
Author

I have checked the input arguments for NaN values and verified their shapes and types, and found no problems. Can you suggest how to handle this?

@limaolin2017
Author

I have also checked the GPU memory usage; it is normal.

@liruilong940607
Collaborator

Hi, I will check this issue after the ECCV deadline tomorrow!

@ZitongLan

I am having the same issue with the sampling function, which returns:
File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/nerfacc/estimators/occ_grid.py", line 164, in sampling
intervals, samples, _ = traverse_grids(
File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/nerfacc/grid.py", line 165, in traverse_grids
intervals, samples, termination_planes = _C.traverse_grids(
File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/nerfacc/cuda/__init__.py", line 13, in call_cuda
return getattr(_C, name)(*args, **kwargs)
RuntimeError: CUDA error: an illegal memory access was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Is there any update on the issue? Thanks
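
CUDA kernel errors are reported asynchronously, so the Python frame shown in a traceback like the one above may not be the call that actually failed. A small sketch for getting a more reliable traceback, assuming the script can be edited before torch touches the GPU. The helper name is hypothetical; only the `CUDA_LAUNCH_BLOCKING` environment variable comes from CUDA and PyTorch.

```python
import os
from typing import MutableMapping


def enable_blocking_launches(env: MutableMapping[str, str] = os.environ) -> None:
    """Ask CUDA to run kernels synchronously so errors surface at the failing call."""
    env["CUDA_LAUNCH_BLOCKING"] = "1"


# Call this at the very top of the script, before CUDA is initialised:
# enable_blocking_launches()
```

Setting the variable in the shell when launching the script should have the same effect. Synchronous launches are slower, so this is meant for debugging only.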

@ZitongLan

I have solved the issue. I had not moved my estimator to the GPU, which caused the illegal memory access.
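
A quick way to catch this kind of mismatch before the CUDA call is to compare the device of every tensor argument. The sketch below uses hypothetical helper names and relies only on each tensor exposing a `.device` attribute.

```python
from typing import Any, Dict, Mapping


def devices_of(inputs: Mapping[str, Any]) -> Dict[str, str]:
    """Map each tensor-like argument name to its device, as a string."""
    return {
        name: str(value.device)
        for name, value in inputs.items()
        if hasattr(value, "device")
    }


def off_device(inputs: Mapping[str, Any], expected: str = "cuda:0") -> Dict[str, str]:
    """Return the arguments whose device differs from the expected one."""
    return {
        name: device
        for name, device in devices_of(inputs).items()
        if device != expected
    }
```

In the inputs printed earlier in this thread, `binaries` is the only tensor shown without `device='cuda:0'`, which would be consistent with the occupancy grid still being on the CPU. Assuming the estimator is a `torch.nn.Module`, calling `estimator.to("cuda")` before sampling should move its buffers to the GPU.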
