
Code related #1

Open
jacksonsc007 opened this issue Nov 7, 2023 · 8 comments

Comments

@jacksonsc007

Thanks for the excellent work. I wonder when the source code will be released.
Looking forward to your reply. :)

@WANGSSSSSSS
Collaborator

NOW!

@jacksonsc007
Author

Terrific! Thanks a lot. Could you specify the version of mmdetection, by the way?

@jacksonsc007
Author

And the PyTorch version? It seems like you are using pytorch>=2.0.0.

@WANGSSSSSSS
Collaborator

System environment:

```
sys.platform: linux
Python: 3.8.16 (default, Jun 12 2023, 18:09:05) [GCC 11.2.0]
CUDA available: True
numpy_random_seed: 1123624972
GPU 0: NVIDIA GeForce RTX 3090
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.8, V11.8.89
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
PyTorch: 2.0.1+cu117
PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.7
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.5
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,
TorchVision: 0.15.2+cu117
OpenCV: 4.8.0
MMEngine: 0.8.0
```

@WANGSSSSSSS
Collaborator

> And the PyTorch version? It seems like you are using pytorch>=2.0.0.

Yes, I try to keep up with the fast-moving world.

@jacksonsc007
Author

Hi, while going through your code I ran into some questions; hope you can help me out :)

  1. What does `self.grad_accumulation` mean here? And what is the meaning of "stash gradient"?
  2. For the refinement-aware gradient formulation you proposed in equation (11), it seems that you didn't use this technique in your code implementation to speed up training and save memory, but instead used naive iteration and PyTorch autograd for the backward gradient propagation. Am I right?

@WANGSSSSSSS
Collaborator

> Hi, while going through your code I ran into some questions; hope you can help me out :)
>
> 1. What does `self.grad_accumulation` mean [here](https://github.com/MCG-NJU/DEQDet/blob/fa72a62b2340a04300424041e9ebd0087a700eba/projects/deqdet/deq_det_roi_head.py#L219C12-L219C35)? And what is the meaning of "stash gradient"?
>
> 2. For the refinement-aware gradient formulation you proposed in equation (11), it seems that you didn't use this technique in your code implementation to speed up training and save memory, but instead used naive iteration and **PyTorch autograd** for the backward gradient propagation. Am I right?

The refinement-aware gradient is, to some extent, equivalent to truncated BPTT: it cuts off the higher-order terms of the RNN-style iterations. Because each supervision then becomes independent, we can use gradient accumulation between supervisions to avoid the extra memory consumption. However, PyTorch autograd would push the gradient computed for a single supervision all the way down to every parameter, resulting in several backward passes through the backbone. So I use this hook to stash the gradient at the multi-level features; the last backward of the supervisions restores the stashed gradient and carries it to the backbone weights.
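A minimal sketch of this stash-and-release idea, written with a detached feature tensor instead of the repository's actual backward hook. The toy `backbone` and `head` modules, the loop structure, and all names here are illustrative assumptions, not DEQDet's API; the point is only that each supervision backpropagates to the features, the feature gradients accumulate in a buffer, and a single final backward carries the sum through the backbone:

```python
import torch
import torch.nn as nn

# Toy stand-ins for an expensive backbone and a cheap supervision head.
# Hypothetical modules for illustration; DEQDet's real components differ.
backbone = nn.Linear(8, 8)
head = nn.Linear(8, 1)

x = torch.randn(4, 8)
feats = backbone(x)                    # expensive forward, run once

grad_stash = torch.zeros_like(feats)   # buffer for stashed feature grads
num_supervisions = 3

for _ in range(num_supervisions):
    # Detach so this supervision's backward stops at the features and
    # never reaches the backbone on its own.
    f = feats.detach().requires_grad_(True)
    loss = head(f).mean()
    loss.backward()                    # grads flow into `head` and `f` only
    grad_stash += f.grad               # stash this supervision's feature grad

# One backward pass through the backbone with the accumulated gradient,
# instead of one backbone backward per supervision.
feats.backward(grad_stash)
print(backbone.weight.grad.shape)      # backbone now holds its full gradient
```

The repository achieves the same effect with a hook on the multi-level features; the detach-based version above just makes the stash/restore mechanics explicit.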

@WANGSSSSSSS
Collaborator

WANGSSSSSSS commented Nov 21, 2023

> Hi, while going through your code I ran into some questions; hope you can help me out :)
>
> 1. What does `self.grad_accumulation` mean [here](https://github.com/MCG-NJU/DEQDet/blob/fa72a62b2340a04300424041e9ebd0087a700eba/projects/deqdet/deq_det_roi_head.py#L219C12-L219C35)? And what is the meaning of "stash gradient"?
>
> 2. For the refinement-aware gradient formulation you proposed in equation (11), it seems that you didn't use this technique in your code implementation to speed up training and save memory, but instead used naive iteration and **PyTorch autograd** for the backward gradient propagation. Am I right?

For question 2: yes, the RAG formulation is derived from the 2-step unrolled fixed-point formulation in the paper, and the implementation in the codebase is exactly that 2-step unrolled fixed point. The equation mainly helps to analyze why two-step unrolling works better than the simple estimation method used in DEQ-Flow. You can find the pseudocode in the appendix.
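As a rough illustration of what "2-step unrolled fixed point" means in training, here is a sketch: most refinement iterations run without building an autograd graph, and only the last two steps are tracked, so the backward pass sees exactly those two steps (the truncation described above). The refiner `f`, the initial state `z0`, and the iteration counts are hypothetical placeholders, not DEQDet's interfaces:

```python
import torch

def two_step_unrolled(f, z0, x, no_grad_iters=10):
    """Truncated-BPTT-style step for a fixed-point refiner f(z, x) -> z."""
    z = z0
    with torch.no_grad():          # gradient-free refinement toward the fixed point
        for _ in range(no_grad_iters):
            z = f(z, x)
    z = z.detach()                 # cut the graph: higher-order terms dropped
    z = f(z, x)                    # first unrolled step (tracked by autograd)
    z = f(z, x)                    # second unrolled step (tracked by autograd)
    return z

# Toy usage with a contractive linear map as the "refiner".
f = lambda z, x: 0.5 * z + x
x = torch.randn(4, requires_grad=True)
z = two_step_unrolled(f, torch.zeros(4), x)
z.sum().backward()                 # gradient flows through exactly two steps
print(x.grad)                      # d(sum z)/dx = 1 + 0.5 = 1.5 per element
```

Because each supervision's gradient depends only on its own two tracked steps, the per-supervision gradient accumulation from the previous comment becomes possible.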
