Cannot deploy model to other inference engine (e.g. ONNX, OpenVINO) #38

Open
RiverLight4 opened this issue Mar 18, 2024 · 7 comments

@RiverLight4

Hello,
I'm interested in the RefineMask method and would like to use it to cut detected objects out of pictures.
I was able to train with this repository, but unfortunately I cannot deploy the trained model to another inference engine. I suspect this is because the official implementation is based on a very old version of MMDetection (v2.3.0).

I tried tools/pytorch2onnx.py, but the conversion failed.
I also tried MMDeploy v0.14.0, but that failed too.
Finally, I ported RefineMask to MMDetection v2.28.2 (the latest 2.x release) and tried MMDeploy v0.14.0 again. I believe the port is correct, and it runs on MMDetection v2.28.2, but converting the model still fails.

All of these setups can run inference in MMDetection, but conversion with MMDeploy fails at torch.jit.trace and torch.jit.script.
My environment is Python 3.7, PyTorch 1.13.1, and CUDA 11.7.

Is there any way to convert a pretrained RefineMask .pth model to .onnx or another format?
Or, if anyone knows of one, could you point me to an implementation for another training/inference platform?

@zhanggang001
Owner

> I tried tools/pytorch2onnx.py, but the conversion failed.
> I also tried MMDeploy v0.14.0, but that failed too.

Unfortunately, we did not test the deployment process for ONNX.

I can transfer the code to the latest MMDetection if I have time later.

@RiverLight4
Author

@zhanggang001, thanks for your reply!

> Unfortunately, we did not test the deployment process for ONNX.
> I can transfer the code to the latest MMDetection if I have time later.

OK, I'll wait for your implementation. I'll also keep trying the conversion myself.
I hope RefineMask can eventually be used with other inference engines.

Additional info:

It seems that RefineMask can run on MMDetection v2.28.2 with the small fixes listed below.
However, I'm stuck on one problem: torch.jit.trace does not pass 'img_metas' through to simple_test_mask in mmdet/models/roi_heads/refine_roi_head.py.

ori_shape = img_metas[0]['ori_shape']  # ERROR: img_metas[0] has no 'ori_shape', 'scale_factor', etc., because img_metas cannot be passed into torch.jit.trace()

I'll ask about this in the open-mmlab/mmdeploy community, because I think it is not a RefineMask problem but something related to MMDeploy or PyTorch 1.13.x.
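A note on the img_metas problem above: tracing records tensor operations, so plain Python values such as the img_metas dicts are not carried along as runtime inputs. One common pattern is to bind the metadata ahead of time so that the callable handed to the exporter takes only the image. The sketch below shows that pattern in plain Python, without PyTorch. The function `simple_test` and the metadata values are hypothetical stand-ins, and whether this resolves the MMDeploy failure reported in this thread has not been verified.

```python
import functools


def simple_test(img, img_metas):
    """Toy stand-in for a forward pass that reads per-image metadata."""
    ori_shape = img_metas[0]['ori_shape']
    scale_factor = img_metas[0]['scale_factor']
    return {'input': img, 'ori_shape': ori_shape, 'scale_factor': scale_factor}


# Hypothetical metadata for a single 480x640 RGB image.
fixed_metas = [{'ori_shape': (480, 640, 3), 'scale_factor': 1.0}]

# Bind the metadata so the resulting callable takes only the image argument.
tensor_only_forward = functools.partial(simple_test, img_metas=fixed_metas)

# The caller no longer has to supply img_metas.
result = tensor_only_forward([[0.0]])
print(result['ori_shape'])
```

The trade-off is that the bound metadata becomes a constant: an export made this way would be tied to the image shape and scale factor that were fixed at export time.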

My workaround for MMDetection v2.28.2:

  1. Fix simple_test_mask in refine_roi_head.py (workaround):
    • det_bboxes -> det_bboxes[0]
    • det_labels -> det_labels[0]
    • Reason: in MMDetection v2.28.2, det_bboxes and det_labels are passed as lists so that multiple images can be processed in one call.
    • Caveat: with this fix, only the first image is processed when multiple images are given. Not a good patch :(
  2. Fix configs/refinemask/r50-refinemask-1x.py
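The first workaround above takes element [0] of each list, which drops every image after the first. A minimal sketch of an alternative is to loop over the per-image lists and call the single-image logic once per image. Both function names here are hypothetical, and the sketch assumes the single-image logic can be called independently for each image; in the real head, the feature maps and img_metas would also need to be sliced per image, which is not shown.

```python
def mask_for_single_image(det_bboxes, det_labels):
    """Hypothetical stand-in for the original single-image mask logic."""
    return {'num_dets': len(det_bboxes), 'labels': list(det_labels)}


def mask_for_batch(det_bboxes_list, det_labels_list):
    """Apply the single-image logic to every image instead of only the first."""
    if len(det_bboxes_list) != len(det_labels_list):
        raise ValueError('expected one bbox entry and one label entry per image')
    return [
        mask_for_single_image(bboxes, labels)
        for bboxes, labels in zip(det_bboxes_list, det_labels_list)
    ]


# Two images: the first has two detections, the second has one.
batch_bboxes = [[(0, 0, 10, 10, 0.9), (5, 5, 20, 20, 0.8)], [(1, 1, 4, 4, 0.7)]]
batch_labels = [[3, 7], [1]]

per_image = mask_for_batch(batch_bboxes, batch_labels)
print(len(per_image))
```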

@Myxrf

Myxrf commented Jul 12, 2024

Hello, have you managed to export the model to ONNX?

@RiverLight4
Author

Unfortunately, no.

@hnyang00

hnyang00 commented Oct 15, 2024

Hello, I followed your advice and modified the RefineMask code to work with MMDetection v2.28.2. The model trains, but evaluation fails. Have you ever run into the error below, and do you know how to solve it? Thank you!
  File "tools/train.py", line 247, in <module>
    main()
  File "tools/train.py", line 236, in main
    train_detector(
  File "/data0/mmdetection/mmdet/apis/train.py", line 246, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/miniconda3/envs/mmdet/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/miniconda3/envs/mmdet/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 58, in train
    self.call_hook('after_train_epoch')
  File "/home/miniconda3/envs/mmdet/lib/python3.8/site-packages/mmcv/runner/base_runner.py", line 317, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/miniconda3/envs/mmdet/lib/python3.8/site-packages/mmcv/runner/hooks/evaluation.py", line 271, in after_train_epoch
    self._do_evaluate(runner)
  File "/data0/mmdetection/mmdet/core/evaluation/eval_hooks.py", line 60, in _do_evaluate
    results = single_gpu_test(runner.model, self.dataloader, show=False)
  File "/data0/mmdetection/mmdet/apis/test.py", line 65, in single_gpu_test
    result = [(bbox_results, encode_mask_results(mask_results))
  File "/data0/mmdetection/mmdet/apis/test.py", line 65, in <listcomp>
    result = [(bbox_results, encode_mask_results(mask_results))
  File "/data0/mmdetection/mmdet/core/mask/utils.py", line 60, in encode_mask_results
    cls_segm[:, :, np.newaxis], order='F',
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

@RiverLight4
Author

Hi @hnyang00,
My workaround for v2.28.2 is for inference only. I use a model trained with the original RefineMask code.

@hnyang00

> Hi @hnyang00, my workaround for v2.28.2 is for inference only. I use a model trained with the original RefineMask code.

Thank you for your reply. I have solved the problem above; it was caused by the format of "results" differing from what the previous version of mmdet returned. However, after my change, all values in the val stage are 0, and I am still working on this new problem.
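The IndexError in the traceback can be reproduced in isolation, which may help when checking the format of "results". Judging only from the indexing expression `cls_segm[:, :, np.newaxis]` shown in the traceback, each mask is expected to be a 2-D (H, W) array; a 1-D array fails with the same message. The helper `check_mask_results` below is hypothetical and not part of MMDetection, and the nested per-class list layout it walks is an assumption. The thread does not say which format the ported code actually returned.

```python
import numpy as np


def check_mask_results(mask_results):
    """Return (class_index, mask_index, shape) for every mask that is not 2-D.

    Hypothetical helper; assumes mask_results is a list with one inner list
    of masks per class.
    """
    problems = []
    for cls_idx, cls_segms in enumerate(mask_results):
        for mask_idx, segm in enumerate(cls_segms):
            arr = np.asarray(segm)
            if arr.ndim != 2:
                problems.append((cls_idx, mask_idx, arr.shape))
    return problems


# One class with a proper (H, W) mask, and one class with no detections.
good = [[np.zeros((4, 6), dtype=np.uint8)], []]
# Same layout, but the mask is 1-D, as in the reported error.
bad = [[np.zeros(6, dtype=np.uint8)], []]

# The same indexing as in the traceback fails on the 1-D mask.
try:
    bad[0][0][:, :, np.newaxis]
    raised = False
except IndexError:
    raised = True

print(check_mask_results(good), check_mask_results(bad), raised)
```

Running a check like this on the output of the test step, before it reaches the encoding function, would show which class and mask has the unexpected shape.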
