
Failure of TensorRT 8.6 on the PyTorch version of Faster-RCNN #3034

Closed · micheleantonazzi opened this issue Jun 3, 2023 · 11 comments
Labels: triaged (Issue has been triaged by maintainers)

micheleantonazzi commented Jun 3, 2023

Description

I'm trying to convert the PyTorch implementation of Faster R-CNN to TensorRT 8.6.
The procedure I followed:

  • Load Faster R-CNN from TorchHub
  • Export it to ONNX
  • Build the TensorRT engine with the trtexec tool

The procedure fails on the If node, which is generated by the MultiScaleRoIAlign class of torchvision.
The error is the following:
[E] Error[4]: /roi_heads/box_roi_pool/If_OutputLayer: IIfConditionalOutputLayer inputs must have the same shape. Shapes are [-1] and [-1,1].
[06/03/2023-17:38:40] [E] [TRT] ModelImporter.cpp:771: While parsing node number 1579 [If -> "/roi_heads/box_roi_pool/If_output_0"]:
[.... node ....]
ModelImporter.cpp:777: ERROR: ModelImporter.cpp:195 In function parseGraph:
[6] Invalid Node - /roi_heads/box_roi_pool/If
/roi_heads/box_roi_pool/If_OutputLayer: IIfConditionalOutputLayer inputs must have the same shape. Shapes are [-1] and [-1,1].

Steps to reproduce:

import torch
import torchvision

# Load the pretrained detector and export it to ONNX with a fixed input shape
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()
dummy_input = torch.randn(1, 3, 320, 320)
torch.onnx.export(model,
                  dummy_input,
                  "model_onnx.onnx",
                  export_params=True,
                  )
trtexec --onnx=model_onnx.onnx --saveEngine=resnet_engine_pytorch.trt  --explicitBatch

You can also download the ONNX model from here.

Environment

TensorRT Version: 8.6

NVIDIA GPU: RTX 3050 Mobile

NVIDIA Driver Version: 530

CUDA Version: 11.7 or 12 (both tested)

CUDNN Version: latest

Operating System: Ubuntu 20.04

Python Version (if applicable): 3.8

PyTorch Version (if applicable): 2.0.0

Relevant Files

Model link: link

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt): YES

Could you help me solve this issue? Thank you so much in advance.

zerollzeng (Collaborator) commented:

Looks like it triggers a TRT limitation:

IIfConditionalOutputLayer inputs must have the same shape. Shapes are [-1] and [-1,1].

See https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/classnvinfer1_1_1_i_if_conditional.html and https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#work-with-conditionals
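
For illustration (this sketch is not from the original thread), here is a minimal TensorRT Python example of the constraint: the two branch tensors passed to an IIfConditionalOutputLayer must have identical shapes, so a (4,) versus (4, 1) pair fails just like the [-1] versus [-1,1] pair in the error above.

import tensorrt as trt

# Sketch only: build a tiny network whose If-conditional branches disagree in shape.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))

cond = network.add_input("cond", trt.bool, ())   # 0-D boolean condition
x = network.add_input("x", trt.float32, (4,))

conditional = network.add_if_conditional()
conditional.set_condition(cond)
x_in = conditional.add_input(x).get_output(0)

# True branch: identity, output shape (4,)
true_out = network.add_identity(x_in).get_output(0)

# False branch: reshape to (4, 1), so the shapes no longer match
shuffle = network.add_shuffle(x_in)
shuffle.reshape_dims = (4, 1)
false_out = shuffle.get_output(0)

# TensorRT rejects this pairing with the same message as in the issue:
# "IIfConditionalOutputLayer inputs must have the same shape."
out_layer = conditional.add_output(true_out, false_out)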

zerollzeng (Collaborator) commented:

If the model has static shapes, have you tried constant folding? It may eliminate the problematic node.
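
A quick way to check whether folding actually removed the conditional is a sketch like the following, using the onnx Python package (the file name folded.onnx is just the hypothetical output of the folding step):

import onnx

# Count the If nodes that survive constant folding; if folding worked,
# the conditional inside MultiScaleRoIAlign should be gone.
model = onnx.load("folded.onnx")
if_nodes = [n for n in model.graph.node if n.op_type == "If"]
print(f"Remaining If nodes: {len(if_nodes)}")
for node in if_nodes:
    print(" ", node.name)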

zerollzeng (Collaborator) commented:

Also, we have an old Faster R-CNN sample (deprecated); see https://github.com/NVIDIA/TensorRT/tree/release/8.4/samples/sampleFasterRCNN

zerollzeng self-assigned this Jun 4, 2023
zerollzeng added the triaged (Issue has been triaged by maintainers) label Jun 4, 2023
micheleantonazzi (Author) commented:

Hi, thank you for your suggestion.

If the model has static shapes, have you tried constant folding? It may eliminate the problematic node.

Yes, I'm working with static shapes (the ONNX model is exported with a fixed dummy input, and trtexec is run with the --explicitBatch argument). Is that correct?
I also tried to sanitize the model with Polygraphy, running the command:

polygraphy surgeon sanitize --fold-constants model_onnx.onnx -o folded.onnx

but the conversion of the sanitized model to TensorRT failed with the following error:

Error[4]: [shapeContext.cpp::operator()::3602] Error Code 4: Shape Error (reshape wildcard -1 has infinite number of solutions or no solution. Reshaping [0,12] to [0,-1].)
[06/05/2023-10:05:11] [E] [TRT] ModelImporter.cpp:771: While parsing node number 545 [Reshape -> "/roi_heads/Reshape_output_0"]:
[06/05/2023-10:05:11] [E] [TRT] ModelImporter.cpp:772: --- Begin node ---
[06/05/2023-10:05:11] [E] [TRT] ModelImporter.cpp:773: input: "/roi_heads/box_predictor/bbox_pred/Gemm_output_0"
input: "/roi_heads/Concat_output_0"
output: "/roi_heads/Reshape_output_0"
name: "/roi_heads/Reshape"
op_type: "Reshape"
attribute {
  name: "allowzero"
  i: 0
  type: INT
}

[06/05/2023-10:05:11] [E] [TRT] ModelImporter.cpp:774: --- End node ---
[06/05/2023-10:05:11] [E] [TRT] ModelImporter.cpp:777: ERROR: ModelImporter.cpp:195 In function parseGraph:
[6] Invalid Node - /roi_heads/Reshape
[shapeContext.cpp::operator()::3602] Error Code 4: Shape Error (reshape wildcard -1 has infinite number of solutions or no solution. Reshaping [0,12] to [0,-1].)
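
To see why the wildcard is ambiguous here: the tensor being reshaped holds 0 × 12 = 0 elements, so any value substituted for -1 yields a valid 0-element shape; that is the "infinite number of solutions" in the message. NumPy rejects the same reshape for the same reason (a quick sketch, not from the original report):

import numpy as np

a = np.zeros((0, 12))   # same shape as the tensor in the TRT error
a.reshape(0, -1)        # raises ValueError: -1 cannot be inferred from 0 elements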

Do you have any other suggestions? Thank you so much again.

zerollzeng (Collaborator) commented:

I checked the ONNX model you provided: it contains a lot of redundant ops, which come from the PyTorch source code. They make constant folding hard to apply and lead to the error you're seeing. I think there are probably many other problems with this ONNX, so I would suggest using a different model, one that at least looks clean in ONNX; it will make the work much simpler.

micheleantonazzi (Author) commented:

I will try a simpler model, or try to re-implement portions of the Faster R-CNN architecture provided by PyTorch.
Thank you so much again @zerollzeng

zerollzeng (Collaborator) commented:

Good luck :-D

ttyio (Collaborator) commented Jul 5, 2023

Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions, thanks!

ttyio closed this as completed Jul 5, 2023
zhurou603 commented:

@micheleantonazzi Hi! Have you solved this problem? I encountered the same error as yours, which is also reported by the Reshape operator.

micheleantonazzi (Author) commented:

Hi @zhurou603,
No, I didn't manage to solve the problem. The PyTorch implementation of Faster R-CNN is not compatible with TensorRT, and fixing it is very complex. I suggest using ONNX Runtime; with that inference engine, the ONNX export of Faster R-CNN works well.
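
For reference, a minimal ONNX Runtime sketch of that workflow (not from the original thread; model_onnx.onnx is the file produced by the export step above):

import numpy as np
import onnxruntime as ort

# Run the exported Faster R-CNN with ONNX Runtime instead of TensorRT.
session = ort.InferenceSession("model_onnx.onnx",
                               providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
dummy_input = np.random.randn(1, 3, 320, 320).astype(np.float32)

# The torchvision export typically returns boxes, labels, and scores.
outputs = session.run(None, {input_name: dummy_input})
for meta, out in zip(session.get_outputs(), outputs):
    print(meta.name, out.shape)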

xuebuaa commented Jul 11, 2024

Facing the same problem: the Reshape op with a [-1] shape is not supported, so the model cannot be converted to a TRT engine.
