
Deployment to TensorRT #98

Open
CMSC740Student opened this issue Sep 29, 2024 · 9 comments

Comments

@CMSC740Student

Hi All,

Thank you for the amazing work! Can this model be exported to TensorRT for inference?

Thanks

@shubhendu-ranadive

shubhendu-ranadive commented Sep 30, 2024

The complete model may be difficult to deploy to TensorRT due to the Deformable Aggregation Function, but I think parts of the model, like the ResNet-50 backbone or the FPN neck, can be deployed on TensorRT.

You can try to use torch2trt from NVIDIA for that purpose.

@CMSC740Student
Author

Thank you for your response!
What about converting it to ONNX? What is the recommended route to do this: is it via `torch.onnx.export`, or should I use mmdeploy?

@shubhendu-ranadive

shubhendu-ranadive commented Oct 1, 2024

I haven't tried converting the model to ONNX yet. So far I have only succeeded in converting the backbone, neck, and encoders to TensorRT using torch2trt without a significant change in accuracy, so I'm not certain of the best approach for ONNX. I think MMDeploy would be a good option, since you can define custom plugins to convert the Deformable Aggregation Function (I haven't tried it, so I'm not sure). Maybe you can try that and let me know how it goes?
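For reference, the partial conversion described above might look like the sketch below. The tiny `nn.Sequential` is a stand-in for the real ResNet-50 backbone (purely illustrative), and the `torch2trt` call is guarded because it needs a CUDA device with TensorRT installed:

```python
import torch
import torch.nn as nn

# Tiny stand-in for the ResNet-50 backbone mentioned above; the real module
# comes from the Sparse4D config (this one is purely illustrative).
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1),
).eval()

x = torch.randn(1, 3, 64, 64)
with torch.no_grad():
    ref = backbone(x)  # PyTorch reference output to compare accuracy against

# torch2trt needs a GPU with TensorRT installed, so the conversion call is
# guarded; on a CPU-only machine we just keep the PyTorch module.
try:
    from torch2trt import torch2trt
    trt_backbone = torch2trt(backbone.cuda(), [x.cuda()])
except ImportError:
    trt_backbone = None

print(tuple(ref.shape))  # (1, 32, 16, 16)
```

Converting each TensorRT-friendly submodule separately like this, and keeping the Deformable Aggregation Function in PyTorch, matches the hybrid deployment described above.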

@CMSC740Student
Author

CMSC740Student commented Oct 2, 2024

@shubhendu-ranadive Thanks for the suggestions!

I am able to convert the model to ONNX, but the model outputs the same results for different inputs, so something is wrong.

Here are the steps I performed:

1. `pip3 install onnx`

2. Update the `forward` definition in `sparse4d.py`, `sparse4d_head.py`, `instance_bank.py`, `blocks.py` & `detection3d_blocks.py`:

```python
# Change this:
def forward(self, img, **data):

# To this:
def forward(self, img, timestamp=None, projection_mat=None, image_wh=None):
```

3. Use `torch.onnx.export`:

```python
with torch.no_grad():
    model.eval()
    torch.onnx.export(
        model,
        args,
        output_path,
        export_params=True,
        input_names=input_names,
        output_names=output_names,
        opset_version=opset_version,
        dynamic_axes=dynamic_axes,
        keep_initializers_as_inputs=keep_initializers_as_inputs,
        verbose=verbose)
```

The issue right now is that the model outputs the same results for different inputs. Debugging in progress...

@shubhendu-ranadive

shubhendu-ranadive commented Oct 3, 2024

@CMSC740Student That's great! 👍
Can I ask what value is set for `keep_initializers_as_inputs`? If you haven't already tried, maybe setting it to `False` would change the results?

@CMSC740Student
Author

@shubhendu-ranadive Did some debugging...I think the issue may be with InstanceBank.

InstanceBank caches intermediate results from previous inputs & passes them along with the next set of inputs.

As a result, the InstanceBank class contains logic that is only triggered when it processes sequential batches of inputs, as opposed to a single batch.

When I try to export my model with dummy inputs with a single batch, the outputs are incorrect (likely because the InstanceBank logic is not traced/exported correctly via torch.onnx.export)

Do you know if it's possible to pass sequential batches of inputs so that the model is traced correctly with all inputs?
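A framework-free sketch of the failure mode (the class and method names are made up, not the repo's actual InstanceBank API): the cache-dependent branch is never taken during a single-frame trace, so an exported graph would miss the temporal-fusion path entirely.

```python
# Illustrative toy version of the caching pattern described above.
class ToyInstanceBank:
    def __init__(self):
        self.cached = None  # instances carried over from the previous frame

    def get(self, current):
        if self.cached is None:
            fused = current  # first frame: nothing to fuse with
        else:
            # later frames: fuse cached instances with the new ones
            fused = [0.5 * (a + b) for a, b in zip(self.cached, current)]
        self.cached = fused
        return fused

bank = ToyInstanceBank()
first = bank.get([1.0, 2.0])   # takes the "no cache" branch
second = bank.get([3.0, 4.0])  # takes the "fuse with cache" branch

# A single-frame trace only ever records the first branch, so the exported
# graph never exercises the temporal-fusion path.
print(first, second)  # [1.0, 2.0] [2.0, 3.0]
```

Tracing a model like this with one dummy frame bakes in the `cached is None` branch, which would explain outputs that ignore temporal state.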

@shubhendu-ranadive

shubhendu-ranadive commented Oct 3, 2024

@CMSC740Student Thanks for your reply. That indeed looks like a problem when creating a graph for ONNX.
I don't think torch.onnx.export allows sequential inputs.

The only thing I found after searching is that `torch.jit.script` might help: unlike tracing, it captures the model's dynamic control flow, and you could then try converting the scripted module to ONNX.

Edit: more on using torch.jit.script here
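For what it's worth, a minimal illustration (toy module, not Sparse4D) of how `torch.jit.script` preserves a data-dependent branch that tracing would bake in:

```python
import torch
import torch.nn as nn

# Toy module with a branch on a non-tensor input; scripting keeps the branch
# as real control flow in the graph, whereas tracing would freeze one path.
class Gate(nn.Module):
    def forward(self, x: torch.Tensor, use_cache: bool) -> torch.Tensor:
        if use_cache:  # preserved as a genuine branch in the scripted graph
            return x + 1.0
        return x - 1.0

scripted = torch.jit.script(Gate())
a = scripted(torch.zeros(2), True)
b = scripted(torch.zeros(2), False)
print(a.tolist(), b.tolist())  # [1.0, 1.0] [-1.0, -1.0]
```

Whether the ONNX exporter can then handle the scripted InstanceBank logic is a separate question; this only shows that scripting keeps both branches alive.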

@shubhendu-ranadive

@CMSC740Student Did you get it working?

@PonyAIjkz

PonyAIjkz commented Nov 4, 2024

@CMSC740Student @shubhendu-ranadive This repository may be useful for your deployment: https://github.com/ThomasVonWu/SparseEnd2End
