Conditions / example of DepSepConvolution fusion #3237
@nvpohanh ^ ^
Thank you! We will try this pattern! It would be awesome to have this fusion example as an example in the docs.
Closing since there has been no activity for more than 3 weeks; please reopen if you still have questions. Thanks all!
The thing is, I cannot reopen it if it was a third party (you) who closed the question :) But yeah, I will add a comment when we have some feedback.
Reopening for now. Thanks.
Hey @nvpohanh, I tried the above graph in a small example, as attached below. I got the following error:
@aboubezari Could you provide the ONNX file so that we can repro and debug this issue? Thanks |
Yes, I've attached the ONNX file as a zip file with just the onnx model in it. |
Filed internal tracker 4454538. Will let you know if we have any findings. |
Awesome, thanks. |
@aboubezari unrelated to the problem you've reported, I recommend placing the first BatchNorm after the first convolution (as it appears in the diagram above). |
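As background for why BatchNorm placement matters: when a BatchNorm directly follows a convolution, its affine transform can be folded into the convolution's weights and bias, which is why optimizers prefer the conv-then-BN ordering. A minimal pure-Python sketch of the folding arithmetic (the function name and the single-channel scalar example are illustrative, not a TensorRT API):

```python
import math

def fold_batchnorm(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold a per-channel BatchNorm into the preceding conv's weight/bias.

    w, b: conv weight scale and bias for one channel (scalars here)
    gamma, beta, mean, var: BatchNorm parameters for that channel
    Returns (w', b') such that BN(conv(x)) == w' * x + b'.
    """
    scale = gamma / math.sqrt(var + eps)
    return w * scale, (b - mean) * scale + beta

# One channel: conv computes y = 2*x + 1, then BN normalizes it.
w_f, b_f = fold_batchnorm(w=2.0, b=1.0, gamma=1.5, beta=0.5,
                          mean=1.0, var=4.0, eps=0.0)
x = 3.0
conv_then_bn = 1.5 * ((2.0 * x + 1.0) - 1.0) / math.sqrt(4.0) + 0.5
fused = w_f * x + b_f
assert abs(conv_then_bn - fused) < 1e-9
```

If the BatchNorm instead precedes the convolution, this folding is not possible in the same way, which can leave an extra layer in the graph and interfere with downstream fusions.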
@nzmora-nvidia I realized that I exported the model after tweaking it a bit to figure out the issue, my bad. |
@aboubezari Thank you, we can recreate the error and do not need the new model. |
I guess it would be awesome to have such example ONNX files (or even complete PyTorch + torch-tensorrt) examples in the docs of TRT, especially when fusion is discussed (and given that fusion patterns are often fragile, especially together with quantization)! |
@vadimkantorov That's a fair request. I'll provide some pytorch examples in the next TREx release. |
This issue has been fixed in TRT 10.0.0 EA. https://docs.nvidia.com/deeplearning/tensorrt/release-notes/index.html#rel-10-0-0-EA Thanks for reporting this issue. |
@nvpohanh Please add an example somewhere in the docs. E.g., a complete example of exporting MobileNetV3 (which makes use of DepSep) https://pytorch.org/vision/stable/models/generated/torchvision.models.quantization.mobilenet_v3_large.html#mobilenet-v3-large would be great.
Thank you @nvpohanh! Look forward to trying it out. |
I will close this since this is solved, thanks all! |
@ttyio I think it's still important to provide in the docs ONNX files with examples of fusable graphs and ideally some complete examples of PyTorch code exporting these ONNX graphs |
https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#fusion-types says:
Depthwise Separable Convolution
A depthwise convolution with activation followed by a convolution with activation
may sometimes be fused into a single optimized DepSepConvolution layer. The precision of both convolutions must be INT8 and the device's compute capability must be 7.2 or later.
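For reference, the pattern this describes is a depthwise convolution (one filter per input channel) followed by a pointwise 1x1 convolution that mixes channels, each followed by an activation. A minimal 1-D pure-Python sketch of that structure (illustrative only; the fused DepSepConvolution layer computes this composition in a single pass):

```python
def relu(v):
    return max(0.0, v)

def depthwise_conv1d(x, kernels):
    """Depthwise conv: each channel convolved with its own kernel (stride 1, valid), then ReLU."""
    out = []
    for ch, k in zip(x, kernels):
        n = len(k)
        out.append([relu(sum(ch[i + j] * k[j] for j in range(n)))
                    for i in range(len(ch) - n + 1)])
    return out

def pointwise_conv1d(x, weights):
    """Pointwise (1x1) conv: mixes channels at each position, then ReLU."""
    positions = len(x[0])
    return [[relu(sum(w[c] * x[c][p] for c in range(len(x))))
             for p in range(positions)]
            for w in weights]

# 2 input channels of length 4; 3-tap depthwise kernels; 1 output channel.
x = [[1.0, 2.0, 3.0, 4.0],
     [4.0, 3.0, 2.0, 1.0]]
dw = depthwise_conv1d(x, kernels=[[1.0, 0.0, -1.0], [0.5, 0.5, 0.0]])
y = pointwise_conv1d(dw, weights=[[1.0, 1.0]])
assert y == [[3.5, 2.5]]
```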
Are there any other conditions? What types of activations are admissible?
Is there an example of fusable graphs? (This is especially important given that the convolutions must already be INT8.)
There are almost no examples or mentions of DepSepConvolution/TRT in a Google search.
I also wonder about the constraints on Q/DQ placement and qparams.
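On the Q/DQ side, TensorRT's explicit INT8 quantization uses symmetric quantization (zero point 0) for the QuantizeLinear/DequantizeLinear pairs surrounding each convolution. A pure-Python sketch of that arithmetic (scale values chosen arbitrarily for illustration):

```python
def quantize_int8(x, scale):
    """Symmetric INT8 quantize: round(x / scale), clamped to [-128, 127]."""
    q = round(x / scale)
    return max(-128, min(127, q))

def dequantize_int8(q, scale):
    """Inverse mapping back to real values (lossy due to rounding/clamping)."""
    return q * scale

scale = 0.1
x = 3.14
q = quantize_int8(x, scale)        # 31
x_hat = dequantize_int8(q, scale)  # ≈ 3.1
# Round-trip error is bounded by half the scale (when not clamped).
assert abs(x - x_hat) <= scale / 2 + 1e-9
```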
Thank you :)