Engine build failure of TensorRT 10.0.0.6EA when running trtexec on GPU NVIDIA GeForce RTX 3060 #3760

Closed
roxanacincan opened this issue Apr 1, 2024 · 4 comments

Comments

@roxanacincan

Description

I tried to build the .plan file for my ONNX model with a simple trtexec command: trtexec --onnx=path/to/onnx/model --saveEngine=path/to/engine/model
Although the console shows no errors, the build process is interrupted and the TensorRT engine is not generated.

Environment

TensorRT Version: 10.0.0.6EA

NVIDIA GPU: NVIDIA GeForce RTX 3060

NVIDIA Driver Version: 551.86

CUDA Version: 11.8

CUDNN Version: 8.9.7

ONNX version: 1.15.0

onnxruntime version: 1.17.1

OPset version: 17

Operating System: Win10

Python Version (if applicable): 3.10

PyTorch Version (if applicable): 2.2.0

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):

I was able to generate the TRT model using Polygraphy on Win10.

Below is the console output when I try to generate the TRT model:

&&&& RUNNING TensorRT.trtexec [TensorRT v100000] # trtexec --onnx=./trt_v10_model/pu_3_inspyre.onnx --saveEngine=./trt_v10_model/pu_3_inspyre.plan --verbose
[04/01/2024-17:14:52] [I] === Model Options ===
[04/01/2024-17:14:52] [I] Format: ONNX
[04/01/2024-17:14:52] [I] Model: ./trt_v10_model/pu_3_inspyre.onnx
[04/01/2024-17:14:52] [I] Output:
[04/01/2024-17:14:52] [I] === Build Options ===
[04/01/2024-17:14:52] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default, tacticSharedMem: default
[04/01/2024-17:14:52] [I] avgTiming: 8
[04/01/2024-17:14:52] [I] Precision: FP32
[04/01/2024-17:14:52] [I] LayerPrecisions:
[04/01/2024-17:14:52] [I] Layer Device Types:
[04/01/2024-17:14:52] [I] Calibration:
[04/01/2024-17:14:52] [I] Refit: Disabled
[04/01/2024-17:14:52] [I] Strip weights: Disabled
[04/01/2024-17:14:52] [I] Version Compatible: Disabled
[04/01/2024-17:14:52] [I] ONNX Plugin InstanceNorm: Disabled
[04/01/2024-17:14:52] [I] TensorRT runtime: full
[04/01/2024-17:14:52] [I] Lean DLL Path:
[04/01/2024-17:14:52] [I] Tempfile Controls: { in_memory: allow, temporary: allow }
[04/01/2024-17:14:52] [I] Exclude Lean Runtime: Disabled
[04/01/2024-17:14:52] [I] Sparsity: Disabled
[04/01/2024-17:14:52] [I] Safe mode: Disabled
[04/01/2024-17:14:52] [I] Build DLA standalone loadable: Disabled
[04/01/2024-17:14:52] [I] Allow GPU fallback for DLA: Disabled
[04/01/2024-17:14:52] [I] DirectIO mode: Disabled
[04/01/2024-17:14:52] [I] Restricted mode: Disabled
[04/01/2024-17:14:52] [I] Skip inference: Disabled
[04/01/2024-17:14:52] [I] Save engine: ./trt_v10_model/pu_3_inspyre.plan
[04/01/2024-17:14:52] [I] Load engine:
[04/01/2024-17:14:52] [I] Profiling verbosity: 0
[04/01/2024-17:14:52] [I] Tactic sources: Using default tactic sources
[04/01/2024-17:14:52] [I] timingCacheMode: local
[04/01/2024-17:14:52] [I] timingCacheFile:
[04/01/2024-17:14:52] [I] Enable Compilation Cache: Enabled
[04/01/2024-17:14:52] [I] errorOnTimingCacheMiss: Disabled
[04/01/2024-17:14:52] [I] Preview Features: Use default preview flags.
[04/01/2024-17:14:52] [I] MaxAuxStreams: -1
[04/01/2024-17:14:52] [I] BuilderOptimizationLevel: -1
[04/01/2024-17:14:52] [I] Calibration Profile Index: 0
[04/01/2024-17:14:52] [I] Weight Streaming: Disabled
[04/01/2024-17:14:52] [I] Debug Tensors:
[04/01/2024-17:14:52] [I] Input(s)s format: fp32:CHW
[04/01/2024-17:14:52] [I] Output(s)s format: fp32:CHW
[04/01/2024-17:14:52] [I] Input build shapes: model
[04/01/2024-17:14:52] [I] Input calibration shapes: model
[04/01/2024-17:14:52] [I] === System Options ===
[04/01/2024-17:14:52] [I] Device: 0
[04/01/2024-17:14:52] [I] DLACore:
[04/01/2024-17:14:52] [I] Plugins:
[04/01/2024-17:14:52] [I] setPluginsToSerialize:
[04/01/2024-17:14:52] [I] dynamicPlugins:
[04/01/2024-17:14:52] [I] ignoreParsedPluginLibs: 0
[04/01/2024-17:14:52] [I]
[04/01/2024-17:14:52] [I] === Inference Options ===
[04/01/2024-17:14:52] [I] Batch: Explicit
[04/01/2024-17:14:52] [I] Input inference shapes: model
[04/01/2024-17:14:52] [I] Iterations: 10
[04/01/2024-17:14:52] [I] Duration: 3s (+ 200ms warm up)
[04/01/2024-17:14:52] [I] Sleep time: 0ms
[04/01/2024-17:14:52] [I] Idle time: 0ms
[04/01/2024-17:14:52] [I] Inference Streams: 1
[04/01/2024-17:14:52] [I] ExposeDMA: Disabled
[04/01/2024-17:14:52] [I] Data transfers: Enabled
[04/01/2024-17:14:52] [I] Spin-wait: Disabled
[04/01/2024-17:14:52] [I] Multithreading: Disabled
[04/01/2024-17:14:52] [I] CUDA Graph: Disabled
[04/01/2024-17:14:52] [I] Separate profiling: Disabled
[04/01/2024-17:14:52] [I] Time Deserialize: Disabled
[04/01/2024-17:14:52] [I] Time Refit: Disabled
[04/01/2024-17:14:52] [I] NVTX verbosity: 0
[04/01/2024-17:14:52] [I] Persistent Cache Ratio: 0
[04/01/2024-17:14:52] [I] Optimization Profile Index: 0
[04/01/2024-17:14:52] [I] Weight Streaming Budget: -1 bytes
[04/01/2024-17:14:52] [I] Inputs:
[04/01/2024-17:14:52] [I] Debug Tensor Save Destinations:
[04/01/2024-17:14:52] [I] === Reporting Options ===
[04/01/2024-17:14:52] [I] Verbose: Enabled
[04/01/2024-17:14:52] [I] Averages: 10 inferences
[04/01/2024-17:14:52] [I] Percentiles: 90,95,99
[04/01/2024-17:14:52] [I] Dump refittable layers:Disabled
[04/01/2024-17:14:52] [I] Dump output: Disabled
[04/01/2024-17:14:52] [I] Profile: Disabled
[04/01/2024-17:14:52] [I] Export timing to JSON file:
[04/01/2024-17:14:52] [I] Export output to JSON file:
[04/01/2024-17:14:52] [I] Export profile to JSON file:
[04/01/2024-17:14:52] [I]
[04/01/2024-17:14:52] [I] === Device Information ===
[04/01/2024-17:14:52] [I] Available Devices:
[04/01/2024-17:14:52] [I]   Device 0: "NVIDIA GeForce RTX 3060" UUID: GPU-1571ea6f-72e0-2c07-bc4a-2334bc79a3eb
[04/01/2024-17:14:52] [I] Selected Device: NVIDIA GeForce RTX 3060
[04/01/2024-17:14:52] [I] Selected Device ID: 0
[04/01/2024-17:14:52] [I] Selected Device UUID: GPU-1571ea6f-72e0-2c07-bc4a-2334bc79a3eb
[04/01/2024-17:14:52] [I] Compute Capability: 8.6
[04/01/2024-17:14:52] [I] SMs: 28
[04/01/2024-17:14:52] [I] Device Global Memory: 12287 MiB
[04/01/2024-17:14:52] [I] Shared Memory per SM: 100 KiB
[04/01/2024-17:14:52] [I] Memory Bus Width: 192 bits (ECC disabled)
[04/01/2024-17:14:52] [I] Application Compute Clock Rate: 1.807 GHz
[04/01/2024-17:14:52] [I] Application Memory Clock Rate: 7.501 GHz
[04/01/2024-17:14:52] [I]
[04/01/2024-17:14:52] [I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[04/01/2024-17:14:52] [I]
[04/01/2024-17:14:52] [I] TensorRT version: 10.0.0
[04/01/2024-17:14:52] [I] Loading standard plugins
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::BatchTilePlugin_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::Clip_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::CoordConvAC version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::CropAndResizeDynamic version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::CropAndResize version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::DecodeBbox3DPlugin version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::EfficientNMS_Explicit_TF_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::EfficientNMS_Implicit_TF_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::EfficientNMS_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::FlattenConcat_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::GenerateDetection_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::GridAnchorRect_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::InstanceNormalization_TRT version 2
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::LReLU_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::ModulatedDeformConv2d version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::MultilevelCropAndResize_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::MultilevelProposeROI_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::MultiscaleDeformableAttnPlugin_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::NMSDynamic_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::NMS_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::Normalize_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::PillarScatterPlugin version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::PriorBox_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::ProposalDynamic version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::Proposal version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::Region_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::Reorg_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::ROIAlign_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::RPROI_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::ScatterND version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::Split version 1
[04/01/2024-17:14:52] [V] [TRT] Registered plugin creator - ::VoxelGeneratorPlugin version 1
[04/01/2024-17:14:52] [I] [TRT] [MemUsageChange] Init CUDA: CPU +84, GPU +0, now: CPU 6585, GPU 1045 (MiB)
[04/01/2024-17:14:52] [V] [TRT] Trying to load shared library nvinfer_builder_resource.dll
[04/01/2024-17:14:52] [V] [TRT] Loaded shared library nvinfer_builder_resource.dll
[04/01/2024-17:14:58] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +1464, GPU +266, now: CPU 9179, GPU 1311 (MiB)
[04/01/2024-17:14:59] [V] [TRT] CUDA lazy loading is enabled.

As you can see, the log ends without any error that explains why the TRT model is not generated.
Has anyone encountered this situation before who can help me solve it?

Thank you in advance for your answer! :)

@lix19937

lix19937 commented Apr 2, 2024

Run the following to get the full log:
trtexec --onnx=./trt_v10_model/pu_3_inspyre.onnx --saveEngine=./trt_v10_model/pu_3_inspyre.plan --verbose 2>&1 | tee build.log
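
When a tool stops without printing an error, its exit code is often the only remaining clue. The following is a minimal sketch, not something posted in this thread: it runs a command, merges stderr into stdout, echoes the output and writes it to a log file, then returns the exit code. The helper name run_and_log is hypothetical, and it assumes trtexec is on PATH when used as in the trailing comment.

```python
import subprocess
import sys


def run_and_log(cmd, log_path):
    """Run cmd, copy its combined stdout/stderr to log_path, return the exit code."""
    with open(log_path, "w", encoding="utf-8", errors="replace") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout, like 2>&1
            text=True,
            errors="replace",
        )
        for line in proc.stdout:
            sys.stdout.write(line)  # echo to the console, like tee
            log.write(line)
        return proc.wait()


# Example use (assumes trtexec is on PATH; paths are the ones from this issue):
# code = run_and_log(
#     ["trtexec",
#      "--onnx=./trt_v10_model/pu_3_inspyre.onnx",
#      "--saveEngine=./trt_v10_model/pu_3_inspyre.plan",
#      "--verbose"],
#     "build.log",
# )
# print(f"trtexec exit code: {code}")
```

A nonzero exit code with no error message in the log would suggest that the process terminated abnormally rather than finishing the build.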

@roxanacincan
Author

roxanacincan commented Apr 2, 2024

Hello,

I used the command you suggested, and it did create a log file.
The problem is that the file contains exactly the same information as the console, nothing more.

Below I attached my build log file:
build.log

@lix19937

lix19937 commented Apr 2, 2024

Try the official NVIDIA ResNet50.onnx instead of pu_3_inspyre.onnx.

If that also fails, it may be a version/driver compatibility issue.
You can also use https://github.com/NVIDIA/TensorRT/blob/release/8.6/samples/python/detectron2/build_engine.py to build your ONNX model and find the step at which the process stops or exits.
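
To illustrate the idea of finding the failing step, here is a minimal sketch of a step-by-step engine build with the TensorRT Python API. It is not the linked build_engine.py script, and the function name build_engine and the printed step labels are hypothetical. It assumes the tensorrt Python package is installed and uses default builder settings.

```python
import os


def build_engine(onnx_path, engine_path):
    """Build a serialized TensorRT engine from an ONNX file, announcing each step."""
    if not os.path.isfile(onnx_path):
        raise FileNotFoundError(f"ONNX file not found: {onnx_path}")

    # Imported here so the path check above works even without TensorRT installed.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.VERBOSE)
    builder = trt.Builder(logger)
    # Explicit-batch flag: needed by the ONNX parser in TensorRT 8.x,
    # deprecated and ignored in TensorRT 10.x.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)

    print("step 1: parsing ONNX")
    if not parser.parse_from_file(onnx_path):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parsing failed")

    print("step 2: building engine")
    config = builder.create_builder_config()
    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("engine build failed")

    print("step 3: saving engine")
    with open(engine_path, "wb") as f:
        f.write(serialized)


# Example use with the paths from this issue:
# build_engine("./trt_v10_model/pu_3_inspyre.onnx",
#              "./trt_v10_model/pu_3_inspyre.plan")
```

The last step label printed before the process stops shows whether parsing, building, or saving is the stage that fails.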

@roxanacincan
Author

After using the build_engine.py script, I was able to solve the problem and generate the .plan file for my model.

Thank you for your help and have a great day! ^^
