
Given an engine file, how to know what GPU model it is generated on? #4233

Open
yangdong02 opened this issue Nov 1, 2024 · 6 comments
Assignees: kevinch-nv
Labels: Enhancement (New feature or request), triaged (Issue has been triaged by maintainers)

Comments

@yangdong02

When I use trtexec and mix TensorRT engine plan files across different GPU models, I get a warning:

Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors

I would also like to do this check in my own program, but I cannot find an API to get the GPU model an engine file was generated on. For the similar question "which TensorRT version was an engine file generated with", I found that I can get the answer using #3073 (comment). I'm wondering if there is a similar trick for the GPU model information. Thanks a lot!
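
(For context, that trick boils down to peeking at the header bytes of the serialized plan. A rough Python sketch of that kind of inspection; the plan layout is undocumented, so which bytes actually hold the version is an assumption taken from the linked comment and may differ between TensorRT releases:)

# Hex-dump the first bytes of a serialized engine so the TensorRT version
# digits (per the linked comment) can be located by eye. The plan-file layout
# is undocumented and not guaranteed to be stable across releases.
def dump_engine_header(path, n=32):
    with open(path, "rb") as f:
        header = f.read(n)
    print(" ".join(f"{b:02x}" for b in header))
    return header

dump_engine_header("model.plan")  # "model.plan" is a placeholder path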

@lix19937

lix19937 commented Nov 4, 2024

Running trtexec --verbose --onnx=spec will print the following device info:

[11/01/2024-15:38:18] [I] 
[11/01/2024-15:38:18] [I] === Device Information ===
[11/01/2024-15:38:18] [I] Selected Device: NVIDIA RTX 2000 Ada Generation Laptop GPU
[11/01/2024-15:38:18] [I] Compute Capability: 8.9
[11/01/2024-15:38:18] [I] SMs: 24
[11/01/2024-15:38:18] [I] Compute Clock Rate: 2.115 GHz
[11/01/2024-15:38:18] [I] Device Global Memory: 8187 MiB
[11/01/2024-15:38:18] [I] Shared Memory per SM: 100 KiB
[11/01/2024-15:38:18] [I] Memory Bus Width: 128 bits (ECC disabled)
[11/01/2024-15:38:18] [I] Memory Clock Rate: 8.001 GHz
[11/01/2024-15:38:18] [I] 
[11/01/2024-15:38:18] [I] TensorRT version: 8.5.10

There are tips about version compatibility and hardware compatibility in the developer guide:
https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#advanced
If you want to get GPU info from an engine file, it seems that this is not supported by EngineInspector: https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Core/EngineInspector.html
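
If you just want your own program to record the same device information at build time, one option is a sidecar file written next to the engine. A minimal sketch, assuming pycuda is installed; it queries the current device, it does not read anything out of the engine file, and the sidecar filename is just an example:

# Sketch: query the build-time device properties that trtexec prints and store
# them next to the engine for later comparison.
import json
import pycuda.driver as cuda

cuda.init()
dev = cuda.Device(0)  # device index 0 is an assumption
info = {
    "name": dev.name(),
    "compute_capability": "%d.%d" % dev.compute_capability(),
    "total_memory_mib": dev.total_memory() // (1024 * 1024),
}
with open("model.plan.device.json", "w") as f:  # hypothetical sidecar path
    json.dump(info, f, indent=2)
print(info)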

@yangdong02
Author

Thank you for the reply!

If you want to get GPU info from an engine file, it seems that this is not supported by EngineInspector: https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Core/EngineInspector.html

Yeah, I tried searching the documentation for ICudaEngine, EngineInspector, and the engine file format, but I didn't find related information. But it has to be somewhere inside the engine file, right? Because trtexec can emit a warning when the GPU model that generated the engine file differs from the GPU model doing inference.

@lix19937

lix19937 commented Nov 4, 2024

Yes, also there is a detailed description: https://forums.developer.nvidia.com/t/do-tensorrt-plan-files-are-portable-across-different-gpus-which-have-the-same-type/72756

poweiw added the triaged label Nov 5, 2024
@yangdong02
Author

Yes, also there is a detailed description: https://forums.developer.nvidia.com/t/do-tensorrt-plan-files-are-portable-across-different-gpus-which-have-the-same-type/72756

Thanks. Now I know that the available memory on the device at the time of creating the plan file is also encoded into the engine file, and that trtexec also checks this information.

However, this probably still doesn't quite answer my question. My question is: where in the engine file can I find the encoded model of the GPU that generated it? Is this information also encoded in the first few bytes of the engine file? A hack like #3073 (comment) would be enough for me. I know this information must be saved somewhere in the engine file, because trtexec can find it and emit warnings. I'm wondering if there is any way my program can find it too.
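
(One workaround I can think of, sketched below, is to let TensorRT do the check itself: deserialize the plan with a custom logger and capture any cross-device warning it emits, instead of parsing undocumented bytes. This assumes the warning is surfaced through ILogger, and it only tells me something after attempting deserialization on the target device, not which GPU the file came from:)

# Sketch: capture TensorRT's own warnings during deserialization with a custom
# ILogger, instead of parsing the (undocumented) plan bytes.
import tensorrt as trt

class CapturingLogger(trt.ILogger):
    def __init__(self):
        trt.ILogger.__init__(self)
        self.messages = []

    def log(self, severity, msg):
        self.messages.append((severity, msg))

logger = CapturingLogger()
runtime = trt.Runtime(logger)
with open("model.plan", "rb") as f:  # placeholder path
    engine = runtime.deserialize_cuda_engine(f.read())

# Look for warnings such as the cross-device message quoted at the top of this issue.
warnings = [m for sev, m in logger.messages
            if sev in (trt.ILogger.Severity.WARNING, trt.ILogger.Severity.ERROR)]
print(warnings)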

@lix19937

lix19937 commented Nov 6, 2024

Because the engine/plan encoding format is not released, users can only get this info bit by bit from NVIDIA.

@kevinch-nv
Collaborator

Currently, GPU model information is not encoded in the TensorRT engine; only specific GPU properties are, such as compute capability and memory capacity, as mentioned in https://forums.developer.nvidia.com/t/do-tensorrt-plan-files-are-portable-across-different-gpus-which-have-the-same-type/72756.

We can look into embedding this information and providing an API to retrieve this in a future release.

In the meantime, I recommend embedding the device name in the serialized engine file name for future reference.
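
For example, a minimal sketch of that suggestion, assuming pycuda is available for the device name; the naming scheme itself is arbitrary:

# Sketch: embed the build-time GPU name in the engine file name, as suggested
# above. Assumes pycuda; the naming convention is just an example.
import re
import pycuda.driver as cuda

def engine_path_for_device(prefix="model", device_index=0):
    cuda.init()
    name = cuda.Device(device_index).name()  # e.g. "NVIDIA GeForce RTX 3090"
    slug = re.sub(r"[^A-Za-z0-9]+", "_", name).strip("_")
    return f"{prefix}.{slug}.plan"

# serialized = builder.build_serialized_network(network, config)
# with open(engine_path_for_device(), "wb") as f:
#     f.write(serialized)
print(engine_path_for_device())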

kevinch-nv self-assigned this Nov 6, 2024
kevinch-nv added the Enhancement label Nov 6, 2024