BertQA sample throws segmentation fault (TensorRT 10.3) when running on GPU Jetson Orin Nano #4220
This is the RAM usage when I run the inference:

```
10-24-2024 14:19:17 RAM 4516/7620MB (lfb 1x4MB) CPU [6%@1510,6%@1510,5%@1510,6%@1510,100%@1510,8%@1510] GR3D_FREQ 99% [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] VDD_IN 4864mW/4538mW VDD_CPU_GPU_CV 1263mW/1006mW VDD_SOC 1461mW/1361mW
10-24-2024 14:19:18 RAM 4517/7620MB (lfb 1x4MB) CPU [11%@729,12%@729,17%@729,9%@729,99%@1510,3%@1510] GR3D_FREQ 0% [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] VDD_IN 4746mW/4545mW VDD_CPU_GPU_CV 1145mW/1011mW VDD_SOC 1421mW/1363mW
```
I also tried increasing the swap memory by 4 GB.
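For reference, this is roughly how a 4 GB swap file is usually added on Jetson (the `/swapfile` path is an example, not from the original post):

```bash
# Create and enable a 4 GB swap file (path is illustrative)
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
```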
Can you try using trtexec to load the engine and run inference?
How can I do that? I am new to TensorRT and was trying to run the sample application. I have TensorRT installed in my container.
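For reference, loading a prebuilt engine with trtexec generally looks like the sketch below (the engine path is reused from the build command further down and is an assumption):

```bash
# Deserialize an existing engine and run timed inference on it
trtexec --loadEngine=engines/bert_base_128.engine --verbose
```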
This is for loading the ONNX model, right? How do I run inference with an engine using trtexec? I tried, and I get this in the log.
I hope this means the engine doesn't have any problem, but I still get the segmentation fault when I try to run the sample inference.py.
Maybe your code has a bug. You can use trtexec to get the engine file, then use the following script: https://github.com/lix19937/tensorrt-insight/blob/main/tool/infer_from_engine.py @krishnarajk
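For context, a minimal sketch of engine-file inference with the TensorRT 10 Python API (this is not the linked script; the engine path, static input shapes, pycuda for memory management, and zero-filled buffers are all assumptions):

```python
# Minimal sketch: deserialize a TensorRT engine and run one inference.
# Assumes a static-shape engine at this path; a real BERT run needs
# tokenized inputs instead of the zero-filled buffers used here.
import numpy as np
import pycuda.autoinit  # noqa: F401 (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
with open("engines/bert_base_128.engine", "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

stream = cuda.Stream()
host, dev = {}, {}
for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    dtype = trt.nptype(engine.get_tensor_dtype(name))
    shape = tuple(context.get_tensor_shape(name))
    host[name] = np.zeros(shape, dtype=dtype)  # fill inputs for real use
    dev[name] = cuda.mem_alloc(host[name].nbytes)
    context.set_tensor_address(name, int(dev[name]))

# Copy inputs to the device, execute, then copy outputs back.
for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    if engine.get_tensor_mode(name) == trt.TensorIOMode.INPUT:
        cuda.memcpy_htod_async(dev[name], host[name], stream)
context.execute_async_v3(stream_handle=stream.handle)
for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    if engine.get_tensor_mode(name) == trt.TensorIOMode.OUTPUT:
        cuda.memcpy_dtoh_async(host[name], dev[name], stream)
stream.synchronize()
```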
Description
I tried running the BertQA sample on a Jetson Orin Nano with JetPack 6.1.
I used BERT Base, because BERT Large gets killed while building the engine (possibly because of a memory issue).
Then I ran inference.py with the same sample given in the examples:
```
python3 inference.py -e engines/bert_base_128.engine -p "TensorRT is a high performance deep learning inference platform that delivers low latency and high throughput for apps such as recommenders, speech and image/video on NVIDIA GPUs. It includes parsers to import models, and plugins to support novel ops and layers before applying optimizations for inference. Today NVIDIA is open-sourcing parsers and plugins in TensorRT so that the deep learning community can customize and extend these components to take advantage of powerful TensorRT optimizations for your apps." -q "What is TensorRT?" -v models/fine-tuned/bert_tf_ckpt_base_qa_squad2_amp_128_v19.03.1/vocab.txt
```
It throws a segmentation fault:
```
[10/23/2024-13:30:07] [TRT] [I] Loaded engine size: 208 MiB
[10/23/2024-13:30:08] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +8, GPU +70, now: CPU 317, GPU 4590 (MiB)
[10/23/2024-13:30:08] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +7, GPU +64, now: CPU 109, GPU 4379 (MiB)
[10/23/2024-13:30:08] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +1, now: CPU 0, GPU 163 (MiB)
Passage: TensorRT is a high performance deep learning inference platform that delivers low latency and high throughput for apps such as recommenders, speech and image/video on NVIDIA GPUs. It includes parsers to import models, and plugins to support novel ops and layers before applying optimizations for inference. Today NVIDIA is open-sourcing parsers and plugins in TensorRT so that the deep learning community can customize and extend these components to take advantage of powerful TensorRT optimizations for your apps.
Question: What is TensorRT?
Segmentation fault (core dumped)
```
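One way to see which Python call triggers a native crash like this is the interpreter's built-in faulthandler; a sketch, with the long passage argument abbreviated:

```bash
# Print the Python-level traceback if the interpreter segfaults
python3 -X faulthandler inference.py -e engines/bert_base_128.engine \
    -p "<passage as above>" -q "What is TensorRT?" \
    -v models/fine-tuned/bert_tf_ckpt_base_qa_squad2_amp_128_v19.03.1/vocab.txt
```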
- https://github.com/NVIDIA/TensorRT/tree/release/10.3/demo/BERT#model-overview
- I don't use the OSS container; I installed these on the device.
Please help me out here.
Environment
TensorRT Version: 10.3
NVIDIA GPU: Ampere (Jetson Orin Nano)
NVIDIA Driver Version: JetPack 6.1
CUDA Version: 12.6
CUDNN Version:
Operating System: Ubuntu 22.04
Python Version (if applicable): 3.10