Fix refinedet trt8 #1329

Merged
merged 3 commits into jolibrain:master from fix_refinedet_trt8 on Aug 26, 2021
Conversation

fantes (Contributor) commented on Aug 17, 2021

This PR addresses #1324.

It also updates dependencies to TRT 8.x.

BIG FAT WARNING: TRT 8.0.1.x is subject to this bug: https://forums.developer.nvidia.com/t/build-engine-error-when-use-pointnet-like-structure-and-tensorrt-8-0-1-6/183569, which breaks SSD models!
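
For reference, a minimal CMake sketch of guarding against that point release (not part of this PR; it assumes TENSORRT_INC_DIR points at the TensorRT headers, and the variable names are hypothetical):

# Parse the TensorRT version macros from NvInferVersion.h and warn on 8.0.1.x,
# which is known to break SSD models (see the forum thread above).
file(READ "${TENSORRT_INC_DIR}/NvInferVersion.h" _nvinfer_version_header)
string(REGEX MATCH "NV_TENSORRT_MAJOR ([0-9]+)" _ "${_nvinfer_version_header}")
set(_trt_major ${CMAKE_MATCH_1})
string(REGEX MATCH "NV_TENSORRT_MINOR ([0-9]+)" _ "${_nvinfer_version_header}")
set(_trt_minor ${CMAKE_MATCH_1})
string(REGEX MATCH "NV_TENSORRT_PATCH ([0-9]+)" _ "${_nvinfer_version_header}")
set(_trt_patch ${CMAKE_MATCH_1})
if (_trt_major EQUAL 8 AND _trt_minor EQUAL 0 AND _trt_patch EQUAL 1)
  message(WARNING "TensorRT ${_trt_major}.${_trt_minor}.${_trt_patch} detected: SSD models are known to fail on 8.0.1.x")
endif()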

CMakeLists.txt Outdated
@@ -916,26 +916,22 @@ if (USE_TENSORRT)
set(TENSORRT_INC_DIR /usr/include/x86_64-linux-gnu)
endif()

-if (NOT EXISTS "${TRTTESTDIR}/libnvinfer.so.7")
+if (NOT EXISTS "${TRTTESTDIR}/libnvinfer.so.8")
message(FATAL_ERROR "Could not find TensorRT ${TENSORRT_LIB_DIR}/libnvinfer.so.7, please provide tensorRT location as TENSORRT_DIR or (TENSORRT_LIB_DIR _and_ TENSORRT_INC_DIR)")
Collaborator
libnvinfer.so.8 instead

Contributor Author

This was already done locally but not pushed, sorry.

beniz (Collaborator) commented on Aug 19, 2021

@fantes I confirm the Docker image builds with the latest NVIDIA TensorRT container image; the patch below could be integrated into this PR:

diff --git a/docker/gpu_tensorrt.Dockerfile b/docker/gpu_tensorrt.Dockerfile
index 59ce5317..03de8d62 100644
--- a/docker/gpu_tensorrt.Dockerfile
+++ b/docker/gpu_tensorrt.Dockerfile
@@ -1,5 +1,5 @@
 # syntax = docker/dockerfile:1.0-experimental
-FROM nvcr.io/nvidia/tensorrt:21.04-py3 AS build
+FROM nvcr.io/nvidia/tensorrt:21.07-py3 AS build
 
 ARG DEEPDETECT_RELEASE=OFF
 ARG DEEPDETECT_ARCH=gpu
@@ -110,7 +110,7 @@ RUN --mount=type=cache,target=/ccache/ mkdir build && cd build && ../build.sh
 RUN ./docker/get_libs.sh
 
 # Build final Docker image
-FROM nvcr.io/nvidia/tensorrt:21.04-py3 AS runtime
+FROM nvcr.io/nvidia/tensorrt:21.07-py3 AS runtime
 
 ARG DEEPDETECT_ARCH=gpu
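
For context, the base-image bump to the 21.07 NGC TensorRT container is presumably what brings in a TensorRT 8.x runtime, matching the new libnvinfer.so.8 check in CMakeLists.txt. Building this Dockerfile requires BuildKit (e.g. DOCKER_BUILDKIT=1), since it relies on the experimental Dockerfile syntax and the --mount=type=cache instruction.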

CMakeLists.txt Outdated
if (EXISTS "${TRTTESTDIR}/libnvinfer.so.8")
  set(TENSORRT_VERSION 21.08)
  message(STATUS "Found TensorRT libraries version 8.x")
elseif (EXISTS "${TRTTESTDIR}/libnvinfer.so.8")
Collaborator

This is the same test twice, no?

Contributor Author

right, fixed
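
For readers following along, the corrected branch presumably ends up along these lines (a sketch only; the values in the 7.x branch are assumptions, the actual change is in the PR commits):

# Sketch: check for the TensorRT 8 library first, then fall back to 7.x,
# instead of testing libnvinfer.so.8 twice.
if (EXISTS "${TRTTESTDIR}/libnvinfer.so.8")
  set(TENSORRT_VERSION 21.08)
  message(STATUS "Found TensorRT libraries version 8.x")
elseif (EXISTS "${TRTTESTDIR}/libnvinfer.so.7")
  set(TENSORRT_VERSION 21.04)
  message(STATUS "Found TensorRT libraries version 7.x")
else()
  message(FATAL_ERROR "Could not find TensorRT libraries in ${TRTTESTDIR}")
endif()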

beniz (Collaborator) commented on Aug 23, 2021

At the moment, several models appear to be broken:

  • Squeezenet SSD (and probably other SSDs as well), from the unit tests:
[2021-08-23 11:48:46.032] [imgserv] [error] [resources.cpp::~ScopedCudaEvent::438] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[2021-08-23 11:48:46.032] [imgserv] [error] [resources.cpp::~ScopedCudaEvent::438] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[2021-08-23 11:48:46.032] [imgserv] [error] [resources.cpp::~ScopedCudaEvent::438] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[2021-08-23 11:48:46.032] [imgserv] [error] /data1/beniz/code/deepdetect/build_trt/tensorrt-oss/src/tensorrt-oss/plugin/priorBoxPlugin/priorBoxPlugin.cpp (253) - Cuda Error in destroy: 700 (an illegal memory access was encountered)
terminate called after throwing an instance of 'nvinfer1::plugin::CudaError'
  what():  std::exception
Aborted (core dumped)
  • refinedet in fp16:
[2021-08-23 11:51:40.440] [imgserv] [info] --------------- Timing Runner: detection_out (PluginV2)
[2021-08-23 11:51:40.453] [imgserv] [info] Deleting timing cache: 489 entries, 505 hits
[2021-08-23 11:51:40.466] [imgserv] [info] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 0, GPU 2745 (MiB)
[2021-08-23 11:51:40.467] [imgserv] [error] 2: [pluginV2Runner.cpp::execute::267] Error Code 2: Internal Error (Assertion status == kSTATUS_SUCCESS failed.)
[2021-08-23 11:51:40.467] [imgserv] [error] 2: [builder.cpp::buildSerializedNetwork::417] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed.)
  • model saving? The .bs model cannot be found after TensorRT has compiled the model.

@fantes force-pushed the fix_refinedet_trt8 branch 3 times, most recently from 469b066 to e166442 on August 23, 2021 at 14:17
The mergify bot merged commit bdff2ae into jolibrain:master on Aug 26, 2021.