cuda10.0+cudnn7.6.3+TensorRT-6.0.1.5 #28
Comments
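@DaChaoXc May I ask whether this result is from the quantized model or the unquantized one?
It is not quantized; it is float32.
Thanks. Why do I only get results like yours with fp16? What could be the reason?
I'm not quite sure which result you are referring to.
@DaChaoXc Hi, I'm running the same program as you on an NVIDIA NX, but it runs much slower than yours. Another issue: after int8 quantization, the program above actually takes about twice as long as fp16. Do you know what the reason is? Thanks.
What is your environment configuration?
Hi @DaChaoXc, my setup is JetPack 4.4, Ubuntu 18.04, CUDA 10.2, cuDNN 8.0, and the board is an NVIDIA Jetson Xavier NX.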
It compiles successfully, and the test results are as follows:
input-engine: model/yolov4.engine
image: dog.jpg
video: nuscenes_mini.mp4
cost: 382 ms
cost: 206.5 ms
cost: 148 ms
cost: 118.75 ms
cost: 101.4 ms
cost: 89.3333 ms
cost: 81 ms
cost: 74.5 ms
cost: 69.6667 ms
cost: 67.2 ms
cost: 63.7273 ms
cost: 60.8333 ms
cost: 58.3846 ms
cost: 56.2857 ms
cost: 54.4667 ms
cost: 52.875 ms
cost: 51.4706 ms
cost: 50.2222 ms
cost: 49.1053 ms
cost: 48.1 ms
cost: 47.1905 ms
cost: 46.3636 ms
cost: 45.6087 ms
cost: 44.9167 ms
cost: 44.28 ms
cost: 43.6923 ms
cost: 43.1481 ms
cost: 42.6429 ms
cost: 42.1724 ms
cost: 41.7333 ms
cost: 41.3226 ms
cost: 41.1875 ms
cost: 40.8182 ms
cost: 40.5588 ms
cost: 40.2857 ms
cost: 40 ms
cost: 39.7297 ms
cost: 39.5 ms
cost: 39.2308 ms
cost: 39 ms
cost: 38.7561 ms
With CUDA 9.0 the test results all seem to be around 29 ms. What is the reason for this? Could the source code be tied to CUDA 9.0? Thanks.
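One thing worth checking before blaming the CUDA version: the `cost` values in the log fall smoothly from 382 ms, which is what a running mean over all frames so far would look like, with a slow first frame pulling the average up. The thread does not say how `cost` is computed, so this is only an assumption. Under that assumption, a minimal Python sketch (the function name `per_frame_costs` is made up for illustration) recovers the individual frame times by multiplying each mean by its frame count and taking differences:

```python
def per_frame_costs(running_means):
    """Recover per-frame times from cumulative averages.

    Assumes (not confirmed by the source) that the n-th logged value is
    the mean cost of frames 1..n, so total_n = mean_n * n.
    """
    costs = []
    prev_total = 0.0
    for n, mean in enumerate(running_means, start=1):
        total = mean * n
        # Round to 0.1 ms to absorb the truncation in the printed log values.
        costs.append(round(total - prev_total, 1))
        prev_total = total
    return costs


# First eight "cost" lines copied from the log above.
logged = [382, 206.5, 148, 118.75, 101.4, 89.3333, 81, 74.5]
print(per_frame_costs(logged))
# -> [382.0, 31.0, 31.0, 31.0, 32.0, 29.0, 31.0, 29.0]
```

If the assumption holds, only the first frame is slow (382 ms) and the following frames take roughly 29 to 32 ms each, which is close to the 29 ms figure quoted for CUDA 9.0. Excluding the first frame from the average would make the two setups easier to compare.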