About performance (runtime) #7

Open
qingqinghu1026 opened this issue Dec 7, 2022 · 4 comments


qingqinghu1026 commented Dec 7, 2022

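Hello, and thank you for providing the code.
When running this repository with the pretrained model you provided and testing butterfly at 2x/3x scale, the CPU takes 700 ms and the GPU (NVIDIA-SMI 515.65.01, Driver Version: 515.65.01, CUDA Version: 11.7) takes 400 ms.
This differs from the 27 fps reported in the paper. What is the reason?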

qingqinghu1026 changed the title from "pre-weight link (pretrained model cloud-drive link)" to "About performance (runtime)" Dec 7, 2022

liu2g commented Feb 28, 2024

Hello, I am not the original author, but I am also looking into the performance of this model. My understanding is that the authors' measurement excluded the time to load the model and the image. I inserted the following before this line and after this line:

import time

t0 = time.time()
# code being timed
print((time.time() - t0) * 1E3)  # elapsed time in milliseconds

The measured time is between 30 and 60 ms, which is close to 27 FPS (~37 ms), so it is broadly consistent with the authors' result.
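To make the comparison between milliseconds and FPS explicit, here is a minimal sketch of the same timing idea wrapped in a helper, so that only the call itself is measured and any loading done beforehand is excluded. The names `time_call_ms`, `fps_to_ms` and `ms_to_fps` are hypothetical and not part of this repository; `time.perf_counter` is swapped in for `time.time` because it is a monotonic, higher-resolution clock.

```python
import time


def time_call_ms(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed time in milliseconds).

    Hypothetical helper: anything done before this call (loading the
    model, reading the image) is not included in the measurement.
    """
    t0 = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - t0) * 1e3
    return result, elapsed_ms


def fps_to_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1e3 / fps


def ms_to_fps(ms):
    """Convert milliseconds per frame to frames per second."""
    return 1e3 / ms


if __name__ == "__main__":
    # 27 FPS corresponds to roughly 37 ms per frame.
    print(f"27 FPS = {fps_to_ms(27):.1f} ms per frame")
    # Time a stand-in workload; replace with the real inference call.
    _, ms = time_call_ms(sum, range(1_000_000))
    print(f"stand-in workload: {ms:.2f} ms")
```

By the same conversion, the measured 30 to 60 ms range corresponds to about 17 to 33 FPS, which brackets the 27 FPS figure.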

@Sherlock-hh

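Hello, I also measured the runtime, on a 1080 with CUDA 10.2. These are my results, but the timing does not seem very stable: the first frame (256, 192) takes 157 ms on the GPU, while later frames at the same resolution seem to take less than 1 ms.
[image: test results]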
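A likely explanation for this pattern, offered as an assumption rather than something confirmed in this thread: the first GPU call usually pays one-time costs such as CUDA context creation and kernel selection, and CUDA kernels are launched asynchronously, so reading the clock without synchronizing can report only the launch time. The sketch below shows one common way to handle both: discard a few warm-up runs and call a synchronization hook before each clock read. `benchmark_ms` and its `sync` parameter are hypothetical names; with PyTorch on a GPU one would pass `torch.cuda.synchronize` as `sync`.

```python
import statistics
import time


def benchmark_ms(fn, warmup=5, runs=20, sync=None):
    """Time fn() over `runs` calls after `warmup` untimed calls.

    sync: optional zero-argument callable invoked before every clock
    read (hypothetical hook; e.g. torch.cuda.synchronize for PyTorch on
    GPU, so that pending asynchronous kernels are included in the time).
    """
    def now():
        if sync is not None:
            sync()
        return time.perf_counter()

    # Warm-up runs absorb one-time initialization costs and are not timed.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(runs):
        t0 = now()
        fn()
        samples.append((now() - t0) * 1e3)

    return {
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
        "samples": samples,
    }


if __name__ == "__main__":
    # Stand-in CPU workload; replace with the model's forward pass.
    stats = benchmark_ms(lambda: sum(range(100_000)))
    print(f"median {stats['median_ms']:.3f} ms "
          f"(min {stats['min_ms']:.3f}, max {stats['max_ms']:.3f})")
```

Reporting the median over several synchronized runs tends to be more stable than a single first-frame measurement.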

@wangguoqing129

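The runtimes reported in the original FSRCNN paper were all measured with a C++ implementation, so results measured through PyTorch calling CUDA will not match them. See the original FSRCNN paper for details.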

@wangguoqing129

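In my experiments, I cross-compiled Caffe 1.0.0 on Windows and then wrote a deploy (test) network txt file based on the official FSRCNN Caffe training script. The results were close to those in the original FSRCNN paper, because both setups are native C++. For the Windows build process for Caffe 1.0.0, see the windows branch of the official Caffe GitHub repository.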
https://github.com/BVLC/caffe/tree/windows
