All .onnx files from the 4xNomos8kSCSRFormer, 4xNomos8kSCHAT-S, and 4xNomos8kSCHAT-L model folders fail with all backends #79

Open
kedaitinh12 opened this issue Dec 21, 2023 · 6 comments

@kedaitinh12

I can't use any of the .onnx files from these model folders with any of the backends:
https://github.com/Phhofm/models/tree/main/4xNomos8kSCSRFormer
https://github.com/Phhofm/models/tree/main/4xNomos8kSCHAT-S
https://github.com/Phhofm/models/tree/main/4xNomos8kSCHAT-L

Can you add support for them to vs-mlrt? Thanks

@WolframRhodium
Contributor

The NCNN_VK backend does not support these models; the other backends should work (tested with 4xNomos8kSCSRFormer.onnx).

@kedaitinh12
Author

Example with the CUDA version: only 4xNomos8kSCSRFormer_131616_onnxsim works, and the others (4xNomos8kSCSRFormer_136464_fp16_onnxsim, 4xNomos8kSCSRFormer_131212_onnxsim_fp16, 4xNomos8kSCHAT-S, 4xNomos8kSCHAT-L) don't work for me.

@WolframRhodium
Contributor

*_fp16_* indicates that the onnx files require fp16 IO. 4xNomos8kSCHAT-S and 4xNomos8kSCHAT-L also require fp16 IO, although the author does not encode that in the filename.
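
For reference, one way to check is to inspect the IO tensor types that a model actually declares, using the onnx Python package (a minimal sketch; the file path is just an example):

```python
# Minimal sketch: check whether an onnx model declares fp16 inputs/outputs,
# since the filename does not always encode this.
import onnx

model = onnx.load("4xNomos8kSCHAT-S.onnx")  # example path
for tensor in list(model.graph.input) + list(model.graph.output):
    elem_type = tensor.type.tensor_type.elem_type
    # fp16 IO prints as "FLOAT16", fp32 IO as "FLOAT"
    print(tensor.name, onnx.TensorProto.DataType.Name(elem_type))
```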

At present, only the OV_* backends support specifying a non-fp32 IO format in onnx. I may add support for this feature in the ORT_* backends, but not in the TRT backend.
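
In a VapourSynth script, this would look roughly like the sketch below. The source filter, file path, and the RGBH (half-float) input format are assumptions on my part; check the vsmlrt.py wrapper of your vs-mlrt version for the exact requirements:

```python
# Sketch only: running an fp16-IO onnx model through an OV_* backend via vsmlrt.py.
# The source filter, paths, and the RGBH input format are assumptions.
import vapoursynth as vs
from vsmlrt import Backend, inference

core = vs.core

clip = core.lsmas.LWLibavSource("input.mkv")      # any RGB-capable source chain
clip = core.resize.Bicubic(clip, format=vs.RGBH)  # half-float RGB for fp16 IO

out = inference(
    clip,
    network_path="4xNomos8kSCSRFormer_136464_fp16_onnxsim.onnx",
    backend=Backend.OV_CPU(),  # OV_* backends support non-fp32 IO
)
out.set_output()
```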

In the future, please be specific about which models do not work and what the error message is. Otherwise I will simply ignore the issue.

@hooke007
Contributor

From my previous tests, I only remember that there were some problems with dynamic-shape onnx files,
e.g. https://github.com/Phhofm/models/tree/main/2xHFA2kAVCOmniSR/onnx
They report something like 'negative value in shape'.

@WolframRhodium
Contributor

This is a limitation of that onnx file: the author did not enable dynamic shapes when exporting the model, so you need to set tilesize=(256, 256) when using vsmlrt.inference with it.

Optionally, set overlap=(8, 8) (or other values) to reduce seam artifacts.
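
Put together, that would look something like this sketch (the source filter, file path, and backend choice are placeholders):

```python
# Sketch: this onnx export has static shapes, so the tile size must match them.
import vapoursynth as vs
from vsmlrt import Backend, inference

core = vs.core

clip = core.lsmas.LWLibavSource("input.mkv")      # placeholder source
clip = core.resize.Bicubic(clip, format=vs.RGBS)  # fp32 RGB input

out = inference(
    clip,
    network_path="2xHFA2kAVCOmniSR.onnx",  # example path
    tilesize=(256, 256),  # required: the model was exported with a fixed 256x256 shape
    overlap=(8, 8),       # optional: reduces seam artifacts between tiles
    backend=Backend.ORT_CUDA(),  # any backend that supports the model
)
out.set_output()
```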
