Benchmarking and Performance #255
Comments
Are you sure that BLIS is compiled for threaded execution?
@jeffhammond Thanks for the hint. Initially I thought threading was enabled by default; however, the actual default is |
Result for pthread with
Now BLIS looks comparable to OpenBLAS, and the overhead of thread creation for small matrices is obvious. Result for openmp.
The openmp threading model has less threading overhead for small matrices. @jeffhammond Have I correctly compiled BLIS this time? Or is there any way to further improve BLIS's performance? |
This looks right to me. OpenMP should have lower overhead than Pthreads because the former uses a thread pool whereas the latter cannot (unless BLIS implements its own thread pool). BLIS uses hand-written assembly so compiler flags related to code generation should have no effect on functions like DGEMM. You may find that flags related to inlining or link-time optimization help, but I would not expect a significant effect from that. |
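To illustrate the thread-pool point above, here is a minimal Python sketch (not BLIS code) that contrasts creating fresh threads for every call with reusing a pool of long-lived workers. The names `tiny_task`, `run_with_fresh_threads`, `run_with_pool`, and `time_calls` are hypothetical and exist only for this example. It assumes the per-call work is very small, so that thread setup cost dominates, which is roughly the situation with small matrices.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def tiny_task(x):
    # Stand-in for a very small unit of work (assumption: comparable in
    # spirit to one thread's share of a tiny matrix product).
    return x * x


def run_with_fresh_threads(inputs):
    # Create, start, and join one new thread per input on every call.
    results = [None] * len(inputs)

    def worker(i, x):
        results[i] = tiny_task(x)

    threads = [threading.Thread(target=worker, args=(i, x))
               for i, x in enumerate(inputs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def run_with_pool(pool, inputs):
    # Reuse worker threads that already exist inside the pool.
    return list(pool.map(tiny_task, inputs))


def time_calls(fn, repeats):
    # Average wall-clock time per call over `repeats` calls.
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    inputs = list(range(4))
    with ThreadPoolExecutor(max_workers=4) as pool:
        fresh = time_calls(lambda: run_with_fresh_threads(inputs), 200)
        pooled = time_calls(lambda: run_with_pool(pool, inputs), 200)
    print(f"fresh threads: {fresh * 1e6:.1f} us per call")
    print(f"thread pool:   {pooled * 1e6:.1f} us per call")
```

The absolute numbers depend on the machine and the Python runtime, so treat them only as a qualitative illustration of setup overhead, not as a prediction for BLIS or OpenMP.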
The performance benchmark has been added and I'm satisfied with that result. Maybe we can close this issue now? |
Sure thing. Thanks for your patience on this issue. BTW, the tools that I used to create the new graphs on the Performance page are already included in the BLIS source distribution. They can be found in |
hello,
thanks. |
Might this change once binary128 and decimal{32,64,128} become part of the ISO/IEC 9899:202x standard? |
@sav-ix Which datatypes are part of ISO/IEC/IEEE standards doesn't have a major impact on what math libraries support. User and developer interest does. If you want to see new datatypes supported, I encourage you to create an issue specific to each, e.g. #234. I know there is some interest in developing binary128 support in BLIS already... |
@sav-ix Regarding:
This is something that would make a good third-party project. Why not start developing that yourself? |
It would be helpful to provide a script that lets users compare performance across different BLAS implementations.
I wrote one with Julia 1.0, but interestingly BLIS's performance is not as good as I expected...
Result:
System information
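As a rough illustration of the kind of comparison script requested in this issue, the sketch below times matrix-multiply kernels and reports GFLOP/s using the conventional 2·m·n·k operation count for GEMM. It is not the Julia script mentioned above. Every name in it (`gemm_flops`, `naive_gemm`, `best_time`, `gflops`, `benchmark`) is hypothetical, and the pure-Python `naive_gemm` is only a placeholder for where a real BLAS-backed routine would be plugged in.

```python
import time


def gemm_flops(m, n, k):
    # Conventional count for C = A*B with A (m x k) and B (k x n):
    # one multiply and one add per inner-product term.
    return 2 * m * n * k


def naive_gemm(a, b):
    # Placeholder kernel (assumption): replace with a wrapper around the
    # BLAS implementation under test.
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def best_time(fn, repeats=3):
    # Take the minimum of several runs to reduce timing noise.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def gflops(m, n, k, seconds):
    # Convert an elapsed time into billions of floating-point ops per second.
    return gemm_flops(m, n, k) / seconds / 1e9


def benchmark(kernels, sizes, repeats=3):
    # Time every kernel on square n x n inputs; return (name, n, GFLOP/s) rows.
    rows = []
    for n in sizes:
        a = [[float(i + j) for j in range(n)] for i in range(n)]
        b = [[float(i - j) for j in range(n)] for i in range(n)]
        for name, kernel in kernels.items():
            seconds = best_time(lambda: kernel(a, b), repeats)
            rows.append((name, n, gflops(n, n, n, seconds)))
    return rows


if __name__ == "__main__":
    for name, n, rate in benchmark({"naive": naive_gemm}, [16, 32, 64]):
        print(f"{name:>8}  n={n:<4d} {rate:.4f} GFLOP/s")
```

Sweeping over a range of sizes, including small ones, is what exposes the threading overhead discussed in the comments. When comparing libraries, each one should be built and run with the same threading configuration.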