To do: Rewrite code cvt #360
Comments
FYI, the avx_test branch supports AVX2. Support for AVX512 is a straightforward extension of that. A visual explanation can be seen at https://www.youtube.com/watch?v=qXleSwCCEvY&list=PLHTh1InhhwT4qBc2aCJUKYn-vhmZOGh01&index=10 starting at time 41:00. Benchmark results can be seen starting at 48:53. Going to a larger register size does not always help. It is more beneficial when you expect long runs of ASCII. Good luck with your project!
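To illustrate the point about long ASCII runs, here is a minimal sketch. It is not code from utf_utils or the avx_test branch: `ascii_prefix_length` is a hypothetical helper, and a plain 64-bit word stands in for a SIMD register so the example stays portable. A wide step only pays off while whole chunks are ASCII; as soon as one chunk fails the test, the code falls back to bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of utf_utils): returns the number of
// leading ASCII bytes (values below 0x80) in the buffer.
inline std::size_t ascii_prefix_length(const unsigned char* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    constexpr std::uint64_t high_bits = 0x8080808080808080ULL;
    std::size_t i = 0;
    // Wide step: one load and one test cover 8 bytes at a time.
    while (size - i >= 8) {
        std::uint64_t chunk;
        std::memcpy(&chunk, data + i, sizeof chunk);
        if ((chunk & high_bits) != 0) break;  // some byte in this chunk is non-ASCII
        i += 8;
    }
    // Narrow step: finish byte by byte (the tail, or the chunk that failed).
    while (i < size && data[i] < 0x80) ++i;
    return i;
}
```

With mostly ASCII input the wide loop does nearly all the work. With frequent multi-byte sequences it exits almost immediately, which is one way to see why a larger register does not always help.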
Oh, BobSteagall, thank you. I wasn't a SIMD expert before and had little knowledge of it. That has changed a lot: I have since written a good deal of vector extension code, so I can probably try this myself now. I am also very interested in working on wasm SIMD, for example something like this, or SHA-256 and SHA-512.
Not every platform necessarily has builtins like __builtin_ia32_pmovmskb128 to get masks. For example, I do not see how to get one on ARM NEON. I also find that getting masks for shifting may not be a good solution, since std::countr_zero sometimes misbehaves for no obvious reason. Just knowing the zeros is not necessarily good enough for a lot of jobs like this. I am thinking about trying them myself.
This shows that getting masks may not be a very good idea, since it is relatively slow compared to just testing whether SIMD vectors are zero or not.
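The two operations being compared can be sketched as follows. This is only an illustration under stated assumptions, not a benchmark and not anyone's actual implementation: a 64-bit word stands in for a SIMD register, and `load_le64`, `any_non_ascii` and `first_non_ascii` are hypothetical names. The zero test answers "is there anything non-ASCII here?" with one AND and one compare; the mask route also has to locate the byte, which is where something like std::countr_zero would normally come in (a small loop replaces it here so the sketch compiles before C++20).

```cpp
#include <cstdint>

constexpr std::uint64_t kHighBits = 0x8080808080808080ULL;

// Hypothetical helper: assemble 8 bytes as a little-endian word, so byte 0
// sits in the lowest bits regardless of host endianness.
inline std::uint64_t load_le64(const unsigned char* p) {
    std::uint64_t w = 0;
    for (int k = 0; k < 8; ++k) w |= static_cast<std::uint64_t>(p[k]) << (8 * k);
    return w;
}

// Zero test: is any byte non-ASCII? One AND and one compare against zero.
inline bool any_non_ascii(std::uint64_t w) {
    return (w & kHighBits) != 0;
}

// Mask route: index of the first non-ASCII byte, or 8 if there is none.
// The loop is a portable stand-in for counting trailing zeros in the mask.
inline int first_non_ascii(std::uint64_t w) {
    std::uint64_t m = w & kHighBits;
    int index = 0;
    while (index < 8 && (m & 0x80) == 0) {
        m >>= 8;
        ++index;
    }
    return index;
}
```

A converter built around the zero test only needs `any_non_ascii` on the fast path and can hand the failing chunk to a scalar decoder, so it never depends on a platform-specific movemask builtin.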
@BobSteagall's UTF-utils (https://github.com/BobSteagall/utf_utils) is too platform-specific and does not work with AVX512. Going to rewrite it.