AVX-512 support for RSA Signing (#1273)
This change adds AVX-512 support for RSA 2k, 3k and 4k signing. It is built around the use of AVX512_IFMA within the [(Almost) Montgomery Multiplication](https://eprint.iacr.org/2011/239) implementation that comprises the modular exponentiation part of the RSA algorithm. It is ported from the [OpenSSL patch](openssl/openssl#13750).

On a C6i instance, clang 12, Release build:

Before:

    Did 832 RSA 2048 signing operations in 1009511us (824.2 ops/sec)
    Did 41000 RSA 2048 verify (same key) operations in 1019103us (40231.5 ops/sec)
    Did 30000 RSA 2048 verify (fresh key) operations in 1007956us (29763.2 ops/sec)
    Did 3684 RSA 2048 private key parse operations in 1067692us (3450.4 ops/sec)
    Did 340 RSA 3072 signing operations in 1051690us (323.3 ops/sec)
    Did 13000 RSA 3072 verify (same key) operations in 1087695us (11951.9 ops/sec)
    Did 16000 RSA 3072 verify (fresh key) operations in 1005781us (15908.0 ops/sec)
    Did 1870 RSA 3072 private key parse operations in 1017467us (1837.9 ops/sec)
    Did 128 RSA 4096 signing operations in 1015724us (126.0 ops/sec)
    Did 10000 RSA 4096 verify (same key) operations in 1071670us (9331.2 ops/sec)
    Did 6952 RSA 4096 verify (fresh key) operations in 1016484us (6839.3 ops/sec)
    Did 1110 RSA 4096 private key parse operations in 1092991us (1015.6 ops/sec)

After:

    Did 1690 RSA 2048 signing operations in 1025072us (1648.7 ops/sec)
    Did 63000 RSA 2048 verify (same key) operations in 1008785us (62451.4 ops/sec)
    Did 54000 RSA 2048 verify (fresh key) operations in 1000298us (53983.9 ops/sec)
    Did 8000 RSA 2048 private key parse operations in 1000938us (7992.5 ops/sec)
    Did 550 RSA 3072 signing operations in 1012078us (543.4 ops/sec)
    Did 30000 RSA 3072 verify (same key) operations in 1022061us (29352.5 ops/sec)
    Did 27000 RSA 3072 verify (fresh key) operations in 1037663us (26020.0 ops/sec)
    Did 4140 RSA 3072 private key parse operations in 1006526us (4113.2 ops/sec)
    Did 253 RSA 4096 signing operations in 1050767us (240.8 ops/sec)
    Did 18000 RSA 4096 verify (same key) operations in 1057742us (17017.4 ops/sec)
    Did 15000 RSA 4096 verify (fresh key) operations in 1000483us (14992.8 ops/sec)
    Did 2510 RSA 4096 private key parse operations in 1004408us (2499.0 ops/sec)

There is currently no support for 8k, so no change there. However, this could be a follow-up if there is interest.

Call-outs:

This patch is primarily additive, apart from a small logic change in `mod_exp()` in `rsa_impl.c`, where the calls to `mod_montgomery` and `BN_mod_exp_mont_consttime` were previously interleaved. To make the parallel exponentiations possible, `r1` is kept around and a new `BIGNUM`, `r2`, is created on the context.

---------

Co-authored-by: Nevine Ebeid <[email protected]>
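
The reordering described in the call-out can be illustrated with a minimal Python sketch of the RSA-CRT private-key operation. This is not the code from `rsa_impl.c`: the function names `dual_mod_exp` and `crt_private_op` are hypothetical, Python's built-in `pow` stands in for the constant-time Montgomery exponentiation, and the toy key is far too small for real use. The variable names `r0`, `r1`, `r2` and `m1` are chosen only to echo the description above. The point is that both reductions are done first, so the two independent exponentiations can be handed to a single routine that is free to run them side by side.

```python
def dual_mod_exp(b1, e1, n1, b2, e2, n2):
    # Hypothetical stand-in for a routine that performs two independent
    # modular exponentiations together (e.g. in parallel vector lanes).
    # Here they are simply computed one after the other.
    return pow(b1, e1, n1), pow(b2, e2, n2)


def crt_private_op(c, p, q, dmp1, dmq1, iqmp):
    # Reduce the input modulo each prime up front, keeping both results
    # alive (r1 and r2) instead of reusing one temporary.
    r1 = c % q
    r2 = c % p
    # Both exponentiations are now independent of each other.
    m1, r0 = dual_mod_exp(r1, dmq1, q, r2, dmp1, p)
    # Garner recombination: result = m1 + q * (iqmp * (r0 - m1) mod p).
    h = ((r0 - m1) * iqmp) % p
    return m1 + h * q


# Toy parameters for illustration only (assumed, not from the commit).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))  # requires Python 3.8+
dmp1, dmq1, iqmp = d % (p - 1), d % (q - 1), pow(q, -1, p)

sig = crt_private_op(65, p, q, dmp1, dmq1, iqmp)
```

In the interleaved ordering, the second reduction would overwrite the temporary used by the first exponentiation, which forces the two exponentiations to run sequentially; keeping a second temporary removes that dependency.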