Commit
support AVX2 for run_container_to_uint32_array (#642)
* support AVX2 for run_container_to_uint32_array

1. support AVX2 for run_container_to_uint32_array
2. add dense range for run container

baseline:
```
number of values in container = 256
run_container_to_uint32_array(out, Bt, 1234): 3.64 cycles per operation
number of values in container = 2018
run_container_to_uint32_array(out, Bt, 1234): 3.07 cycles per operation
number of values in container = 14498
run_container_to_uint32_array(out, Bt, 1234): 3.47 cycles per operation
number of values in container = 7826
run_container_to_uint32_array(out, Bt, 1234): 0.18 cycles per operation
number of values in container = 8152
run_container_to_uint32_array(out, Bt, 1234): 0.18 cycles per operation
number of values in container = 8189
run_container_to_uint32_array(out, Bt, 1234): 0.18 cycles per operation
number of values in container = 8191
run_container_to_uint32_array(out, Bt, 1234): 0.18 cycles per operation
```

AVX2 version:
```
number of values in container = 256
run_container_to_uint32_array(out, Bt, 1234): 4.38 cycles per operation
number of values in container = 2018
run_container_to_uint32_array(out, Bt, 1234): 3.77 cycles per operation
number of values in container = 14498
run_container_to_uint32_array(out, Bt, 1234): 4.19 cycles per operation
number of values in container = 7826
run_container_to_uint32_array(out, Bt, 1234): 0.10 cycles per operation
number of values in container = 8152
run_container_to_uint32_array(out, Bt, 1234): 0.10 cycles per operation
number of values in container = 8189
run_container_to_uint32_array(out, Bt, 1234): 0.10 cycles per operation
number of values in container = 8191
run_container_to_uint32_array(out, Bt, 1234): 0.10 cycles per operation
```

The SIMD version works well in the dense case (0.10 vs. 0.18 cycles per operation). However, when the individual runs are short, each operation pays for an additional `if`, and the AVX2 version is slower than the baseline (for example, 4.38 vs. 3.64 cycles per operation with 256 values).

* avoid regression when run length is small
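The trade-off in this message (wide stores pay off on long runs, while the extra branch costs time on short ones) can be illustrated with a minimal sketch. This is not the code from the commit: the type `sketch_run_t`, the functions `sketch_fill_range` and `sketch_runs_to_uint32_array`, and the cutoff `SKETCH_MIN_BLOCK_RUN` are hypothetical names chosen for illustration, and the cutoff value of 32 is an untuned assumption. The AVX2 path is compiled only when the compiler defines `__AVX2__`; otherwise the scalar loop handles everything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

/* Hypothetical, simplified run: covers value, value + 1, ..., value + length. */
typedef struct {
    uint16_t value;
    uint16_t length;
} sketch_run_t;

/* Assumed cutoff (not tuned): runs with fewer values than this stay scalar. */
enum { SKETCH_MIN_BLOCK_RUN = 32 };

/* Write `count` consecutive integers starting at `first` into `out`. */
static void sketch_fill_range(uint32_t *out, uint32_t first, uint32_t count) {
    uint32_t i = 0;
#if defined(__AVX2__)
    /* Only long runs take the vector path, so short runs pay for a single
     * comparison rather than for vector setup. */
    if (count >= SKETCH_MIN_BLOCK_RUN) {
        __m256i v = _mm256_add_epi32(_mm256_set1_epi32((int)first),
                                     _mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7));
        const __m256i step = _mm256_set1_epi32(8);
        for (; i + 8 <= count; i += 8) {
            /* Store 8 consecutive values, then advance every lane by 8. */
            _mm256_storeu_si256((__m256i *)(out + i), v);
            v = _mm256_add_epi32(v, step);
        }
    }
#endif
    /* Scalar loop: whole short runs, and the tail of long runs. */
    for (; i < count; i++) {
        out[i] = first + i;
    }
}

/* Expand all runs into `out`, adding `base` to each value.
 * Returns the number of values written. The caller must size `out`. */
size_t sketch_runs_to_uint32_array(uint32_t *out, const sketch_run_t *runs,
                                   size_t n_runs, uint32_t base) {
    assert(out != NULL && (runs != NULL || n_runs == 0));
    size_t pos = 0;
    for (size_t r = 0; r < n_runs; r++) {
        uint32_t first = base + runs[r].value;
        uint32_t count = (uint32_t)runs[r].length + 1; /* length is inclusive */
        sketch_fill_range(out + pos, first, count);
        pos += count;
    }
    return pos;
}
```

Under these assumptions, a container made of a few long runs spends nearly all of its time in the 8-wide store loop, while a container made of many short runs behaves like the scalar baseline plus one comparison per run. Where the cutoff should sit would have to be measured with a benchmark like the one quoted above.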