Martin Kroeker | 2a62d2df96 | Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 | 2023-07-26 19:39:11 +02:00
Martin Kroeker | d17238599b | Add casts | 2021-09-11 13:38:28 +02:00
Gengxin Xie | 1b0f17eeed | align to 64, using SSE when input size is small | 2020-09-03 14:25:54 +08:00
Gengxin Xie | 448152cdd8 | define __AVX2__ to ensure the haswell code compiled with avx2 | 2020-08-31 14:39:08 +08:00
Gengxin Xie | cb3c190a3a | Implementation of dasum, sasum with AVX2 & AVX512 intrinsic | 2020-08-31 11:44:08 +08:00