4 Commits

Author SHA1 Message Date
Martin Kroeker
60abcc3991 add proper return statement 2024-08-04 00:13:31 +02:00
gxw
f3cebb3ca3 x86: Fixed numpy CI failure when the target is ZEN. 2024-07-12 16:09:30 +08:00
Martin Kroeker
1abafcd9b2 handle corner cases involving NAN and/or INF 2024-06-06 23:59:43 +02:00
Bart Oldeman
5ceca1a4d8 Add sscal.c + microkernels for Haswell, Zen, Skylake and newer.
Unlike [dcz]scal, sscal still used the original GotoBLAS SSE code from scal_sse.S.
This code follows dscal as closely as possible, except for the inc_x > 1 code
for which a plain C loop is used much like the one in cscal.c, instead of an
adaptation of the SSE2 asm code of dscal.c (I tried but the performance wasn't
better than the plain C loop).
2022-12-06 14:05:49 -05:00
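
The plain C loop that commit 5ceca1a4d8 describes for the inc_x > 1 case can be sketched roughly as below. This is a minimal illustration, not the code from sscal.c: the function name `sscal_strided_sketch` and its signature are hypothetical, and the real routine follows OpenBLAS kernel conventions that this log does not show.

```c
#include <stddef.h>

/* Hypothetical helper (not the OpenBLAS kernel): scale n elements of x,
 * spaced inc_x apart, by alpha using a plain C loop. */
static void sscal_strided_sketch(size_t n, float alpha, float *x, size_t inc_x)
{
    /* Nothing to do for an empty vector or a zero stride. */
    if (n == 0 || inc_x == 0)
        return;

    /* Touch only every inc_x-th element; the elements in between are
     * left unchanged. */
    for (size_t i = 0; i < n; i++)
        x[i * inc_x] *= alpha;
}
```

The sketch does not special-case alpha == 0, so NaN and Inf inputs propagate through ordinary IEEE 754 multiplication. The log does not say how the actual code treats those corner cases (see commit 1abafcd9b2), so this behaviour is an assumption of the sketch only.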