11 Commits

Author SHA1 Message Date
Martin Kroeker
ed532dc75b remove another early exit for incx < 0 2024-03-12 18:47:00 +01:00
Martin Kroeker
02a025f9c1 remove early exit on negative inc_x 2024-03-11 22:52:18 +01:00
Martin Kroeker
8be68fa7f4 move declaration of sca to really keep the compiler from throwing it out (for now) 2023-04-15 12:02:39 +02:00
Martin Kroeker
3727672a74 Improve workaround and keep compilers from optimizing it out 2023-04-13 18:07:52 +02:00
Martin Kroeker
9e29598575 workaround fault with ssq=inf,scale=0 2022-07-02 23:47:17 +02:00
Gilles Gouaillardet
9d292d37b2 arm64: add the missing d9 register to the clobber list
Refs. numpy/numpy#18422

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2021-06-14 17:01:28 +09:00
Ashwin Sekhar T K
1b2508362b arm64: Fix nrm2 for input vectors with Inf
Fix double precision nrm2 kernels returning NaN when the
input vectors contain Inf/-Inf.
2021-01-01 02:49:37 -08:00
Craig Donner
c2545b0fd6 Fixed a few more unnecessary calls to num_cpu_avail.
I don't have as many benchmarks for these as for gemm, but it should still
make a difference for small matrices.
2018-06-11 10:17:16 +01:00
Ashwin Sekhar T K
4899d67f7d THUNDERX2T99: Fix clang compilation 2017-08-02 11:28:45 -07:00
Ashwin Sekhar T K
67473d09dd THUNDERX2T99: Bug Fixes in D/Z NRM2 and ZGEMM 2017-02-28 01:11:38 -08:00
Ashwin Sekhar T K
a3935f0dfb THUNDERX2T99: Add Optimized D/Z NRM2 Implementation 2017-02-23 10:02:15 -08:00