OpenBLAS/kernel
Chris Sidebottom 4f7b77e08a Remove unnecessary instructions from Advanced SIMD dot
The existing kernel issued extra instructions to move the arguments into the registers they would normally occupy, and likewise to move the result into the appropriate register.

This affects smaller dot products and seemed like a quick fix.
2022-11-25 16:19:03 +00:00
..
alpha alpha: Remove include of version.h 2022-08-11 15:02:58 +01:00
arm Typo fix 2021-02-23 13:14:35 +01:00
arm64 Remove unnecessary instructions from Advanced SIMD dot 2022-11-25 16:19:03 +00:00
e2k Add default KERNEL file for Elbrus E2K arch 2022-01-22 18:59:36 +01:00
generic Add const attribute to lsame 2022-08-08 15:15:52 +02:00
ia64 Add ia64 implementation of ?sum 2019-03-30 22:18:03 +01:00
loongarch64 LoongArch64: Add DYNAMIC_ARCH support 2022-07-28 14:28:45 +08:00
mips MIPS64: Fixed failed utest dsdot:dsdot_n_1 when TARGET=I6500 2022-09-17 16:43:22 +08:00
mips64 fix copyobj declarations to work with DYNAMIC_ARCH 2022-09-29 08:47:14 +02:00
power change line endings from CRLF to LF 2022-11-16 22:24:01 +01:00
riscv64 Update RISC-V Intrinsic API. 2022-06-06 13:52:21 +08:00
simd fix the CI failure of lack the head 2020-11-12 17:35:17 +08:00
sparc fix DNRM2 returning INF instead of zero due to intermediate overflow 2022-07-19 10:19:27 +02:00
x86 initial support for Sapphire Rapids platform 2021-10-12 01:30:40 -07:00
x86_64 change line endings from CRLF to LF 2022-11-17 09:39:56 +01:00
zarch s390x: fix cscal and zscal implementations 2020-09-21 13:10:05 +02:00
CMakeLists.txt Fix generator rules for ?laswp_ncopy and ?neg_tcopy 2022-04-30 15:28:38 +02:00
Makefile Add -mfma to -mavx2 for Apple clang, and set AVX2 options for Zen as well 2022-09-13 22:39:27 +02:00
Makefile.L1 Conditionally add -mfma to compiler options where needed 2020-12-17 11:34:05 +01:00
Makefile.L2 Empirical workaround for numpy SVD NaN problem from issue 3318 2021-07-18 22:19:19 +02:00
Makefile.L3 Add Elbrus e2k architecture support 2022-01-22 18:55:10 +01:00
Makefile.LA Support NO_LAPACK=1 to build the lib without LAPACK functions. 2011-03-04 11:51:32 +08:00
setparam-ref.c Remove excess initializer (leftover from rework of PR 3793) 2022-10-31 16:57:03 +01:00