OpenBLAS/kernel
Rajalakshmi Srinivasaraghavan 346e30a46a POWER10: Improve axpy performance
This patch aligns the stores to a 32-byte boundary for saxpy and daxpy
before entering the vector pair loop. For caxpy, the store instructions
are changed to stxv to improve performance in unaligned cases.
2020-12-10 11:51:42 -06:00
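
The store-alignment idea in the commit message above can be illustrated with a minimal sketch in portable C. This is not the code from the patch: the actual POWER10 kernel uses vector pair instructions, while this version uses plain scalar arithmetic and only shows the loop structure (a short head loop that runs until y reaches a 32-byte boundary, a main loop that handles 32 bytes per iteration, and a tail loop). The names head_count and daxpy_sketch are hypothetical, and the sketch assumes y is at least 8-byte aligned, as a double array normally is.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: how many leading elements must be handled one at a
   time so that y + count falls on a 32-byte boundary (capped at n).
   Assumes y is aligned to sizeof(double). */
static size_t head_count(const double *y, size_t n)
{
    size_t misalign = (size_t)((uintptr_t)y % 32);
    if (misalign == 0)
        return 0;
    size_t head = (32 - misalign) / sizeof(double);
    return head < n ? head : n;
}

/* Hypothetical scalar stand-in for an axpy kernel: y = alpha * x + y. */
static void daxpy_sketch(size_t n, double alpha, const double *x, double *y)
{
    size_t i = 0;
    size_t head = head_count(y, n);

    /* Head: scalar updates until the store address is 32-byte aligned. */
    for (; i < head; i++)
        y[i] += alpha * x[i];

    /* Main loop: four doubles (32 bytes) per iteration; every group of
       stores to y now starts on a 32-byte boundary. */
    for (; i + 4 <= n; i += 4) {
        y[i]     += alpha * x[i];
        y[i + 1] += alpha * x[i + 1];
        y[i + 2] += alpha * x[i + 2];
        y[i + 3] += alpha * x[i + 3];
    }

    /* Tail: leftover elements that do not fill a 32-byte group. */
    for (; i < n; i++)
        y[i] += alpha * x[i];
}
```

Only the stores to y are aligned here; the loads from x may remain unaligned, which matches the commit message's focus on store alignment.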
..
alpha Add implementations of ssum/dsum and csum/zsum 2019-03-30 22:05:11 +01:00
arm Fix compilation with SolarisStudio 2020-12-06 19:14:16 +01:00
arm64 Merge pull request #2867 from Qiyu8/usimd-floatdot 2020-10-10 12:10:25 +02:00
generic Add the support for RISC-V Vector. 2020-10-15 16:09:02 +08:00
ia64 Add ia64 implementation of ?sum 2019-03-30 22:18:03 +01:00
mips Add msa support for loongson 2020-12-09 10:28:46 +08:00
mips64 Add msa support for loongson 2020-12-09 10:28:46 +08:00
power POWER10: Improve axpy performance 2020-12-10 11:51:42 -06:00
riscv64 Refs #2899 2020-11-10 09:38:04 +08:00
simd fix the CI failure of lack the head 2020-11-12 17:35:17 +08:00
sparc Work around DOT and SWAP test failures 2020-12-06 19:15:37 +01:00
x86 Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
x86_64 Merge pull request #3016 from xiegengxin/complex-asum 2020-12-04 22:07:16 +01:00
zarch s390x: fix cscal and zscal implementations 2020-09-21 13:10:05 +02:00
CMakeLists.txt Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:00:55 +02:00
Makefile Add msa support for loongson 2020-12-09 10:28:46 +08:00
Makefile.L1 Fix build issues with bfloat16 2020-10-13 11:00:22 -05:00
Makefile.L2 Implementation of BF16 based gemv 2020-10-29 02:08:23 +08:00
Makefile.L3 Add msa support for loongson 2020-12-09 10:28:46 +08:00
Makefile.LA Support NO_LAPACK=1 to build the lib without LAPACK functions. 2011-03-04 11:51:32 +08:00
setparam-ref.c Add msa support for loongson 2020-12-09 10:28:46 +08:00