OpenBLAS/kernel
Marius Hillenbrand 71b6eaf459 s390x: Use new sgemm kernel also for strmm on Z14 and newer
Employ the newly added GEMM kernel also for STRMM on Z14. The
implementation in C with vector intrinsics exploits FP32 SIMD operations
and thereby gains performance over the existing assembly code. Extend
the implementation to handle triangular matrix multiplication
accordingly. As an added benefit, the more flexible C code enables us to
adjust register blocking in the subsequent commit.
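
To illustrate how a GEMM-style routine can be reused for triangular
multiplication, here is a minimal sketch in plain C. It is not the zarch
kernel: the names gemm_rows, trmm_lower_sketch and ROW_BLOCK are
hypothetical, the storage is assumed to be row-major, the result is written
to a separate matrix C (the real STRMM works in place on B), and the
entries of A above the diagonal are assumed to be stored as explicit zeros.
The only point is that the triangular case can call the same inner routine
with a shortened k range per row block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical row-block height; the real kernel's register blocking differs. */
#define ROW_BLOCK 2

/* GEMM-style inner routine for `rows` consecutive rows of A:
 * C[i][j] = alpha * sum over k in [0, k_end) of A[i][k] * B[k][j].
 * Row-major storage is an assumption of this sketch. */
static void gemm_rows(size_t rows, size_t n, size_t k_end, float alpha,
                      const float *a, size_t lda,
                      const float *b, size_t ldb,
                      float *c, size_t ldc)
{
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < n; j++) {
            float acc = 0.0f;
            for (size_t k = 0; k < k_end; k++)
                acc += a[i * lda + k] * b[k * ldb + j];
            c[i * ldc + j] = alpha * acc;
        }
    }
}

/* C = alpha * L * B with L lower triangular (m x m), B and C m x n.
 * Assumes L holds explicit zeros above the diagonal, so the diagonal
 * block can be processed like a dense block. */
void trmm_lower_sketch(size_t m, size_t n, float alpha,
                       const float *a, size_t lda,
                       const float *b, size_t ldb,
                       float *c, size_t ldc)
{
    assert(lda >= m && ldb >= n && ldc >= n);
    for (size_t i0 = 0; i0 < m; i0 += ROW_BLOCK) {
        size_t rows = (m - i0 < ROW_BLOCK) ? m - i0 : ROW_BLOCK;
        /* Rows i0 .. i0+rows-1 have no nonzeros at or beyond column
         * i0+rows, so the k loop can stop there. */
        size_t k_end = i0 + rows;
        gemm_rows(rows, n, k_end, alpha,
                  a + i0 * lda, lda, b, ldb, c + i0 * ldc, ldc);
    }
}
```

A full GEMM would call gemm_rows with k_end equal to m for every row
block; the triangular variant skips the part of the k range that is known
to be zero and otherwise shares the inner loop.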

Tested via make -C test / ctest / utest and by a couple of additional
unit tests that exercise blocking.

Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-05-12 15:59:51 +02:00
alpha Add implementations of ssum/dsum and csum/zsum 2019-03-30 22:05:11 +01:00
arm Make ARMV7 compile with xcode and add a CI job for it (#2537) 2020-04-02 10:30:37 +02:00
arm64 ARM64: Improve DAXPY for ThunderX2 2020-05-07 09:22:50 -07:00
generic Fix DYNAMIC_ARCH compilation errors 2020-04-15 09:09:50 -05:00
ia64 Add ia64 implementation of ?sum 2019-03-30 22:18:03 +01:00
mips Delete KERNEL.1004K 2020-04-19 15:44:30 +02:00
mips64 Fix compilation problem on loongson platform 2020-04-09 19:28:15 +08:00
power Fix cmake compilation issue - POWER9 2020-05-08 20:31:56 -05:00
sparc Add SPARC implementation of ?sum 2019-03-30 22:25:06 +01:00
x86 Fix unwanted case-sensitivity in x86 LSAME for (AMD) processors without CMOV 2019-08-13 10:19:10 +02:00
x86_64 Work around excessive LAPACK test failures on Skylake-X 2020-05-09 23:49:18 +02:00
zarch s390x: Use new sgemm kernel also for strmm on Z14 and newer 2020-05-12 15:59:51 +02:00
CMakeLists.txt Make building the bfloat16 functions conditional on option BUILD_HALF (#2590) 2020-05-01 09:58:30 +02:00
Makefile Add variable for gcc >=9 test 2019-11-29 23:47:23 +01:00
Makefile.L1 Add ?sum 2019-03-30 22:01:13 +01:00
Makefile.L2 Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
Makefile.L3 Make building the bfloat16 functions conditional on option BUILD_HALF (#2590) 2020-05-01 09:58:30 +02:00
Makefile.LA Support NO_LAPACK=1 to build the lib without LAPACK functions. 2011-03-04 11:51:32 +08:00
common_param.h Fix warnings in clang and export symbol 2020-04-15 19:15:23 -05:00
setparam-ref.c Make building the bfloat16 functions conditional on option BUILD_HALF (#2590) 2020-05-01 09:58:30 +02:00