OpenBLAS/kernel
Marius Hillenbrand 07c334e7be s390x: Factor out small block sizes for SGEMM/DGEMM on z14
For register blockings that are too small to fill up vector registers
with column vectors, we currently use a single generic code block.
Replace that block with instantiations of the generic code as individual
functions, so that the compiler can optimize each one separately.

Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-08-11 12:56:39 +02:00
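
The idea in the commit message above can be illustrated with a minimal C sketch. This is not the code from the zarch kernel: the names gemm_block_generic, DEFINE_SMALL_KERNEL and gemm_kernel_MxN, as well as the packed layout of A and B, are hypothetical and chosen only for illustration. A generic block routine takes the block sizes as ordinary arguments, and a macro instantiates one thin wrapper function per small block size. Since each wrapper passes compile-time constants, the compiler is free to inline the generic body and unroll or specialize the loops for that size.

```c
#include <stddef.h>

/*
 * Generic small-block update: C[m x n] += alpha * A[m x k] * B[k x n].
 * Assumed layout (illustrative only): A is packed so that each k-step holds
 * m consecutive values, B is packed so that each k-step holds n consecutive
 * values, and C is column-major with leading dimension ldc.
 */
static inline void gemm_block_generic(size_t m, size_t n, size_t k,
                                      double alpha,
                                      const double *a, const double *b,
                                      double *c, size_t ldc)
{
    for (size_t j = 0; j < n; j++) {
        for (size_t i = 0; i < m; i++) {
            double acc = 0.0;
            for (size_t l = 0; l < k; l++)
                acc += a[l * m + i] * b[l * n + j];
            c[j * ldc + i] += alpha * acc;
        }
    }
}

/*
 * Instantiate the generic code as an individual function per block size.
 * M and N are literal constants at each call site, so every wrapper can be
 * optimized on its own once the generic body is inlined into it.
 */
#define DEFINE_SMALL_KERNEL(M, N)                                        \
    static void gemm_kernel_##M##x##N(size_t k, double alpha,            \
                                      const double *a, const double *b,  \
                                      double *c, size_t ldc)             \
    {                                                                    \
        gemm_block_generic(M, N, k, alpha, a, b, c, ldc);                \
    }

DEFINE_SMALL_KERNEL(1, 1)
DEFINE_SMALL_KERNEL(2, 1)
DEFINE_SMALL_KERNEL(1, 2)
DEFINE_SMALL_KERNEL(2, 2)
```

A caller would then dispatch on the remaining block size and call, for example, gemm_kernel_2x2(k, alpha, a, b, c, ldc) instead of running one shared loop nest with runtime bounds. Whether the compiler actually inlines and unrolls each instance depends on the optimization level; real kernels may force this with compiler-specific attributes.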
alpha Add implementations of ssum/dsum and csum/zsum 2019-03-30 22:05:11 +01:00
arm Use OPENBLAS_MAKE_COMPLEX_FLOAT on PPC only 2020-07-23 20:40:13 +00:00
arm64 ARM64: Add THUNDERX3T110 Target 2020-07-26 23:32:24 -07:00
generic powerpc: Optimized SHGEMM kernel for POWER10 2020-06-25 22:19:08 -05:00
ia64 Add ia64 implementation of ?sum 2019-03-30 22:18:03 +01:00
mips Delete KERNEL.1004K 2020-04-19 15:44:30 +02:00
mips64 Fix compilation problem on loongson platform 2020-04-09 19:28:15 +08:00
power dgemv optimization for POWER10 2020-07-29 18:59:32 -05:00
sparc Add SPARC implementation of ?sum 2019-03-30 22:25:06 +01:00
x86 Fix unwanted case-sensitivity in x86 LSAME for (AMD) processors without CMOV 2019-08-13 10:19:10 +02:00
x86_64 Multiply by 2 instead of left-shifting a potentially negative number 2020-08-02 18:29:56 +02:00
zarch s390x: Factor out small block sizes for SGEMM/DGEMM on z14 2020-08-11 12:56:39 +02:00
CMakeLists.txt powerpc: Add support for future processor 2020-06-11 15:47:20 -05:00
Makefile Fix compilation issues with clang on POWER 2020-07-27 14:11:07 -05:00
Makefile.L1 Add ?sum 2019-03-30 22:01:13 +01:00
Makefile.L2 Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
Makefile.L3 fix trailing whitespace 2020-07-14 18:20:03 +02:00
Makefile.LA Support NO_LAPACK=1 to build the lib without LAPACK functions. 2011-03-04 11:51:32 +08:00
setparam-ref.c Make building the bfloat16 functions conditional on option BUILD_HALF (#2590) 2020-05-01 09:58:30 +02:00