OpenBLAS/kernel/zarch
Marius Hillenbrand 43c0d4f312 s390x: Add vectorized sgemm kernel for Z14 and newer
Add a new GEMM kernel implementation to exploit the FP32 SIMD
operations introduced with z14 and employ it for SGEMM on z14 and newer
architectures.

The SIMD extensions introduced with z13 support operations on
double-precision scalars in vector registers. Thus, the existing SGEMM code
would extend floats to doubles before operating on them. z14 extended
SIMD support to operations on 32-bit floats. By employing these
instructions, we can operate on twice the number of scalars per
instruction (four floats in each vector register) and avoid the
conversion operations.
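
As an illustration of the four-floats-per-register idea, here is a minimal
sketch of a 4x4 tile update written with GCC/Clang vector extensions. It is
not the code in gemm_vec.c: the function name sgemm_tile_4x4, the packed
panel layout, and the 4x4 tile size are assumptions made for this example.

```c
#include <assert.h>
#include <string.h>

/* Four 32-bit floats in one 16-byte vector (GCC/Clang vector extension). */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical micro-kernel: C(4x4) += alpha * A(4xk) * B(kx4).
 * Assumed layout (illustration only):
 *   a: packed panel, 4 consecutive floats per step p (one column of A)
 *   b: packed panel, 4 consecutive floats per step p (one row of B)
 *   c: column-major with leading dimension ldc */
static void sgemm_tile_4x4(long k, float alpha, const float *a,
                           const float *b, float *c, long ldc)
{
    assert(k >= 0 && ldc >= 4);

    v4sf zero = { 0.0f, 0.0f, 0.0f, 0.0f };
    v4sf acc[4] = { zero, zero, zero, zero };  /* one accumulator per column of C */

    for (long p = 0; p < k; p++) {
        v4sf a_col;
        memcpy(&a_col, a + 4 * p, sizeof a_col);  /* load without alignment assumptions */
        for (int j = 0; j < 4; j++) {
            float bj = b[4 * p + j];
            v4sf b_bcast = { bj, bj, bj, bj };    /* broadcast one scalar of B */
            acc[j] += a_col * b_bcast;            /* four float multiply-adds at once */
        }
    }

    v4sf alpha_v = { alpha, alpha, alpha, alpha };
    for (int j = 0; j < 4; j++) {
        v4sf c_col;
        memcpy(&c_col, c + j * ldc, sizeof c_col);
        c_col += alpha_v * acc[j];
        memcpy(c + j * ldc, &c_col, sizeof c_col);
    }
}
```

Every element stays a float from load to store, so no float-to-double
conversion appears anywhere in the loop. Whether the compiler maps these
operations onto the z14 FP32 vector instructions depends on the target flags
(for example -march=z14 with GCC), which this sketch does not verify.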

The code is written in C with explicit vectorization. In experiments,
this kernel improves performance on z14 and z15 by around 2x over the
current implementation in assembly. The flexibility of the C code paves
the way for adjustments in subsequent commits.

Tested via make -C test / ctest / utest and by a couple of additional
unit tests that exercise blocking (e.g., partial register blocks with
fewer than UNROLL_M rows and/or fewer than UNROLL_N columns).
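
A test that exercises partial register blocks could look roughly like the
sketch below. It is not taken from the OpenBLAS test suite: the block sizes
(UNROLL_M = 8, UNROLL_N = 4), the helper names, and the scalar blocked loop
standing in for the kernel under test are all assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed block sizes, for illustration only. */
enum { UNROLL_M = 8, UNROLL_N = 4 };

/* Naive column-major reference: C += alpha * A * B. */
static void ref_sgemm(int m, int n, int k, float alpha,
                      const float *a, const float *b, float *c)
{
    for (int j = 0; j < n; j++)
        for (int i = 0; i < m; i++) {
            float sum = 0.0f;
            for (int p = 0; p < k; p++)
                sum += a[i + p * m] * b[p + j * k];
            c[i + j * m] += alpha * sum;
        }
}

/* Stand-in for the kernel under test: walks C in UNROLL_M x UNROLL_N
 * blocks and clips the block at the matrix edge (partial blocks). */
static void blocked_sgemm(int m, int n, int k, float alpha,
                          const float *a, const float *b, float *c)
{
    for (int j0 = 0; j0 < n; j0 += UNROLL_N) {
        int nb = (n - j0 < UNROLL_N) ? n - j0 : UNROLL_N;
        for (int i0 = 0; i0 < m; i0 += UNROLL_M) {
            int mb = (m - i0 < UNROLL_M) ? m - i0 : UNROLL_M;
            float acc[UNROLL_M][UNROLL_N] = {{0}};
            for (int p = 0; p < k; p++)
                for (int j = 0; j < nb; j++)
                    for (int i = 0; i < mb; i++)
                        acc[i][j] += a[(i0 + i) + p * m] * b[p + (j0 + j) * k];
            for (int j = 0; j < nb; j++)
                for (int i = 0; i < mb; i++)
                    c[(i0 + i) + (j0 + j) * m] += alpha * acc[i][j];
        }
    }
}

/* Returns 1 if both implementations agree for an m x n x k problem.
 * Inputs are small integers, so float arithmetic is exact here and the
 * results can be compared with ==. Requires m, n, k >= 1. */
static int check_shape(int m, int n, int k)
{
    assert(m >= 1 && n >= 1 && k >= 1);
    float *a = malloc(sizeof *a * m * k);
    float *b = malloc(sizeof *b * k * n);
    float *c_ref = calloc((size_t)m * n, sizeof *c_ref);
    float *c_blk = calloc((size_t)m * n, sizeof *c_blk);
    assert(a && b && c_ref && c_blk);

    for (int i = 0; i < m * k; i++) a[i] = (float)(i % 7 - 3);
    for (int i = 0; i < k * n; i++) b[i] = (float)(i % 5 - 2);

    ref_sgemm(m, n, k, 2.0f, a, b, c_ref);
    blocked_sgemm(m, n, k, 2.0f, a, b, c_blk);

    int ok = 1;
    for (int i = 0; i < m * n; i++)
        if (c_ref[i] != c_blk[i]) ok = 0;

    free(a); free(b); free(c_ref); free(c_blk);
    return ok;
}
```

Calling check_shape with m and n just below, at, and just above multiples of
UNROLL_M and UNROLL_N (for example m = UNROLL_M - 1 and n = UNROLL_N + 1)
covers the edge cases the commit message describes.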

Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-05-12 15:59:51 +02:00
KERNEL Init IBM z system (s390x) porting. 2016-04-15 18:02:24 -04:00
KERNEL.Z13 add in runtime cpu detection for zarch (#2349) 2019-12-31 18:03:27 +01:00
KERNEL.Z14 s390x: Add vectorized sgemm kernel for Z14 and newer 2020-05-12 15:59:51 +02:00
KERNEL.ZARCH_GENERIC add in runtime cpu detection for zarch (#2349) 2019-12-31 18:03:27 +01:00
Makefile Init IBM z system (s390x) porting. 2016-04-15 18:02:24 -04:00
camax.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
camin.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
casum.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
caxpy.c [ZARCH] Fix caxpy 2019-02-13 12:54:35 +02:00
ccopy.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
cdot.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
cgemv_n_4.c [ZARCH] Modify constraints 2019-02-13 21:06:25 +02:00
cgemv_t_4.c [ZARCH] Fix cgemv_t_4 2019-02-12 13:12:28 +02:00
ckernelMacrosV.S strmm and ctrmm 2017-03-13 01:23:16 +04:00
crot.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
cscal.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
csum.c Change bad usage of "asum" to "sum" in ZARCH versions of ?sum 2019-11-21 13:49:13 +01:00
cswap.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
ctrmm4x4V.S strmm and ctrmm 2017-03-13 01:23:16 +04:00
damax.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
damax_z13.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
damin.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
damin_z13.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
dasum.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
daxpy.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
dcopy.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
ddot.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
dgemv_n_4.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
dgemv_t_4.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
dmax.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
dmax_z13.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
dmin.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
dmin_z13.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
drot.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
dscal.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
dsdot.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
dsum.c Change bad usage of "asum" to "sum" in ZARCH versions of ?sum 2019-11-21 13:49:13 +01:00
dswap.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
gemm8x4V.S changed to conventional register save area 2017-03-01 03:13:21 +04:00
gemm_vec.c s390x: Add vectorized sgemm kernel for Z14 and newer 2020-05-12 15:59:51 +02:00
icamax.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
icamin.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
idamax.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
idamin.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
idmax.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
idmin.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
isamax.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
isamin.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
ismax.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
ismin.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
izamax.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
izamin.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
kernelMacros.S ztrmm kernel. 2017-02-26 06:14:12 +04:00
samax.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
samin.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
sasum.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
saxpy.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
scopy.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
sdot.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
sgemv_n_4.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
sgemv_t_4.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
skernelMacros.S strmm and ctrmm 2017-03-13 01:23:16 +04:00
smax.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
smin.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
srot.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
sscal.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
ssum.c Change bad usage of "asum" to "sum" in ZARCH versions of ?sum 2019-11-21 13:49:13 +01:00
sswap.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
strmm8x4V.S strmm and ctrmm 2017-03-13 01:23:16 +04:00
trmm8x4V.S changed to conventional register save area 2017-03-01 03:13:21 +04:00
zamax.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
zamax_z13.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
zamin.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
zamin_z13.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
zasum.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
zaxpy.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
zcopy.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
zdot.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
zgemv_n_4.c [ZARCH] Modify constraints 2019-02-13 21:06:25 +02:00
zgemv_t_4.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
zkernelMacrosV.S ztrmm kernel. 2017-02-26 06:14:12 +04:00
zrot.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
zscal.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
zsum.c Change bad usage of "asum" to "sum" in ZARCH versions of ?sum 2019-11-21 13:49:13 +01:00
zswap.c [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
ztrmm4x4V.S changed to conventional register save area 2017-03-01 03:13:21 +04:00