OpenBLAS/cmake
Rajalakshmi Srinivasaraghavan 7eb55504b1 RFC : Add half precision gemm for bfloat16 in OpenBLAS
This patch adds a matrix multiplication kernel for the bfloat16 data type.
On architectures that don't support bfloat16, the type is defined as unsigned
short (2 bytes).  Default unroll sizes can be changed per architecture, as is
done for SGEMM; for now, 8 and 4 are used for M and N.  The ncopy/tcopy size
can likewise be changed per architecture; for now, size 2 is used.
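
To illustrate the "unsigned short (2 bytes)" fallback described above, here is a minimal C sketch of storing a bfloat16 value in a 16-bit integer. The names `bf16_t`, `float_to_bf16` and `bf16_to_float` are hypothetical and are not taken from the patch. The conversion uses simple truncation of an IEEE 754 single-precision value, which is an assumption: the patch may round differently.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical fallback storage type: 16 bits holding a bfloat16 pattern. */
typedef unsigned short bf16_t;

/* Convert float to bfloat16 by truncation (assumption: no rounding).
 * bfloat16 keeps the upper 16 bits of a binary32 value:
 * 1 sign bit, 8 exponent bits, 7 mantissa bits. */
static bf16_t float_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* reinterpret without aliasing issues */
    return (bf16_t)(bits >> 16);
}

/* Widen bfloat16 back to float by zero-filling the low 16 mantissa bits. */
static float bf16_to_float(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Values whose lower 16 mantissa bits are zero (for example 1.0f or -2.5f) survive the round trip exactly; other values lose precision.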

Added shgemm in kernel/power/KERNEL.POWER9 and tested on powerpc64le and
powerpc64.  For reference, added a small test, compare_sgemm_shgemm.c, that
compares sgemm and shgemm output.
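
The idea of comparing single-precision and bfloat16 results can be sketched as follows. This is not the contents of compare_sgemm_shgemm.c and does not call the OpenBLAS API. It uses naive reference loops with hypothetical names (`ref_sgemm`, `ref_bf16_gemm`, `max_abs_diff`) and assumes row-major storage, float accumulation, and truncating conversion.

```c
#include <stdint.h>
#include <string.h>

typedef unsigned short bf16_t;  /* hypothetical 2-byte bfloat16 storage */

/* Truncating float -> bfloat16 conversion (assumption: no rounding). */
static bf16_t to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (bf16_t)(bits >> 16);
}

static float from_bf16(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Naive reference: C (m x n) = A (m x k) * B (k x n), row-major floats. */
static void ref_sgemm(int m, int n, int k,
                      const float *a, const float *b, float *c)
{
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++) {
            float acc = 0.0f;
            for (int p = 0; p < k; p++)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = acc;
        }
}

/* Same product with bfloat16 inputs, accumulated and stored as float. */
static void ref_bf16_gemm(int m, int n, int k,
                          const bf16_t *a, const bf16_t *b, float *c)
{
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++) {
            float acc = 0.0f;
            for (int p = 0; p < k; p++)
                acc += from_bf16(a[i * k + p]) * from_bf16(b[p * n + j]);
            c[i * n + j] = acc;
        }
}

/* Largest element-wise absolute difference between two result matrices. */
static float max_abs_diff(const float *x, const float *y, int len)
{
    float worst = 0.0f;
    for (int i = 0; i < len; i++) {
        float d = x[i] - y[i];
        if (d < 0.0f)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

For inputs that are exactly representable in bfloat16 the two results should agree exactly. For general inputs, a comparison would need a tolerance reflecting the 7-bit mantissa.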

This patch does not cover OpenBLAS test, benchmark and lapack tests for shgemm.
Complex type implementation can be discussed and added once this is approved.
2020-04-14 14:55:08 -05:00
OpenBLASConfig.cmake.in Add template for OpenBLASConfig.cmake 2018-06-10 09:25:46 +02:00
arch.cmake Add Neoverse-N1 core 2020-02-29 03:22:04 +00:00
cc.cmake fixes #2480 2020-03-02 17:22:28 +01:00
export.cmake Ninja complains that file openblas.def does not exist 2017-07-29 21:00:32 +05:30
f_check.cmake Allow using compilers other than gfortran in conjunction with 2017-11-06 14:39:12 -06:00
fc.cmake ifort and pgfort need "recursive" for compiling LAPACK as well 2020-04-01 15:38:07 +02:00
kernel.cmake Misc. typo fixes 2019-04-29 17:03:56 -04:00
lapack.cmake [WIP] Update LAPACK to 3.9.0 (#2353) 2020-01-01 13:18:53 +01:00
lapacke.cmake [WIP] Update LAPACK to 3.9.0 (#2353) 2020-01-01 13:18:53 +01:00
openblas.pc.in Allow to install the 'interfare64' version concurrently with the regular version 2018-09-15 21:00:03 -07:00
os.cmake Add -lm and disable EXPRECISION support on *BSD 2019-04-02 09:38:18 +02:00
prebuild.cmake RFC : Add half precision gemm for bfloat16 in OpenBLAS 2020-04-14 14:55:08 -05:00
system.cmake RFC : Add half precision gemm for bfloat16 in OpenBLAS 2020-04-14 14:55:08 -05:00
system_check.cmake Use proper extension on the avx512 testcase filename 2020-03-20 23:05:53 +01:00
utils.cmake Misc. typo fixes 2019-04-29 17:03:56 -04:00