OpenBLAS/lapack
Rajalakshmi Srinivasaraghavan 7eb55504b1 RFC : Add half precision gemm for bfloat16 in OpenBLAS
This patch adds a matrix multiplication kernel (shgemm) for the bfloat16 data type.
On architectures that do not support bfloat16, the type is defined as unsigned short
(2 bytes).  Default unroll sizes can be changed per architecture, as is done for
SGEMM; for now, 8 and 4 are used for M and N.  The ncopy/tcopy size can likewise
be changed per architecture; for now, size 2 is used.
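
As an illustration of the unsigned short fallback described above, the following is a minimal sketch of how a 2-byte bfloat16 value relates to a 32-bit float. The names `bf16_t`, `float_to_bf16`, and `bf16_to_float` are hypothetical and are not taken from the patch. The conversion shown simply truncates the low 16 bits of the float; this is an assumption, and a real kernel may round instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type name: a 2-byte container for bfloat16 on
 * architectures without a native bfloat16 type. */
typedef unsigned short bf16_t;

/* Convert float to bfloat16 by keeping the upper 16 bits
 * (sign, 8 exponent bits, top 7 mantissa bits).
 * Assumption: plain truncation, no rounding. */
static bf16_t float_to_bf16(float f)
{
    uint32_t bits;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &f, sizeof bits);
    return (bf16_t)(bits >> 16);
}

/* Convert bfloat16 back to float by zero-filling the low 16 bits. */
static float bf16_to_float(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Values with at most 8 significant bits, such as 1.0f or -2.5f, survive the round trip unchanged; other values lose their low mantissa bits.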

Added shgemm to kernel/power/KERNEL.POWER9 and tested on powerpc64le and
powerpc64.  For reference, added a small test, compare_sgemm_shgemm.c, that
compares sgemm and shgemm output.
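
The contents of compare_sgemm_shgemm.c are not shown here, so the following is only a sketch of what such a comparison could look like. It does not call the OpenBLAS routines; it uses naive reference loops instead. All names (`ref_sgemm`, `ref_shgemm`, `max_abs_diff`, `bf16_t`) are hypothetical. It assumes row-major storage, bfloat16 inputs with float accumulation and float output, and truncating float-to-bfloat16 conversion.

```c
#include <stdint.h>
#include <string.h>

typedef unsigned short bf16_t; /* hypothetical 2-byte bfloat16 container */

/* Truncating float -> bfloat16 conversion (assumption: no rounding). */
static bf16_t float_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (bf16_t)(bits >> 16);
}

static float bf16_to_float(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reference C = A * B in single precision; A is m x k, B is k x n. */
static void ref_sgemm(int m, int n, int k,
                      const float *a, const float *b, float *c)
{
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++) {
            float acc = 0.0f;
            for (int p = 0; p < k; p++)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = acc;
        }
}

/* Reference C = A * B with bfloat16 inputs, accumulating in float. */
static void ref_shgemm(int m, int n, int k,
                       const bf16_t *a, const bf16_t *b, float *c)
{
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++) {
            float acc = 0.0f;
            for (int p = 0; p < k; p++)
                acc += bf16_to_float(a[i * k + p]) *
                       bf16_to_float(b[p * n + j]);
            c[i * n + j] = acc;
        }
}

/* Largest element-wise absolute difference between two result matrices. */
static float max_abs_diff(const float *x, const float *y, int len)
{
    float worst = 0.0f;
    for (int i = 0; i < len; i++) {
        float d = x[i] - y[i];
        if (d < 0.0f)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

When every input is exactly representable in bfloat16, the two results should match; otherwise the difference reflects the precision lost in the inputs, so a comparison of this kind would normally use a tolerance rather than exact equality.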

This patch does not cover the OpenBLAS tests, benchmarks, or LAPACK tests for
shgemm.  A complex-type implementation can be discussed and added once this is
approved.
2020-04-14 14:55:08 -05:00
getf2 Refs #723. Avoid out of boundary for getf2. 2016-01-26 09:14:57 -06:00
getrf RFC : Add half precision gemm for bfloat16 in OpenBLAS 2020-04-14 14:55:08 -05:00
getrs Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
laswp add missing brackets to silence indentation warnings gcc721 2018-01-19 23:11:12 +01:00
lauu2 Fix lapack complex implementation of lauu2 and potf2 for Android (use FLOAT instead of FLOAT[2] as imaginary part is not used). 2016-02-04 16:59:56 -05:00
lauum prepared lapack/lauum for UNROLL values, that are not a power of two 2017-01-11 07:29:17 +01:00
potf2 Fix lapack complex implementation of lauu2 and potf2 for Android (use FLOAT instead of FLOAT[2] as imaginary part is not used). 2016-02-04 16:59:56 -05:00
potrf prepared lapack/potrf functions for UNROLL values, that are not a power of two 2017-01-10 10:50:28 +01:00
trti2 LAPACK helpers in C that need care too 2018-01-02 14:38:50 +01:00
trtri address minor warnings from gcc7 2019-09-07 10:21:08 +03:00
trtrs fix Makefile 2019-09-10 17:11:01 -04:00
CMakeLists.txt Correct generation of GETRF files by the CMAKE build 2020-02-15 19:29:14 +01:00
Makefile add missing objects 2019-09-08 11:14:49 -04:00