OpenBLAS/test
Rajalakshmi Srinivasaraghavan 7eb55504b1 RFC : Add half precision gemm for bfloat16 in OpenBLAS
This patch adds a matrix multiplication (GEMM) kernel for the bfloat16 data type.
For architectures that don't support bfloat16, the type is defined as unsigned
short (2 bytes).  Default unroll sizes can be changed per architecture, as is done
for SGEMM; for now, 8 and 4 are used for M and N.  The ncopy/tcopy size can also
be changed per architecture; for now, size 2 is used.
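
To illustrate the 2-byte fallback representation described above, here is a minimal sketch of how a bfloat16 value can be stored in a 16-bit unsigned integer. The type name `bf16_t` and both helper functions are hypothetical and are not taken from the patch; the conversion shown simply truncates the low 16 bits of an IEEE 754 single-precision value, which is one common choice and may differ from what the actual kernels do.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical name: a bfloat16 stored as a plain 16-bit unsigned integer. */
typedef uint16_t bf16_t;

/* Keep the sign, the 8 exponent bits and the top 7 mantissa bits of a float.
 * Assumption: truncation (no rounding) is good enough for this illustration. */
static bf16_t float_to_bf16_trunc(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* reinterpret without aliasing issues */
    return (bf16_t)(bits >> 16);
}

/* Widen a bfloat16 back to float by zero-filling the low 16 mantissa bits. */
static float bf16_to_float(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Because bfloat16 keeps the full float exponent, values such as 1.0f or -2.5f that need at most 7 explicit mantissa bits survive the round trip exactly; other values lose their low mantissa bits.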

Added shgemm in kernel/power/KERNEL.POWER9 and tested on powerpc64le and
powerpc64.  For reference, added a small test, compare_sgemm_shgemm.c, to compare
sgemm and shgemm output.
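
The kind of comparison mentioned above can be sketched without calling OpenBLAS at all. The code below is not the contents of compare_sgemm_shgemm.c, which is not shown here; it is a stand-alone illustration under stated assumptions: matrices are row-major, the bfloat16 path accumulates in float, and every name (`bf16_t`, `to_bf16`, `ref_sgemm`, `ref_bf16_gemm`, `max_abs_diff`) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t bf16_t;  /* hypothetical 2-byte bfloat16 storage type */

/* Truncating float -> bfloat16 conversion (assumption: no rounding). */
static bf16_t to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (bf16_t)(bits >> 16);
}

static float from_bf16(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Naive reference C = A * B in single precision.
 * A is m x k, B is k x n, C is m x n, all row-major. */
static void ref_sgemm(int m, int n, int k,
                      const float *a, const float *b, float *c)
{
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++) {
            float acc = 0.0f;
            for (int p = 0; p < k; p++)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = acc;
        }
}

/* Same product with bfloat16 inputs, widened to float before multiplying
 * and accumulated in float. */
static void ref_bf16_gemm(int m, int n, int k,
                          const bf16_t *a, const bf16_t *b, float *c)
{
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++) {
            float acc = 0.0f;
            for (int p = 0; p < k; p++)
                acc += from_bf16(a[i * k + p]) * from_bf16(b[p * n + j]);
            c[i * n + j] = acc;
        }
}

/* Largest element-wise absolute difference between two result matrices. */
static float max_abs_diff(const float *x, const float *y, size_t len)
{
    float worst = 0.0f;
    for (size_t i = 0; i < len; i++) {
        float d = x[i] - y[i];
        if (d < 0.0f)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

A caller would fill float matrices, convert them element by element with `to_bf16`, run both routines, and check `max_abs_diff` against a tolerance. For inputs that are exactly representable in bfloat16 the two results match exactly; for general inputs the difference reflects the reduced input precision, so the tolerance has to be chosen accordingly.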

This patch does not cover the OpenBLAS test, benchmark, and LAPACK tests for
shgemm.  A complex type implementation can be discussed and added once this is
approved.
2020-04-14 14:55:08 -05:00
CMakeLists.txt Remove _static usages for tests 2017-08-20 00:13:46 +10:00
LICENSE Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
Makefile Do not attempt to run test without fortran 2020-03-13 20:11:19 +01:00
cblat1.f Misc. typo fixes 2019-04-29 17:03:56 -04:00
cblat2.dat Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
cblat2.f Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
cblat3.dat Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
cblat3.f Fixed #46. Initialize variables in cblat3.f and zblat3.f. 2011-09-05 16:30:55 +00:00
cblat3_3m.dat disabled SYMM3M and HEMM3M functions because segment violations 2014-09-20 15:27:40 +02:00
cblat3_3m.f disabled SYMM3M and HEMM3M functions because segment violations 2014-09-20 15:27:40 +02:00
compare_sgemm_shgemm.c RFC : Add half precision gemm for bfloat16 in OpenBLAS 2020-04-14 14:55:08 -05:00
dblat1.f Misc. typo fixes 2019-04-29 17:03:56 -04:00
dblat2.dat Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
dblat2.f Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
dblat3.dat Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
dblat3.f Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
get_threading_model.c Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
sblat1.f Misc. typo fixes 2019-04-29 17:03:56 -04:00
sblat2.dat Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
sblat2.f Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
sblat3.dat Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
sblat3.f Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
zblat1.f Misc. typo fixes 2019-04-29 17:03:56 -04:00
zblat2.dat Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
zblat2.f Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
zblat3.dat Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
zblat3.f Fixed #46. Initialize variables in cblat3.f and zblat3.f. 2011-09-05 16:30:55 +00:00
zblat3_3m.dat bugfix for GEMM3M functions 2014-09-21 11:41:43 +02:00
zblat3_3m.f added GEMM3M tests 2014-09-21 10:55:08 +02:00