Commit Graph

5 Commits

Author SHA1 Message Date
Martin Kroeker 13c28889a2
Update "cosmetic fixes for non-C99 compilers" 2020-06-06 15:22:27 +02:00
Martin Kroeker 28915eed72
Cosmetic fixes for non-C99 compilers 2020-06-05 10:05:34 +02:00
Rajalakshmi Srinivasaraghavan 8efba9b7c0 Improve shgemm test
This patch adds another check to test shgemm results.
2020-05-11 17:15:10 -05:00
Rajalakshmi Srinivasaraghavan 564b0d39ef Add test for shgemm
This patch has Makefile changes to add test for shgemm which
compares sgemm and shgemm result.
2020-04-29 13:40:34 -05:00
Rajalakshmi Srinivasaraghavan 7eb55504b1 RFC : Add half precision gemm for bfloat16 in OpenBLAS
This patch adds support for bfloat16 data type matrix multiplication kernel.
For architectures that don't support bfloat16, it is defined as unsigned short
(2 bytes).  Default unroll sizes can be changed as per architecture as done for
SGEMM and for now 8 and 4 are used for M and N.  Size of ncopy/tcopy can be
changed as per architecture requirement and for now, size 2 is used.

Added shgemm in kernel/power/KERNEL.POWER9 and tested in powerpc64le and
powerpc64.  For reference, added a small test compare_sgemm_shgemm.c to compare
sgemm and shgemm output.

This patch does not cover OpenBLAS test, benchmark and lapack tests for shgemm.
Complex type implementation can be discussed and added once this is approved.
2020-04-14 14:55:08 -05:00