17 Commits

Author SHA1 Message Date
Wangyang Guo
3dc6052c7e initial support for Sapphire Rapids platform 2021-10-12 01:30:40 -07:00
Martin Kroeker
5d26223f4a remove extra/intermediate size step of min_jj from PR747 2020-12-08 20:59:56 +01:00
Chen, Guobing
e740c4873d Enable COOPERLAKE build target
Enable new build target platform -- COOPERLAKE. This target platform
supports all the SKYLAKEX supported ISAs + avx512bf16. So all the
SKYLAKEX specific kernels/drivers and related code are now extended
to be also active on COOPERLAKE. Besides, new BF16 related kernels
are active under this target.
2020-08-13 06:18:00 +08:00
Rajalakshmi Srinivasaraghavan
7eb55504b1 RFC : Add half precision gemm for bfloat16 in OpenBLAS
This patch adds support for bfloat16 data type matrix multiplication kernel.
For architectures that don't support bfloat16, it is defined as unsigned short
(2 bytes).  Default unroll sizes can be changed as per architecture as done for
SGEMM and for now 8 and 4 are used for M and N.  Size of ncopy/tcopy can be
changed as per architecture requirement and for now, size 2 is used.

Added shgemm in kernel/power/KERNEL.POWER9 and tested in powerpc64le and
powerpc64.  For reference, added a small test compare_sgemm_shgemm.c to compare
sgemm and shgemm output.

This patch does not cover OpenBLAS test, benchmark and lapack tests for shgemm.
Complex type implementation can be discussed and added once this is approved.
2020-04-14 14:55:08 -05:00
wjc404
1c3e20ce48 Update level3.c 2020-02-04 20:30:23 +08:00
Andrew
11a627c54e remove surplus parentheses to silence clang5 2018-01-01 20:56:26 +01:00
Andrew
bfc2a88594 remove unused buffer 2017-12-22 00:55:40 +01:00
Werner Saar
a2672d5589 prepared driver/level3 functions for UNROLL values, that are not a power of two 2017-01-09 10:38:15 +01:00
Werner Saar
b07d733a71 added updates for syrk and syr2k 2016-01-21 13:16:44 +01:00
wernsaar
1d33547222 optimized zgemm kernel for haswell 2014-07-27 11:51:42 +02:00
Timothy Gu
6c2ead30f0 Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
wernsaar
c947ab85dc changed level3.c 2013-12-01 13:46:30 +01:00
wernsaar
2840d56aeb added dgemm_kernel for Piledriver 2013-10-19 09:47:15 +02:00
wernsaar
6f008abcef replaced defined(DOUBLE) by !defined(XDOUBLE) 2013-07-09 18:17:50 +02:00
wernsaar
25491e42f9 New dgemm kernel for BULLDOZER: dgemm_kernel_8x2_bulldozer.S 2013-06-08 09:40:17 +02:00
Xianyi Zhang
1b97ec1a7c Added DEBUG option in Makefile.rule. Fixed DEBUG typo mistakes. 2011-02-26 11:19:54 +08:00
Xianyi Zhang
342bbc3871 Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
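
Commit 7eb55504b1 above notes that on architectures without native bfloat16 support, the type is defined as unsigned short (2 bytes). A minimal sketch of what that storage choice implies could look like the following. The names `bf16_t`, `float_to_bf16`, `bf16_to_float` and `ref_bf16_gemm` are hypothetical and are not OpenBLAS identifiers. The truncating conversion, the column-major layout and the float output are assumptions made for illustration, not details taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical storage type: a bfloat16 value held in 2 bytes,
 * mirroring the "unsigned short" fallback named in the commit message. */
typedef uint16_t bf16_t;

/* Assumption: convert by keeping the upper 16 bits of the IEEE-754
 * single-precision pattern (plain truncation, no rounding). */
static bf16_t float_to_bf16(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (bf16_t)(bits >> 16);
}

/* Widen back to float by placing the 16 stored bits in the upper half. */
static float bf16_to_float(bf16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Naive reference product C = A * B, column-major, accumulating in float.
 * A is m x k, B is k x n, C is m x n. Not a tuned kernel: no unrolling,
 * no packing (ncopy/tcopy), no blocking. */
static void ref_bf16_gemm(int m, int n, int k,
                          const bf16_t *a, const bf16_t *b, float *c) {
    assert(a && b && c);
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            float acc = 0.0f;
            for (int p = 0; p < k; p++) {
                acc += bf16_to_float(a[i + p * m]) * bf16_to_float(b[p + j * k]);
            }
            c[i + j * m] = acc;
        }
    }
}
```

A reference loop like this is only useful as a correctness baseline, in the spirit of the small sgemm/shgemm comparison test the commit mentions. The unroll sizes of 8 and 4 for M and N apply to the optimized kernels, not to this sketch.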