Commit Graph

12 Commits

Author SHA1 Message Date
Xianyi Zhang 255b6dd0fa Merge branch 'develop' into small_matrices 2020-08-28 21:38:58 +08:00
Xianyi Zhang 712ca43069 Change a1b0 gemm to b0 gemm. 2020-08-28 07:55:27 +08:00
Martin Kroeker 75eeb265d7 [WIP] Refactor the driver code for direct SGEMM (#2782)
Move "direct SGEMM" functionality out of the SkylakeX SGEMM kernel and make it available
(on x86_64 targets only for now) in DYNAMIC_ARCH builds
* Add sgemm_direct targets in the kernel Makefile.L3 and CMakeLists.txt
* Add direct_sgemm functions to the gotoblas struct in common_param.h
* Move sgemm_direct_performant helper to separate file
* Update gemm.c to use macros for sgemm_direct to support dynamic_arch naming via common_s.h
* (Conditionally) add sgemm_direct functions in setparam-ref.c
2020-08-19 14:51:09 +02:00
Xianyi Zhang 43bef4aaac Add alpha=1.0 beta=0.0 for small gemm. 2020-04-28 22:35:36 +08:00
Xianyi Zhang aae6af94bb Add small matrix optimization kernel interface.
make SMALL_MATRIX_OPT=1
2020-04-28 19:02:41 +08:00
Martin Kroeker 5c42287c4f Add declarations for ?sum and cblas_?sum 2019-03-30 21:58:03 +01:00
Martin Koehler 711ca33bc6 Improved Ximatcopy when lda==ldb.
The Ximatcopy functions create a copy of the input matrix
although they seem to work in place. The new routines
XIMATCOPY_K_YY perform the operations in place if the leading
dimension does not change.
2015-09-07 14:36:16 +02:00
Martin Koehler 39cc6b21d3 Add ATLAS-style ?geadd function 2015-02-16 13:46:20 +01:00
wernsaar 7bfb3011e8 Ref #51: added blas extension somatcopy 2014-06-09 20:21:13 +02:00
wernsaar faf3ac0aad Ref #285: added axpby kernels 2014-06-08 11:54:24 +02:00
wernsaar 9db0fb8b02 bugfix for sdsdot 2014-02-28 14:59:36 +01:00
Xianyi Zhang 342bbc3871 Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00