327 Commits

Author SHA1 Message Date
Martin Kroeker 7e93ab1b9e
Fix info code returned for invalid ldb 2023-07-09 17:00:25 +02:00
Martin Kroeker bb862b82d5
Fix integer overflow in multithreading threshold calculation for SYMM/SYRK (#4116)
* Fix potential integer overflow
2023-06-29 23:59:25 +02:00
Martin Kroeker c3a2d407a0
Merge pull request #4048 from imzhuhl/spr_sbgemm_fix
Sapphire Rapids sbgemm fix
2023-06-17 20:47:09 +02:00
Angelika Schwarz 899c3a6f6a Improve input argument checks of gemmt
* Fix return value for invalid info
* Add missing checks for ldA, ldB
* Use reference-LAPACK like checks (ie ld=0,nrows=0 is invalid)
2023-05-26 08:51:27 +02:00
Honglin Zhu 71e4125795 Fix syscall error on non-x86 platform 2023-05-22 21:59:59 +08:00
Honglin Zhu 90f041e348 Invoke the syscall to allow the use of amx tiles 2023-05-19 10:48:18 +08:00
Ken Ho df1b1f6a91 More detailed error message in [z]imatcopy.c. 2023-05-12 09:41:52 -07:00
Ken Ho 7a86c437b5 Change some "if" statements to "else if" following suggestion by @mmuetzel. 2023-05-10 09:13:04 -07:00
Ken Ho 33ab415f68 Bug fix and improvements for [z]imatcopy interface. 2023-05-08 14:43:56 -07:00
Martin Kroeker 1f6f7328eb
remove redundant declaration 2023-04-27 09:14:12 +02:00
Martin Kroeker 7152d6b06d
fix cblas_gemmt 2023-04-27 08:36:20 +02:00
Martin Kroeker 38d7a7b562
Fix ?GEMMT 2023-04-16 00:07:58 +02:00
Martin Kroeker 912d713b52
redo lost edit 2023-03-28 18:31:04 +02:00
Martin Kroeker dc15c18efc
Fix build failures seen with the NO_LAPACK option - cspr/csymv/csyr belong on the LAPACK list 2023-03-28 16:33:09 +02:00
H. Vetinari f2659516ef remove unqualified ifdef's for NO_LAPACK(E) 2023-03-28 19:01:31 +11:00
Martin Kroeker f2d6b1c70e
Add multithreading threshold 2023-03-26 00:25:28 +01:00
Martin Kroeker a495ffc554
Rework multithreading threshold 2023-03-26 00:23:57 +01:00
Martin Kroeker 244147495a
Do not use multithreading for small workloads 2023-03-23 23:13:02 +01:00
Martin Kroeker ab32f832a8
fix stray blank on continuation line 2023-03-21 08:29:05 +01:00
Martin Kroeker e359787e28
restore C/Z SPMV, SPR, SYR,SYMV 2023-03-21 07:43:03 +01:00
Martin Kroeker f10c266b4d
Fix stride in shortcut path for small N 2022-12-08 21:02:01 +01:00
Martin Kroeker 8c99d5d1b6
Merge pull request #3796 from martin-frbg/gemmt
Add a trivial GEMMT implementation based on a looped GEMV
2022-11-12 19:06:05 +01:00
Martin Kroeker e6204d254f
Update CMakeLists.txt 2022-11-08 16:21:11 +01:00
Martin Kroeker 1b77764182
Conditionally leave out bits of LAPACK to be overridden by ReLAPACK 2022-11-08 12:02:59 +01:00
Martin Kroeker c970717157
fix missing t in xgemmt rule
Co-authored-by: Alexis <35051714+amontoison@users.noreply.github.com>
2022-11-01 13:51:20 +01:00
Martin Kroeker e7fd8d21a6
Add GEMMT based on looped GEMV 2022-10-26 15:33:58 +02:00
Martin Kroeker a3e02742f2
Add USE_PERL fallback option for create script used with FUNCTION_PROFILE 2022-05-22 18:32:19 +02:00
Martin Kroeker f1c570a5f1
Add back original PERL-based script under new name 2022-05-22 18:29:01 +02:00
Owen Rafferty 42c7a27e6b
rewrite perl scripts in universal shell 2022-05-18 19:00:15 -05:00
Martin Kroeker 7656aba00e
Merge pull request #3493 from martin-frbg/casts+cleanup
WIP casts and cleanups
2022-02-06 23:55:06 +01:00
Martin Kroeker d2b5fbf80f
Exclude some complex (LAPACK) functions when NO_LAPACK is set 2022-01-27 22:02:08 +01:00
Martin Kroeker 64365c919e
fix function typecasts 2021-12-21 18:47:35 +01:00
gxw 25f99fa9f8 Add cblas_{c/z}srot cblas_{c/z}rotg support 2021-11-01 20:19:13 +08:00
Martin Kroeker 4b3769823a
Revert #3252 2021-10-24 23:57:06 +02:00
Martin Kroeker 2845f54eb8
Remove dangerous optimization from previous #3252 - buffer is never unused here 2021-10-20 10:50:02 +02:00
Martin Kroeker c35739db5e
Add separate entries for BFLOAT16 functions and fix missing cblas_xerbla 2021-09-14 16:15:57 +02:00
Martin Kroeker 1085775bc6
really remove the unused variable 2021-09-11 15:05:55 +02:00
Martin Kroeker 20581bf303
Remove unused variable 2021-09-11 14:36:27 +02:00
Wangyang Guo 4289cf048d sbgemm: avoid falling into SGEMM_KERNEL_DIRECT 2021-09-07 21:30:46 +08:00
Wangyang Guo 2e44ca0136 sbgemm: add missing cblas_sbgemm definition 2021-08-30 17:40:30 +08:00
Wangyang Guo 1d83ca4bca Small Matrix: support BFLOAT16 data type 2021-08-30 17:40:20 +08:00
Wangyang Guo c17d6dacb2 Small Matrix: skip compile in unimplemented data type 2021-08-05 05:46:13 +00:00
Wangyang Guo aa50185647 Small Matrix: better handle with GEMM3M marco 2021-08-05 02:45:53 +00:00
Wangyang Guo 478d1086c1 Small Matrix: support DYNAMIC_ARCH build 2021-08-04 03:12:41 +00:00
Wangyang Guo 5dc7c3c8e5 Small Matrix: add GEMM_SMALL_MATRIX_PERMIT to tune small matrics case 2021-08-02 07:06:54 +00:00
Xianyi Zhang 6022e5629c Refs #2587 fix small matrix c/zgemm bug. 2021-08-02 07:06:54 +00:00
Xianyi Zhang 57ed58cefe Refs #2587 Add small matrix optimization reference kernel for c/zgemm. 2021-08-02 07:06:54 +00:00
Xianyi Zhang 17d32a4a82 Change a1b0 gemm to b0 gemm. 2021-08-02 07:06:54 +00:00
Xianyi Zhang 4271cfcc6f Fix gemm interface bug for small matrix. 2021-08-02 07:06:51 +00:00
Xianyi Zhang be3349405d Add alpha=1.0 beta=0.0 for small gemm. 2021-08-02 07:01:47 +00:00