Martin Kroeker
|
7e93ab1b9e
|
Fix info code returned for invalid ldb
|
2023-07-09 17:00:25 +02:00 |
Martin Kroeker
|
bb862b82d5
|
Fix integer overflow in multithreading threshold calculation for SYMM/SYRK (#4116)
* Fix potential integer overflow
|
2023-06-29 23:59:25 +02:00 |
Martin Kroeker
|
c3a2d407a0
|
Merge pull request #4048 from imzhuhl/spr_sbgemm_fix
Sapphire Rapids sbgemm fix
|
2023-06-17 20:47:09 +02:00 |
Angelika Schwarz
|
899c3a6f6a
|
Improve input argument checks of gemmt
* Fix return value for invalid info
* Add missing checks for ldA, ldB
* Use reference-LAPACK like checks (ie ld=0,nrows=0 is invalid)
|
2023-05-26 08:51:27 +02:00 |
Honglin Zhu
|
71e4125795
|
Fix syscall error on non-x86 platform
|
2023-05-22 21:59:59 +08:00 |
Honglin Zhu
|
90f041e348
|
Invoke the syscall to allow the use of amx tiles
|
2023-05-19 10:48:18 +08:00 |
Ken Ho
|
df1b1f6a91
|
More detailed error message in [z]imatcopy.c.
|
2023-05-12 09:41:52 -07:00 |
Ken Ho
|
7a86c437b5
|
Change some "if" statements to "else if" following suggestion by @mmuetzel.
|
2023-05-10 09:13:04 -07:00 |
Ken Ho
|
33ab415f68
|
Bug fix and improvements for [z]imatcopy interface.
|
2023-05-08 14:43:56 -07:00 |
Martin Kroeker
|
1f6f7328eb
|
remove redundant declaration
|
2023-04-27 09:14:12 +02:00 |
Martin Kroeker
|
7152d6b06d
|
fix cblas_gemmt
|
2023-04-27 08:36:20 +02:00 |
Martin Kroeker
|
38d7a7b562
|
Fix ?GEMMT
|
2023-04-16 00:07:58 +02:00 |
Martin Kroeker
|
912d713b52
|
redo lost edit
|
2023-03-28 18:31:04 +02:00 |
Martin Kroeker
|
dc15c18efc
|
Fix build failures seen with the NO_LAPACK option - cspr/csymv/csyr belong on the LAPACK list
|
2023-03-28 16:33:09 +02:00 |
H. Vetinari
|
f2659516ef
|
remove unqualified ifdef's for NO_LAPACK(E)
|
2023-03-28 19:01:31 +11:00 |
Martin Kroeker
|
f2d6b1c70e
|
Add multithreading threshold
|
2023-03-26 00:25:28 +01:00 |
Martin Kroeker
|
a495ffc554
|
Rework multithreading threshold
|
2023-03-26 00:23:57 +01:00 |
Martin Kroeker
|
244147495a
|
Do not use multithreading for small workloads
|
2023-03-23 23:13:02 +01:00 |
Martin Kroeker
|
ab32f832a8
|
fix stray blank on continuation line
|
2023-03-21 08:29:05 +01:00 |
Martin Kroeker
|
e359787e28
|
restore C/Z SPMV, SPR, SYR,SYMV
|
2023-03-21 07:43:03 +01:00 |
Martin Kroeker
|
f10c266b4d
|
Fix stride in shortcut path for small N
|
2022-12-08 21:02:01 +01:00 |
Martin Kroeker
|
8c99d5d1b6
|
Merge pull request #3796 from martin-frbg/gemmt
Add a trivial GEMMT implementation based on a looped GEMV
|
2022-11-12 19:06:05 +01:00 |
Martin Kroeker
|
e6204d254f
|
Update CMakeLists.txt
|
2022-11-08 16:21:11 +01:00 |
Martin Kroeker
|
1b77764182
|
Conditionally leave out bits of LAPACK to be overridden by ReLAPACK
|
2022-11-08 12:02:59 +01:00 |
Martin Kroeker
|
c970717157
|
fix missing t in xgemmt rule
Co-authored-by: Alexis <35051714+amontoison@users.noreply.github.com>
|
2022-11-01 13:51:20 +01:00 |
Martin Kroeker
|
e7fd8d21a6
|
Add GEMMT based on looped GEMV
|
2022-10-26 15:33:58 +02:00 |
Martin Kroeker
|
a3e02742f2
|
Add USE_PERL fallback option for create script used with FUNCTION_PROFILE
|
2022-05-22 18:32:19 +02:00 |
Martin Kroeker
|
f1c570a5f1
|
Add back original PERL-based script under new name
|
2022-05-22 18:29:01 +02:00 |
Owen Rafferty
|
42c7a27e6b
|
rewrite perl scripts in universal shell
|
2022-05-18 19:00:15 -05:00 |
Martin Kroeker
|
7656aba00e
|
Merge pull request #3493 from martin-frbg/casts+cleanup
WIP casts and cleanups
|
2022-02-06 23:55:06 +01:00 |
Martin Kroeker
|
d2b5fbf80f
|
Exclude some complex (LAPACK) functions when NO_LAPACK is set
|
2022-01-27 22:02:08 +01:00 |
Martin Kroeker
|
64365c919e
|
fix function typecasts
|
2021-12-21 18:47:35 +01:00 |
gxw
|
25f99fa9f8
|
Add cblas_{c/z}srot cblas_{c/z}rotg support
|
2021-11-01 20:19:13 +08:00 |
Martin Kroeker
|
4b3769823a
|
Revert #3252
|
2021-10-24 23:57:06 +02:00 |
Martin Kroeker
|
2845f54eb8
|
Remove dangerous optimization from previous #3252 - buffer is never unused here
|
2021-10-20 10:50:02 +02:00 |
Martin Kroeker
|
c35739db5e
|
Add separate entries for BFLOAT16 functions and fix missing cblas_xerbla
|
2021-09-14 16:15:57 +02:00 |
Martin Kroeker
|
1085775bc6
|
really remove the unused variable
|
2021-09-11 15:05:55 +02:00 |
Martin Kroeker
|
20581bf303
|
Remove unused variable
|
2021-09-11 14:36:27 +02:00 |
Wangyang Guo
|
4289cf048d
|
sbgemm: avoid falling into SGEMM_KERNEL_DIRECT
|
2021-09-07 21:30:46 +08:00 |
Wangyang Guo
|
2e44ca0136
|
sbgemm: add missing cblas_sbgemm definition
|
2021-08-30 17:40:30 +08:00 |
Wangyang Guo
|
1d83ca4bca
|
Small Matrix: support BFLOAT16 data type
|
2021-08-30 17:40:20 +08:00 |
Wangyang Guo
|
c17d6dacb2
|
Small Matrix: skip compile in unimplemented data type
|
2021-08-05 05:46:13 +00:00 |
Wangyang Guo
|
aa50185647
|
Small Matrix: better handle with GEMM3M macro
|
2021-08-05 02:45:53 +00:00 |
Wangyang Guo
|
478d1086c1
|
Small Matrix: support DYNAMIC_ARCH build
|
2021-08-04 03:12:41 +00:00 |
Wangyang Guo
|
5dc7c3c8e5
|
Small Matrix: add GEMM_SMALL_MATRIX_PERMIT to tune small matrices case
|
2021-08-02 07:06:54 +00:00 |
Xianyi Zhang
|
6022e5629c
|
Refs #2587 fix small matrix c/zgemm bug.
|
2021-08-02 07:06:54 +00:00 |
Xianyi Zhang
|
57ed58cefe
|
Refs #2587 Add small matrix optimization reference kernel for c/zgemm.
|
2021-08-02 07:06:54 +00:00 |
Xianyi Zhang
|
17d32a4a82
|
Change a1b0 gemm to b0 gemm.
|
2021-08-02 07:06:54 +00:00 |
Xianyi Zhang
|
4271cfcc6f
|
Fix gemm interface bug for small matrix.
|
2021-08-02 07:06:51 +00:00 |
Xianyi Zhang
|
be3349405d
|
Add alpha=1.0 beta=0.0 for small gemm.
|
2021-08-02 07:01:47 +00:00 |