kseniyazaytseva
|
f89e0034a4
|
Fix LAPACK usage from BLAS
|
2024-01-18 23:22:26 +03:00 |
Martin Kroeker
|
f7cf637d7a
|
redo lost edit
|
2024-01-18 23:22:26 +03:00 |
Martin Kroeker
|
85548e66ca
|
Fix build failures seen with the NO_LAPACK option - cspr/csymv/csyr belong on the LAPACK list
|
2024-01-18 23:22:26 +03:00 |
Martin Kroeker
|
f129161453
|
restore C/Z SPMV, SPR, SYR, SYMV
|
2024-01-18 23:22:26 +03:00 |
Martin Kroeker
|
5b4df851d7
|
fix stray blank on continuation line
|
2024-01-18 23:20:15 +03:00 |
kseniyazaytseva
|
ff41cf5c49
|
Fix BLAS, BLAS-like functions and Generic RISC-V kernels
* Fixed gemmt, imatcopy, zimatcopy_cnc functions
* Fixed cblas_cscal testing in ctest
* Removed rotmg unreachable code
* Added zero size checks
|
2024-01-18 23:19:52 +03:00 |
Martin Kroeker
|
8c99d5d1b6
|
Merge pull request #3796 from martin-frbg/gemmt
Add a trivial GEMMT implementation based on a looped GEMV
|
2022-11-12 19:06:05 +01:00 |
Martin Kroeker
|
e6204d254f
|
Update CMakeLists.txt
|
2022-11-08 16:21:11 +01:00 |
Martin Kroeker
|
1b77764182
|
Conditionally leave out bits of LAPACK to be overridden by ReLAPACK
|
2022-11-08 12:02:59 +01:00 |
Martin Kroeker
|
c970717157
|
fix missing t in xgemmt rule
Co-authored-by: Alexis <35051714+amontoison@users.noreply.github.com>
|
2022-11-01 13:51:20 +01:00 |
Martin Kroeker
|
e7fd8d21a6
|
Add GEMMT based on looped GEMV
|
2022-10-26 15:33:58 +02:00 |
Martin Kroeker
|
a3e02742f2
|
Add USE_PERL fallback option for create script used with FUNCTION_PROFILE
|
2022-05-22 18:32:19 +02:00 |
Martin Kroeker
|
f1c570a5f1
|
Add back original PERL-based script under new name
|
2022-05-22 18:29:01 +02:00 |
Owen Rafferty
|
42c7a27e6b
|
rewrite perl scripts in universal shell
|
2022-05-18 19:00:15 -05:00 |
Martin Kroeker
|
7656aba00e
|
Merge pull request #3493 from martin-frbg/casts+cleanup
WIP casts and cleanups
|
2022-02-06 23:55:06 +01:00 |
Martin Kroeker
|
d2b5fbf80f
|
Exclude some complex (LAPACK) functions when NO_LAPACK is set
|
2022-01-27 22:02:08 +01:00 |
Martin Kroeker
|
64365c919e
|
fix function typecasts
|
2021-12-21 18:47:35 +01:00 |
gxw
|
25f99fa9f8
|
Add cblas_{c/z}srot cblas_{c/z}rotg support
|
2021-11-01 20:19:13 +08:00 |
Martin Kroeker
|
4b3769823a
|
Revert #3252
|
2021-10-24 23:57:06 +02:00 |
Martin Kroeker
|
2845f54eb8
|
Remove dangerous optimization from previous #3252 - buffer is never unused here
|
2021-10-20 10:50:02 +02:00 |
Martin Kroeker
|
c35739db5e
|
Add separate entries for BFLOAT16 functions and fix missing cblas_xerbla
|
2021-09-14 16:15:57 +02:00 |
Martin Kroeker
|
1085775bc6
|
really remove the unused variable
|
2021-09-11 15:05:55 +02:00 |
Martin Kroeker
|
20581bf303
|
Remove unused variable
|
2021-09-11 14:36:27 +02:00 |
Wangyang Guo
|
4289cf048d
|
sbgemm: avoid falling into SGEMM_KERNEL_DIRECT
|
2021-09-07 21:30:46 +08:00 |
Wangyang Guo
|
2e44ca0136
|
sbgemm: add missing cblas_sbgemm definition
|
2021-08-30 17:40:30 +08:00 |
Wangyang Guo
|
1d83ca4bca
|
Small Matrix: support BFLOAT16 data type
|
2021-08-30 17:40:20 +08:00 |
Wangyang Guo
|
c17d6dacb2
|
Small Matrix: skip compile in unimplemented data type
|
2021-08-05 05:46:13 +00:00 |
Wangyang Guo
|
aa50185647
|
Small Matrix: better handling of the GEMM3M macro
|
2021-08-05 02:45:53 +00:00 |
Wangyang Guo
|
478d1086c1
|
Small Matrix: support DYNAMIC_ARCH build
|
2021-08-04 03:12:41 +00:00 |
Wangyang Guo
|
5dc7c3c8e5
|
Small Matrix: add GEMM_SMALL_MATRIX_PERMIT to tune the small matrix case
|
2021-08-02 07:06:54 +00:00 |
Xianyi Zhang
|
6022e5629c
|
Refs #2587 fix small matrix c/zgemm bug.
|
2021-08-02 07:06:54 +00:00 |
Xianyi Zhang
|
57ed58cefe
|
Refs #2587 Add small matrix optimization reference kernel for c/zgemm.
|
2021-08-02 07:06:54 +00:00 |
Xianyi Zhang
|
17d32a4a82
|
Change a1b0 gemm to b0 gemm.
|
2021-08-02 07:06:54 +00:00 |
Xianyi Zhang
|
4271cfcc6f
|
Fix gemm interface bug for small matrix.
|
2021-08-02 07:06:51 +00:00 |
Xianyi Zhang
|
be3349405d
|
Add alpha=1.0 beta=0.0 for small gemm.
|
2021-08-02 07:01:47 +00:00 |
Xianyi Zhang
|
0a2077901c
|
Add small matrix optimization kernel interface.
make SMALL_MATRIX_OPT=1
|
2021-08-02 07:01:47 +00:00 |
Martin Kroeker
|
1dea57ab25
|
Revert PR #3250 (shortcut without buffer allocation) as it is unsafe on some x86_64
|
2021-07-14 20:32:57 +02:00 |
Martin Kroeker
|
7bb59fceb7
|
Clean up some warnings
|
2021-07-11 16:00:29 +02:00 |
Martin Kroeker
|
4ed99c2ce3
|
Merge pull request #3292 from martin-frbg/syrk_limit
Add lower limit for multithreading in xSYRK
|
2021-07-07 20:46:28 +02:00 |
Martin Kroeker
|
8186963d8c
|
Add lower limit for multithreading
|
2021-07-04 17:00:26 +02:00 |
Martin Kroeker
|
726c44242b
|
Add lower threshold for multithreading
|
2021-07-01 17:41:05 +02:00 |
Martin Kroeker
|
1b5620b66e
|
Add lower threshold for multithreading in ?potrf and ?potri
|
2021-06-26 23:47:41 +02:00 |
Martin Kroeker
|
baf03a0937
|
Merge pull request #3252 from martin-frbg/more_shortcuts
Further shortcuts for (small) cases that do not need buffer allocation
|
2021-06-15 16:14:20 +02:00 |
Martin Kroeker
|
7aab5e826c
|
Merge pull request #3250 from martin-frbg/gemv-shortcut
Add shortcut for small-size S/D GEMV_N with increments of one
|
2021-06-15 14:50:14 +02:00 |
Martin Kroeker
|
f84197c1a7
|
Add shortcuts for (small) cases that do not need expensive buffer allocation
|
2021-05-29 22:28:00 +02:00 |
Martin Kroeker
|
734bd265a8
|
revert symv changes for now
|
2021-05-29 15:40:03 +02:00 |
Martin Kroeker
|
1217eb910d
|
Fix copy-paste errors in variables used
|
2021-05-28 09:38:48 +02:00 |
Martin Kroeker
|
d6d7a6685d
|
Add shortcuts for (small) cases that do not need expensive buffer allocation
|
2021-05-27 22:39:18 +02:00 |
Martin Kroeker
|
f0e7345fb8
|
Add shortcut for small-size gemv_n with increments of one
|
2021-05-26 22:02:34 +02:00 |
Martin Kroeker
|
03297ff9f0
|
Add fast path for small xSYR with INCX==1
|
2021-05-22 20:41:18 +02:00 |