301 Commits

Author SHA1 Message Date
Martin Kroeker
a3e02742f2 Add USE_PERL fallback option for create script used with FUNCTION_PROFILE 2022-05-22 18:32:19 +02:00
Martin Kroeker
f1c570a5f1 Add back original PERL-based script under new name 2022-05-22 18:29:01 +02:00
Owen Rafferty
42c7a27e6b rewrite perl scripts in universal shell 2022-05-18 19:00:15 -05:00
Martin Kroeker
7656aba00e Merge pull request #3493 from martin-frbg/casts+cleanup
WIP casts and cleanups
2022-02-06 23:55:06 +01:00
Martin Kroeker
d2b5fbf80f Exclude some complex (LAPACK) functions when NO_LAPACK is set 2022-01-27 22:02:08 +01:00
Martin Kroeker
64365c919e fix function typecasts 2021-12-21 18:47:35 +01:00
gxw
25f99fa9f8 Add cblas_{c/z}srot cblas_{c/z}rotg support 2021-11-01 20:19:13 +08:00
Martin Kroeker
4b3769823a Revert #3252 2021-10-24 23:57:06 +02:00
Martin Kroeker
2845f54eb8 Remove dangerous optimization from previous #3252 - buffer is never unused here 2021-10-20 10:50:02 +02:00
Martin Kroeker
c35739db5e Add separate entries for BFLOAT16 functions and fix missing cblas_xerbla 2021-09-14 16:15:57 +02:00
Martin Kroeker
1085775bc6 really remove the unused variable 2021-09-11 15:05:55 +02:00
Martin Kroeker
20581bf303 Remove unused variable 2021-09-11 14:36:27 +02:00
Wangyang Guo
4289cf048d sbgemm: avoid falling into SGEMM_KERNEL_DIRECT 2021-09-07 21:30:46 +08:00
Wangyang Guo
2e44ca0136 sbgemm: add missing cblas_sbgemm definition 2021-08-30 17:40:30 +08:00
Wangyang Guo
1d83ca4bca Small Matrix: support BFLOAT16 data type 2021-08-30 17:40:20 +08:00
Wangyang Guo
c17d6dacb2 Small Matrix: skip compile in unimplemented data type 2021-08-05 05:46:13 +00:00
Wangyang Guo
aa50185647 Small Matrix: better handle with GEMM3M macro 2021-08-05 02:45:53 +00:00
Wangyang Guo
478d1086c1 Small Matrix: support DYNAMIC_ARCH build 2021-08-04 03:12:41 +00:00
Wangyang Guo
5dc7c3c8e5 Small Matrix: add GEMM_SMALL_MATRIX_PERMIT to tune small matrices case 2021-08-02 07:06:54 +00:00
Xianyi Zhang
6022e5629c Refs #2587 fix small matrix c/zgemm bug. 2021-08-02 07:06:54 +00:00
Xianyi Zhang
57ed58cefe Refs #2587 Add small matrix optimization reference kernel for c/zgemm. 2021-08-02 07:06:54 +00:00
Xianyi Zhang
17d32a4a82 Change a1b0 gemm to b0 gemm. 2021-08-02 07:06:54 +00:00
Xianyi Zhang
4271cfcc6f Fix gemm interface bug for small matrix. 2021-08-02 07:06:51 +00:00
Xianyi Zhang
be3349405d Add alpha=1.0 beta=0.0 for small gemm. 2021-08-02 07:01:47 +00:00
Xianyi Zhang
0a2077901c Add small matrix optimization kernel interface.
make SMALL_MATRIX_OPT=1
2021-08-02 07:01:47 +00:00
Martin Kroeker
1dea57ab25 Revert PR #3250 (shortcut without buffer allocation) as it is unsafe on some x86_64 2021-07-14 20:32:57 +02:00
Martin Kroeker
7bb59fceb7 Clean up some warnings 2021-07-11 16:00:29 +02:00
Martin Kroeker
4ed99c2ce3 Merge pull request #3292 from martin-frbg/syrk_limit
Add lower limit for multithreading in xSYRK
2021-07-07 20:46:28 +02:00
Martin Kroeker
8186963d8c Add lower limit for multithreading 2021-07-04 17:00:26 +02:00
Martin Kroeker
726c44242b Add lower threshold for multithreading 2021-07-01 17:41:05 +02:00
Martin Kroeker
1b5620b66e Add lower threshold for multithreading in ?potrf and ?potri 2021-06-26 23:47:41 +02:00
Martin Kroeker
baf03a0937 Merge pull request #3252 from martin-frbg/more_shortcuts
Further shortcuts for (small) cases that do not need buffer allocation
2021-06-15 16:14:20 +02:00
Martin Kroeker
7aab5e826c Merge pull request #3250 from martin-frbg/gemv-shortcut
Add shortcut for small-size S/D GEMV_N with increments of one
2021-06-15 14:50:14 +02:00
Martin Kroeker
f84197c1a7 Add shortcuts for (small) cases that do not need expensive buffer allocation 2021-05-29 22:28:00 +02:00
Martin Kroeker
734bd265a8 revert symv changes for now 2021-05-29 15:40:03 +02:00
Martin Kroeker
1217eb910d Fix copy-paste errors in variables used 2021-05-28 09:38:48 +02:00
Martin Kroeker
d6d7a6685d Add shortcuts for (small) cases that do not need expensive buffer allocation 2021-05-27 22:39:18 +02:00
Martin Kroeker
f0e7345fb8 Add shortcut for small-size gemv_n with increments of one 2021-05-26 22:02:34 +02:00
Martin Kroeker
03297ff9f0 Add fast path for small xSYR with INCX==1 2021-05-22 20:41:18 +02:00
Gordon Fossum
8b599836db Add error message token for SBGEMM in gemm.c 2021-05-04 13:55:02 -05:00
Martin Kroeker
904b221f03 Add cast to prevent overflow of intermediate result 2021-05-01 14:47:22 +02:00
Martin Kroeker
c5fb91f1bc Fix division by zero in the non-x86 codepath 2021-04-29 09:47:18 +02:00
Harmen Stoppels
ec6b354c32 use /usr/bin/env perl 2021-02-24 14:07:20 +01:00
Martin Kroeker
bd906e3410 fix copy-paste error in build rules for cblas_crotg and cblas_zrotg 2021-01-30 16:46:25 +01:00
Alex Henrie
f1bf2603e6 Remove dead assignment to dflag in rotmg functions 2021-01-14 19:40:32 -07:00
Alex Henrie
6f32991eae Don't define the mode variable when not needed in gemm functions 2021-01-14 19:40:31 -07:00
Martin Kroeker
a8f249458d Build CBLAS interfaces for CROTG and ZROTG as well 2021-01-13 00:29:38 +01:00
Martin Kroeker
ac3e2a3fdd Add CBLAS interfaces for csrot and zdrot 2021-01-12 23:22:00 +01:00
Martin Kroeker
857afcc41d Use ifeq instead of ifdef for user-definable build options 2020-11-22 16:31:44 +01:00
Chen, Guobing
a7b1f9b1bb Implementation of BF16 based gemv
1. Add a new API -- sbgemv to support bfloat16 based gemv
2. Implement a generic kernel for sbgemv
3. Implement an avx512-bf16 based kernel for sbgemv

Signed-off-by: Chen, Guobing <guobing.chen@intel.com>
2020-10-29 02:08:23 +08:00