| Martin Kroeker | f10c266b4d | Fix stride in shortcut path for small N | 2022-12-08 21:02:01 +01:00 |
| Martin Kroeker | 8c99d5d1b6 | Merge pull request #3796 from martin-frbg/gemmt Add a trivial GEMMT implementation based on a looped GEMV | 2022-11-12 19:06:05 +01:00 |
| Martin Kroeker | e6204d254f | Update CMakeLists.txt | 2022-11-08 16:21:11 +01:00 |
| Martin Kroeker | 1b77764182 | Conditionally leave out bits of LAPACK to be overridden by ReLAPACK | 2022-11-08 12:02:59 +01:00 |
| Martin Kroeker | c970717157 | fix missing t in xgemmt rule Co-authored-by: Alexis <35051714+amontoison@users.noreply.github.com> | 2022-11-01 13:51:20 +01:00 |
| Martin Kroeker | e7fd8d21a6 | Add GEMMT based on looped GEMV | 2022-10-26 15:33:58 +02:00 |
| Martin Kroeker | a3e02742f2 | Add USE_PERL fallback option for create script used with FUNCTION_PROFILE | 2022-05-22 18:32:19 +02:00 |
| Martin Kroeker | f1c570a5f1 | Add back original PERL-based script under new name | 2022-05-22 18:29:01 +02:00 |
| Owen Rafferty | 42c7a27e6b | rewrite perl scripts in universal shell | 2022-05-18 19:00:15 -05:00 |
| Martin Kroeker | 7656aba00e | Merge pull request #3493 from martin-frbg/casts+cleanup WIP casts and cleanups | 2022-02-06 23:55:06 +01:00 |
| Martin Kroeker | d2b5fbf80f | Exclude some complex (LAPACK) functions when NO_LAPACK is set | 2022-01-27 22:02:08 +01:00 |
| Martin Kroeker | 64365c919e | fix function typecasts | 2021-12-21 18:47:35 +01:00 |
| gxw | 25f99fa9f8 | Add cblas_{c/z}srot cblas_{c/z}rotg support | 2021-11-01 20:19:13 +08:00 |
| Martin Kroeker | 4b3769823a | Revert #3252 | 2021-10-24 23:57:06 +02:00 |
| Martin Kroeker | 2845f54eb8 | Remove dangerous optimization from previous #3252 - buffer is never unused here | 2021-10-20 10:50:02 +02:00 |
| Martin Kroeker | c35739db5e | Add separate entries for BFLOAT16 functions and fix missing cblas_xerbla | 2021-09-14 16:15:57 +02:00 |
| Martin Kroeker | 1085775bc6 | really remove the unused variable | 2021-09-11 15:05:55 +02:00 |
| Martin Kroeker | 20581bf303 | Remove unused variable | 2021-09-11 14:36:27 +02:00 |
| Wangyang Guo | 4289cf048d | sbgemm: avoid falling into SGEMM_KERNEL_DIRECT | 2021-09-07 21:30:46 +08:00 |
| Wangyang Guo | 2e44ca0136 | sbgemm: add missing cblas_sbgemm definition | 2021-08-30 17:40:30 +08:00 |
| Wangyang Guo | 1d83ca4bca | Small Matrix: support BFLOAT16 data type | 2021-08-30 17:40:20 +08:00 |
| Wangyang Guo | c17d6dacb2 | Small Matrix: skip compile in unimplemented data type | 2021-08-05 05:46:13 +00:00 |
| Wangyang Guo | aa50185647 | Small Matrix: better handle with GEMM3M marco | 2021-08-05 02:45:53 +00:00 |
| Wangyang Guo | 478d1086c1 | Small Matrix: support DYNAMIC_ARCH build | 2021-08-04 03:12:41 +00:00 |
| Wangyang Guo | 5dc7c3c8e5 | Small Matrix: add GEMM_SMALL_MATRIX_PERMIT to tune small matrics case | 2021-08-02 07:06:54 +00:00 |
| Xianyi Zhang | 6022e5629c | Refs #2587 fix small matrix c/zgemm bug. | 2021-08-02 07:06:54 +00:00 |
| Xianyi Zhang | 57ed58cefe | Refs #2587 Add small matrix optimization reference kernel for c/zgemm. | 2021-08-02 07:06:54 +00:00 |
| Xianyi Zhang | 17d32a4a82 | Change a1b0 gemm to b0 gemm. | 2021-08-02 07:06:54 +00:00 |
| Xianyi Zhang | 4271cfcc6f | Fix gemm interface bug for small matrix. | 2021-08-02 07:06:51 +00:00 |
| Xianyi Zhang | be3349405d | Add alpha=1.0 beta=0.0 for small gemm. | 2021-08-02 07:01:47 +00:00 |
| Xianyi Zhang | 0a2077901c | Add small marix optimization kernel interface. make SMALL_MATRIX_OPT=1 | 2021-08-02 07:01:47 +00:00 |
| Martin Kroeker | 1dea57ab25 | Revert PR #3250 (shortcut without buffer allocation) as it is unsafe on some x86_64 | 2021-07-14 20:32:57 +02:00 |
| Martin Kroeker | 7bb59fceb7 | Clean up some warnings | 2021-07-11 16:00:29 +02:00 |
| Martin Kroeker | 4ed99c2ce3 | Merge pull request #3292 from martin-frbg/syrk_limit Add lower limit for multithreading in xSYRK | 2021-07-07 20:46:28 +02:00 |
| Martin Kroeker | 8186963d8c | Add lower limit for multithreading | 2021-07-04 17:00:26 +02:00 |
| Martin Kroeker | 726c44242b | Add lower threshold for multithreading | 2021-07-01 17:41:05 +02:00 |
| Martin Kroeker | 1b5620b66e | Add lower threshold for multithreading in ?potrf and ?potri | 2021-06-26 23:47:41 +02:00 |
| Martin Kroeker | baf03a0937 | Merge pull request #3252 from martin-frbg/more_shortcuts Further shortcuts for (small) cases that do not need buffer allocation | 2021-06-15 16:14:20 +02:00 |
| Martin Kroeker | 7aab5e826c | Merge pull request #3250 from martin-frbg/gemv-shortcut Add shortcut for small-size S/D GEMV_N with increments of one | 2021-06-15 14:50:14 +02:00 |
| Martin Kroeker | f84197c1a7 | Add shortcuts for (small) cases that do not need expensive buffer allocation | 2021-05-29 22:28:00 +02:00 |
| Martin Kroeker | 734bd265a8 | revert symv changes for now | 2021-05-29 15:40:03 +02:00 |
| Martin Kroeker | 1217eb910d | Fix copy-paste errors in variables used | 2021-05-28 09:38:48 +02:00 |
| Martin Kroeker | d6d7a6685d | Add shortcuts for (small) cases that do not need expensive buffer allocation | 2021-05-27 22:39:18 +02:00 |
| Martin Kroeker | f0e7345fb8 | Add shortcut for small-size gemv_n with increments of one | 2021-05-26 22:02:34 +02:00 |
| Martin Kroeker | 03297ff9f0 | Add fast path for small xSYR with INCX==1 | 2021-05-22 20:41:18 +02:00 |
| Gordon Fossum | 8b599836db | Add error message token for SBGEMM in gemm.c | 2021-05-04 13:55:02 -05:00 |
| Martin Kroeker | 904b221f03 | Add cast to prevent overflow of intermediate result | 2021-05-01 14:47:22 +02:00 |
| Martin Kroeker | c5fb91f1bc | Fix division by zero in the non-x86 codepath | 2021-04-29 09:47:18 +02:00 |
| Harmen Stoppels | ec6b354c32 | use /usr/bin/env perl | 2021-02-24 14:07:20 +01:00 |
| Martin Kroeker | bd906e3410 | fix copy-paste error in build rules for cblas_crotg and cblas_zrotg | 2021-01-30 16:46:25 +01:00 |