eea006a688 
								
							 
						 
						
							
							
								
								Wrap SVE header with __has_include check  
							
							
							
						 
						
							2022-12-01 12:07:55 +00:00  
				
					
						
							
							
								 
						
							
								fd4f52c797 
								
							 
						 
						
							
							
								
								Add SVE implementation for sdot/ddot  
							
							
							
							
							This adds an SVE implementation to sdot/ddot when available, falling back to the previous Advanced SIMD kernel where there's no SVE implementation for the kernel.
All the targets were essentially treating `dot_thunderx2t99.c` as the Advanced SIMD implementation so I've renamed it to better fit with the feature detection. 
							
						 
						
							2022-12-01 12:07:50 +00:00  
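The arrangement described in fd4f52c797 and eea006a688 (use SVE when the toolchain provides it, otherwise fall back) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the kernel from the commit: `SKETCH_HAVE_SVE` and `sketch_ddot` are hypothetical names, and the fallback is a plain scalar loop standing in for the Advanced SIMD kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Include the SVE header only when the compiler ships it and targets SVE.
   __has_include is itself guarded because older compilers do not provide it. */
#if defined(__has_include)
#  if __has_include(<arm_sve.h>) && defined(__ARM_FEATURE_SVE)
#    include <arm_sve.h>
#    define SKETCH_HAVE_SVE 1   /* hypothetical macro, for this sketch only */
#  endif
#endif

/* Hypothetical helper: dot product of two contiguous double vectors. */
static double sketch_ddot(size_t n, const double *x, const double *y)
{
    assert(n == 0 || (x != NULL && y != NULL));
#ifdef SKETCH_HAVE_SVE
    /* Vector-length-agnostic loop: the predicate masks off the tail. */
    svfloat64_t acc = svdup_f64(0.0);
    for (size_t i = 0; i < n; i += svcntd()) {
        svbool_t pg = svwhilelt_b64((uint64_t)i, (uint64_t)n);
        acc = svmla_f64_m(pg, acc, svld1_f64(pg, x + i), svld1_f64(pg, y + i));
    }
    return svaddv_f64(svptrue_b64(), acc);
#else
    /* Fallback path (scalar here; the commit falls back to Advanced SIMD). */
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
#endif
}
```

Because the header check and the feature macro are tested together, the same source file should build on toolchains without `arm_sve.h` and simply take the fallback path.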
				
					
						
							
							
								 
						
							
								fdac8a97c1 
								
							 
						 
						
							
							
								
								Add sbgemm_ncopy_8 and sbgemm_tcopy_4  
							
							
							
						 
						
							2022-11-29 04:46:14 -05:00  
				
					
						
							
							
								 
						
							
								135718eafc 
								
							 
						 
						
							
							
								
								Improve the performance of sbgemm_tcopy on neoversen2  
							
							
							
						 
						
							2022-11-28 04:17:54 -05:00  
				
					
						
							
							
								 
						
							
								4f7b77e08a 
								
							 
						 
						
							
							
								
								Remove unnecessary instructions from Advanced SIMD dot  
							
							
							
							
							The existing kernel was issuing extra instructions to organise the arguments into the same registers they would usually be in and similarly to put the result into the appropriate register.
This has an impact on smaller sized dots and seemed like a quick fix 
							
						 
						
							2022-11-25 16:19:03 +00:00  
				
					
						
							
							
								 
						
							
								f73cfb7e2c 
								
							 
						 
						
							
							
								
								change line endings from CRLF to LF  
							
							
							
						 
						
							2022-11-17 09:39:56 +01:00  
				
					
						
							
							
								 
						
							
								1688c7da43 
								
							 
						 
						
							
							
								
								change line endings from CRLF to LF  
							
							
							
						 
						
							2022-11-16 22:24:01 +01:00  
				
					
						
							
							
								 
						
							
								6c1043eb41 
								
							 
						 
						
							
							
								
								Add [cz]scal microkernels for SKYLAKEX  
							
							
							
							
							These are as similar to dscal_microk_skylakex-2.c as possible
for consistency.
Note that before this change SKYLAKEX+ uses generic C functions for
cscal/zscal via commit 2271c350 #2610 (which is masked by
commit 086d87a30 #3799 disables FMAs (in turn enabled
by `-march=skylake-avx512`) in the plain C code which fixes excessive
LAPACK test failures more nicely. 
							
						 
						
							2022-11-09 08:57:03 -05:00  
				
					
						
							
							
								 
						
							
								c9d78dc3b2 
								
							 
						 
						
							
							
								
								Remove excess initializer (leftover from rework of PR 3793)  
							
							
							
						 
						
							2022-10-31 16:57:03 +01:00  
				
					
						
							
							
								 
						
							
								65338a9493 
								
							 
						 
						
							
							
								
								Merge pull request  #3799  from bartoldeman/cscal-zscal-no-fma  
							
							
							
							
							x86_64: prevent GCC and Clang from generating FMAs in cscal/zscal. 
							
						 
						
							2022-10-30 18:56:10 +01:00  
				
					
						
							
							
								 
						
							
								79066b6bf3 
								
							 
						 
						
							
							
								
								Change file name to match the norm and delete useless code.  
							
							
							
						 
						
							2022-10-28 17:09:39 +08:00  
				
					
						
							
							
								 
						
							
								e7e3aa2948 
								
							 
						 
						
							
							
								
								x86_64: prevent GCC and Clang from generating FMAs in cscal/zscal.  
							
							
							
							
							If e.g. -march=haswell is set in CFLAGS, GCC generates FMAs by default, which
is inconsistent with the microkernels, none of which use FMAs. These
inconsistencies cause a few failures in the LAPACK testcases, where
eigenvalue results with/without eigenvectors are compared.
Moreover using FMAs for multiplication of complex numbers can give surprising
results, see 22aa81f 
							
						 
						
							2022-10-27 18:16:43 -04:00  
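One well-known way FMAs can surprise in complex multiplication is sketched below. It is offered as an illustration only and is not necessarily the case the referenced commit has in mind. Take $z = a + bi$ and compute $z\bar{z}$, whose imaginary part should be exactly zero; $\mathrm{fl}(\cdot)$ denotes rounding to working precision.

```latex
\begin{align*}
\operatorname{Im}(z\bar{z}) &= a\cdot(-b) + b\cdot a \\
\text{without FMA:}\quad
  &\mathrm{fl}\bigl(\mathrm{fl}(-ab) + \mathrm{fl}(ba)\bigr) = 0 \\
\text{with FMA:}\quad
  &\mathrm{fl}\bigl(-ab + \mathrm{fl}(ba)\bigr) = \mathrm{fl}(ab) - ab
\end{align*}
```

In the fused case only one of the two products is rounded, so the result is the rounding error of that product, which is nonzero whenever $ab$ is not exactly representable.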
				
					
						
							
							
								 
						
							
								4989e039a5 
								
							 
						 
						
							
							
								
								Define SBGEMM_ALIGN_K for DYNAMIC_ARCH build  
							
							
							
						 
						
							2022-10-27 14:10:26 +08:00  
				
					
						
							
							
								 
						
							
								843e9fd0b9 
								
							 
						 
						
							
							
								
								Fix typo error  
							
							
							
						 
						
							2022-10-26 17:06:33 +08:00  
				
					
						
							
							
								 
						
							
								b00d5b9746 
								
							 
						 
						
							
							
								
								New sbgemm implementation for Neoverse N2  
							
							
							
							
							1. Use UZP instructions instead of gather-load and scatter-store instructions to get lower latency.
							2. Pad k to a power of 4. 
							
						 
						
							2022-10-26 15:09:41 +08:00  
				
					
						
							
							
								 
						
							
								f6f35a4288 
								
							 
						 
						
							
							
								
								fix copyobj declarations to work with DYNAMIC_ARCH  
							
							
							
						 
						
							2022-09-29 08:47:14 +02:00  
				
					
						
							
							
								 
						
							
								b1d69fb3ac 
								
							 
						 
						
							
							
								
								Add MIPS64_GENERIC as a copy of GENERIC  
							
							
							
						 
						
							2022-09-17 23:52:32 +02:00  
				
					
						
							
							
								 
						
							
								edea1bcfaf 
								
							 
						 
						
							
							
								
								MIPS64: Fixed failed utest dsdot:dsdot_n_1 when TARGET=I6500  
							
							
							
						 
						
							2022-09-17 16:43:22 +08:00  
				
					
						
							
							
								 
						
							
								101a2c77c3 
								
							 
						 
						
							
							
								
								Fix warnings  
							
							
							
						 
						
							2022-09-15 09:19:19 +02:00  
				
					
						
							
							
								 
						
							
								23d59baaf1 
								
							 
						 
						
							
							
								
								Add -mfma to -mavx2 for Apple clang, and set AVX2 options for Zen as well  
							
							
							
						 
						
							2022-09-13 22:39:27 +02:00  
				
					
						
							
							
								 
						
							
								365936ae1b 
								
							 
						 
						
							
							
								
								MIPS64: Using the macro MTC rather than MTC1  
							
							
							
						 
						
							2022-09-13 16:39:40 +08:00  
				
					
						
							
							
								 
						
							
								739c3c44a7 
								
							 
						 
						
							
							
								
								Work around windows/osx gcc12 x86_64 tree-optimizer problem and add an osx/gcc12 build to Azure CI ( #3745 )  
							
							
							
							
							Add pragma to disable the gcc tree-optimizer for some x86_64 S and Z kernels with gcc12 on OSX or Windows 
							
						 
						
							2022-09-03 15:01:22 +02:00  
				
					
						
							
							
								 
						
							
								bd30120ba7 
								
							 
						 
						
							
							
								
								Merge pull request  #3720  from FlyGoat/mips64  
							
							
							
							
							Make it work on general MIPS64 processors 
							
						 
						
							2022-08-19 20:24:27 +02:00  
				
					
						
							
							
								 
						
							
								a50b29c540 
								
							 
						 
						
							
							
								
								Provide a fallback MIPS64_GENERIC target  
							
							
							
							
							It is really dangerous to fall back to a Loongson core on other
MIPS64 processors.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> 
							
						 
						
							2022-08-12 13:13:28 +01:00  
				
					
						
							
							
								 
						
							
								50c4eeb97d 
								
							 
						 
						
							
							
								
								alpha: Remove include of version.h  
							
							
							
							
							It will be defined by a preprocessor argument.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> 
							
						 
						
							2022-08-11 15:02:58 +01:00  
				
					
						
							
							
								 
						
							
								802e71bf05 
								
							 
						 
						
							
							
								
								Add const attribute to lsame  
							
							
							
						 
						
							2022-08-08 15:15:52 +02:00  
				
					
						
							
							
								 
						
							
								fbfe1daf6e 
								
							 
						 
						
							
							
								
								LoongArch64: Add DYNAMIC_ARCH support  
							
							
							
						 
						
							2022-07-28 14:28:45 +08:00  
				
					
						
							
							
								 
						
							
								cd8e57040c 
								
							 
						 
						
							
							
								
								Merge pull request  #3691  from martin-frbg/issue3679-sparc  
							
							
							
							
							SPARC: fix DNRM2 returning INF instead of zero due to intermediate overflow 
							
						 
						
							2022-07-25 15:41:15 +02:00  
				
					
						
							
							
								 
						
							
								6c118b7977 
								
							 
						 
						
							
							
								
								Fix DNRM2 returning INF instead of zero due to intermediate overflow  
							
							
							
						 
						
							2022-07-24 17:42:31 +02:00  
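A common way to keep a 2-norm from overflowing in its intermediate results is the scaled sum of squares used by the reference BLAS `dnrm2`, which tracks a running `scale` and `ssq` such that the sum of squares equals `scale*scale*ssq`. The sketch below shows that general technique only; it is not the code changed by this commit, and `sketch_nrm2_state` and `sketch_sumsq` are hypothetical names. The final norm would be `scale * sqrt(ssq)`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state: sum of squares == scale * scale * ssq. */
typedef struct {
    double scale;
    double ssq;
} sketch_nrm2_state;

/* Accumulate a scaled sum of squares without ever squaring a raw element. */
static sketch_nrm2_state sketch_sumsq(size_t n, const double *x)
{
    sketch_nrm2_state s = { 0.0, 1.0 };
    assert(n == 0 || x != NULL);
    for (size_t i = 0; i < n; i++) {
        double a = x[i] < 0.0 ? -x[i] : x[i];   /* |x[i]| */
        if (a == 0.0)
            continue;                            /* zeros contribute nothing */
        if (s.scale < a) {
            /* New largest magnitude: rescale the running sum to it. */
            double r = s.scale / a;
            s.ssq = 1.0 + s.ssq * r * r;
            s.scale = a;
        } else {
            double r = a / s.scale;              /* r <= 1, so r*r cannot overflow */
            s.ssq += r * r;
        }
    }
    return s;
}
```

For an all-zero input the state stays at `scale = 0`, `ssq = 1`, so the resulting norm is zero rather than a value derived from an overflowed intermediate.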
				
					
						
							
							
								 
						
							
								c43ec53bdd 
								
							 
						 
						
							
							
								
								Merge pull request  #3690  from RajalakshmiSR/cdotp10  
							
							
							
							
							POWER: Fix complex dot function failures 
							
						 
						
							2022-07-19 13:59:16 +02:00  
				
					
						
							
							
								 
						
							
								b7c65d08cb 
								
							 
						 
						
							
							
								
								Merge pull request  #3689  from RajalakshmiSR/dgemvgcc10  
							
							
							
							
							POWER10: dgemv builtin rename 
							
						 
						
							2022-07-19 10:25:01 +02:00  
				
					
						
							
							
								 
						
							
								06ef015234 
								
							 
						 
						
							
							
								
								fix DNRM2 returning INF instead of zero due to intermediate overflow  
							
							
							
						 
						
							2022-07-19 10:19:27 +02:00  
				
					
						
							
							
								 
						
							
								a612e78a97 
								
							 
						 
						
							
							
								
								POWER: Fix complex dot function failures  
							
							
							
							
							There are some test failures in the complex dot functions when compiling with gcc12.
The machine constraints currently in use do not update all four elements of the
expected result array. This is fixed by reducing the optimization level.
This does not change any performance numbers, but the code will be converted to C in the future. 
							
						 
						
							2022-07-18 14:48:43 -05:00  
				
					
						
							
							
								 
						
							
								432fd99445 
								
							 
						 
						
							
							
								
								POWER10: dgemv builtin rename  
							
							
							
							
							Add check to use correct builtin name for older versions
of gcc10 compilers. 
							
						 
						
							2022-07-18 09:48:01 -05:00  
				
					
						
							
							
								 
						
							
								4dd05e526b 
								
							 
						 
						
							
							
								
								LoongArch64: Fix dnrm2_tiny testcase failure  
							
							
							
						 
						
							2022-07-15 11:18:59 +08:00  
				
					
						
							
							
								 
						
							
								cce4b1d956 
								
							 
						 
						
							
							
								
								MIPS64: Fix dnrm2_tiny testcase failure  
							
							
							
						 
						
							2022-07-11 19:18:38 +08:00  
				
					
						
							
							
								 
						
							
								e12d474780 
								
							 
						 
						
							
							
								
								Eliminate uses of CREAL on left-hand side of assignments  
							
							
							
						 
						
							2022-07-05 00:01:09 +02:00  
				
					
						
							
							
								 
						
							
								9e29598575 
								
							 
						 
						
							
							
								
								workaround fault with ssq=inf,scale=0  
							
							
							
						 
						
							2022-07-02 23:47:17 +02:00  
				
					
						
							
							
								 
						
							
								123e0dfb62 
								
							 
						 
						
							
							
								
								Neoverse N2 sbgemm:  
							
							
							
							
							1. Modify the algorithm to resolve multithreading failures
							2. No memory allocation in sbgemm kernel
							3. Optimize when alpha == 1.0f 
							
						 
						
							2022-06-29 10:14:21 +08:00  
				
					
						
							
							
								 
						
							
								bc3728475f 
								
							 
						 
						
							
							
								
								format code  
							
							
							
						 
						
							2022-06-29 10:14:21 +08:00  
				
					
						
							
							
								 
						
							
								55d686d41e 
								
							 
						 
						
							
							
								
								neoverse n2 sbgemm:  
							
							
							
							
							implement ncopy tcopy kernel_8x4 
							
						 
						
							2022-06-29 10:14:21 +08:00  
				
					
						
							
							
								 
						
							
								04593bb27c 
								
							 
						 
						
							
							
								
								neoverse n2 sbgemm: init file  
							
							
							
						 
						
							2022-06-29 10:14:21 +08:00  
				
					
						
							
							
								 
						
							
								be5500e704 
								
							 
						 
						
							
							
								
								Merge pull request  #3669  from VFerrari/fix_small_matrix_kernel  
							
							
							
							
							POWER: fix issues with the small matrix kernel 
							
						 
						
							2022-06-28 16:09:36 +02:00  
				
					
						
							
							
								 
						
							
								92275a7902 
								
							 
						 
						
							
							
								
								Merge pull request  #3642  from nursik/develop  
							
							
							
							
							Add ARM64 support for Windows 
							
						 
						
							2022-06-28 16:05:11 +02:00  
				
					
						
							
							
								 
						
							
								cac634fce3 
								
							 
						 
						
							
							
								
								POWER10: Fix multithreading check when USE_THREAD=0  
							
							
							
							
							This patch fixes an issue when OpenBLAS is compiled for TARGET=POWER10
and the flag USE_THREAD is set to 0.
The function `num_cpu_avail` is only available when USE_THREAD=1,
that is, when SMP is defined. 
							
						 
						
							2022-06-25 03:46:46 -03:00  
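The kind of guard described in cac634fce3 can be sketched as below. This is a hedged illustration, not the patched kernel: `sketch_thread_count` is a hypothetical helper, and the `num_cpu_avail` prototype is an assumption made so the snippet is self-contained (the real declaration lives in OpenBLAS's own headers).

```c
#include <assert.h>

#ifdef SMP
/* Assumed prototype, for illustration only; provided by OpenBLAS when SMP is defined. */
int num_cpu_avail(int level);
#endif

/* Hypothetical helper: how many threads a kernel may use. */
static int sketch_thread_count(void)
{
#ifdef SMP
    int n = num_cpu_avail(1);   /* only referenced in threaded builds */
#else
    int n = 1;                  /* USE_THREAD=0: always single-threaded */
#endif
    assert(n >= 1);
    return n;
}
```

With the call wrapped this way, a build without `SMP` never references the symbol, so it neither fails to compile nor fails to link.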
				
					
						
							
							
								 
						
							
								9283c7c0b5 
								
							 
						 
						
							
							
								
								Merge pull request  #3655  from RajalakshmiSR/zgemmasmp10  
							
							
							
							
							POWER10: Fix ZGEMM testcase failures 
							
						 
						
							2022-06-18 20:52:26 +02:00  
				
					
						
							
							
								 
						
							
								f191bc652b 
								
							 
						 
						
							
							
								
								POWER10: Fix ZGEMM testcase failures  
							
							
							
							
							This patch fixes storing and restoring non-volatile registers
in the POWER10 zgemm kernel. 
							
						 
						
							2022-06-17 08:18:08 -05:00  
				
					
						
							
							
								 
						
							
								8419d538ff 
								
							 
						 
						
							
							
								
								POWER10: convert dgemv inline assembly  
							
							
							
							
							This patch makes use of compiler builtins and matches the performance
of the assembly version. Tested with clang14 and gcc12. 
							
						 
						
							2022-06-09 10:42:57 -05:00  
				
					
						
							
							
								 
						
							
								5e9a912591 
								
							 
						 
						
							
							
								
								Merge branch 'develop' into risc-v  
							
							
							
						 
						
							2022-06-06 14:12:09 +08:00  
				
					
						
							
							
								 
						
							
								968e1f51d8 
								
							 
						 
						
							
							
								
								Update RISC-V Intrinsic API.  
							
							
							
						 
						
							2022-06-06 13:52:21 +08:00  
				
					
						
							
							
								 
						
							
								1bb7993a97 
								
							 
						 
						
							
							
								
								Fix MSVC ARM64 build. Add generic kernel for ARM64  
							
							
							
						 
						
							2022-06-02 16:53:54 +02:00  
				
					
						
							
							
								 
						
							
								dc49edd4e6 
								
							 
						 
						
							
							
								
								Revert "roll back DGEMM kernel ... for DYNAMIC_ARCH"  
							
							
							
						 
						
							2022-05-20 11:23:30 +02:00  
				
					
						
							
							
								 
						
							
								b62173c5a0 
								
							 
						 
						
							
							
								
								POWER10: Changing store instructions for Level1 functions  
							
							
							
							
							This patch changes each 32-byte store to two 16-byte stores
to fix a recent degradation caused by 32-byte stores. 
							
						 
						
							2022-05-12 11:17:33 -05:00  
				
					
						
							
							
								 
						
							
								84cb58b7fb 
								
							 
						 
						
							
							
								
								Fix generator rules for ?laswp_ncopy and ?neg_tcopy  
							
							
							
						 
						
							2022-04-30 15:28:38 +02:00  
				
					
						
							
							
								 
						
							
								05dcfa176e 
								
							 
						 
						
							
							
								
								fix undefined prefetchsizes  
							
							
							
						 
						
							2022-04-16 10:04:27 +02:00  
				
					
						
							
							
								 
						
							
								2bbb9f05c7 
								
							 
						 
						
							
							
								
								fix undefined prefetchsize  
							
							
							
						 
						
							2022-04-16 10:00:10 +02:00  
				
					
						
							
							
								 
						
							
								115bc9b98f 
								
							 
						 
						
							
							
								
								CortexX1 is ARMV8 like A7x  
							
							
							
						 
						
							2022-03-28 17:28:29 +02:00  
				
					
						
							
							
								 
						
							
								b3b4672c30 
								
							 
						 
						
							
							
								
								Add initial support for Phytium FT2000 series and ARMV9 Cortex 510/710/X1/X2  
							
							
							
						 
						
							2022-03-27 15:29:20 +02:00  
				
					
						
							
							
								 
						
							
								40302558ed 
								
							 
						 
						
							
							
								
								Remove extraneous (and wrong) definition of sbgemm_r on x86_64  
							
							
							
						 
						
							2022-03-23 20:05:32 +01:00  
				
					
						
							
							
								 
						
							
								5cc1111383 
								
							 
						 
						
							
							
								
								fix unsafe read of Y in assembly kernel  
							
							
							
						 
						
							2022-03-11 11:56:33 -06:00  
				
					
						
							
							
								 
						
							
								45786b05da 
								
							 
						 
						
							
							
								
								Merge branch 'develop' into risc-v  
							
							
							
						 
						
							2022-02-28 11:48:02 +08:00  
				
					
						
							
							
								 
						
							
								225683218c 
								
							 
						 
						
							
							
								
								Small Matrix: use proper inline asm input constraint for AVX512 mask  
							
							
							
						 
						
							2022-02-28 03:22:31 +00:00  
				
					
						
							
							
								 
						
							
								9c626e466e 
								
							 
						 
						
							
							
								
								really fix definition of SHUFFLE_MAGIC_NO  
							
							
							
						 
						
							2022-02-25 15:36:02 +01:00  
				
					
						
							
							
								 
						
							
								0698212c8c 
								
							 
						 
						
							
							
								
								Remove stray $  
							
							
							
						 
						
							2022-02-25 15:33:02 +01:00  
				
					
						
							
							
								 
						
							
								9d7429406f 
								
							 
						 
						
							
							
								
								Declare SHUFFLE_MAGIC_NO as const to placate clang  
							
							
							
						 
						
							2022-02-25 10:05:36 +01:00  
				
					
						
							
							
								 
						
							
								d9894f45d3 
								
							 
						 
						
							
							
								
								Define sbgemm_r to fix DYNAMIC_ARCH builds  
							
							
							
						 
						
							2022-02-25 10:04:00 +01:00  
				
					
						
							
							
								 
						
							
								522f809825 
								
							 
						 
						
							
							
								
								Merge pull request  #3542  from martin-frbg/issue3540  
							
							
							
							
							Fix compilation for CooperLake on Windows/clang 
							
						 
						
							2022-02-24 00:00:00 +01:00  
				
					
						
							
							
								 
						
							
								abbc947edb 
								
							 
						 
						
							
							
								
								Fix compilation of Skylake AVX512 kernels with GCC 6  
							
							
							
						 
						
							2022-02-23 22:51:59 +00:00  
				
					
						
							
							
								 
						
							
								c62f8e2c01 
								
							 
						 
						
							
							
								
								Prevent compiler attempts to use k0 as mask register  
							
							
							
						 
						
							2022-02-23 20:12:20 +01:00  
				
					
						
							
							
								 
						
							
								80eb581c83 
								
							 
						 
						
							
							
								
								Fix non-portable u_int64_t  
							
							
							
						 
						
							2022-02-23 20:10:59 +01:00  
				
					
						
							
							
								 
						
							
								73ffabe6ba 
								
							 
						 
						
							
							
								
								Guard uses of _mm512_reduce_add_p?  
							
							
							
						 
						
							2022-02-23 20:06:14 +01:00  
				
					
						
							
							
								 
						
							
								7656aba00e 
								
							 
						 
						
							
							
								
								Merge pull request  #3493  from martin-frbg/casts+cleanup  
							
							
							
							
							WIP casts and cleanups 
							
						 
						
							2022-02-06 23:55:06 +01:00  
				
					
						
							
							
								 
						
							
								addc2a7aaa 
								
							 
						 
						
							
							
								
								Add proper defaults for IMIN/IMAX  
							
							
							
						 
						
							2022-01-27 19:56:32 +01:00  
				
					
						
							
							
								 
						
							
								299d4d70a3 
								
							 
						 
						
							
							
								
								Add default KERNEL file for Elbrus E2K arch  
							
							
							
						 
						
							2022-01-22 18:59:36 +01:00  
				
					
						
							
							
								 
						
							
								3492bea602 
								
							 
						 
						
							
							
								
								Create Makefile  
							
							
							
						 
						
							2022-01-22 18:57:28 +01:00  
				
					
						
							
							
								 
						
							
								898cf5faf3 
								
							 
						 
						
							
							
								
								Add Elbrus e2k architecture support  
							
							
							
						 
						
							2022-01-22 18:55:10 +01:00  
				
					
						
							
							
								 
						
							
								c1c0d5ce1d 
								
							 
						 
						
							
							
								
								Merge pull request  #3492  from binebrank/arm_sve_zgemm  
							
							
							
							
							SVE zgemm&cgemm (and other BLAS 3 complex) 
							
						 
						
							2022-01-18 21:36:33 +01:00  
				
					
						
							
							
								 
						
							
								19d435b1b3 
								
							 
						 
						
							
							
								
								update armv8sve + contributors  
							
							
							
						 
						
							2022-01-18 08:28:31 +01:00  
				
					
						
							
							
								 
						
							
								f158d59087 
								
							 
						 
						
							
							
								
								adapt CMake  
							
							
							
						 
						
							2022-01-17 22:36:48 +01:00  
				
					
						
							
							
								 
						
							
								b6a445cfd8 
								
							 
						 
						
							
							
								
								adapt Makefile for SVE trsm  
							
							
							
						 
						
							2022-01-16 21:40:56 +01:00  
				
					
						
							
							
								 
						
							
								0fb6cc07bf 
								
							 
						 
						
							
							
								
								fix ztrsm lt/ut copy  
							
							
							
						 
						
							2022-01-16 21:39:57 +01:00  
				
					
						
							
							
								 
						
							
								f1315288a8 
								
							 
						 
						
							
							
								
								add sve ztrsm  
							
							
							
						 
						
							2022-01-15 22:27:25 +01:00  
				
					
						
							
							
								 
						
							
								aaa2b1a861 
								
							 
						 
						
							
							
								
								fix sve dtrsm kernels  
							
							
							
						 
						
							2022-01-15 21:02:14 +01:00  
				
					
						
							
							
								 
						
							
								8071e179f1 
								
							 
						 
						
							
							
								
								add remaining sve trsm copy kernels  
							
							
							
						 
						
							2022-01-11 21:16:38 +01:00  
				
					
						
							
							
								 
						
							
								f87468ac91 
								
							 
						 
						
							
							
								
								trsm_lncopy_sve  
							
							
							
						 
						
							2022-01-10 21:45:37 +01:00  
				
					
						
							
							
								 
						
							
								e8939b3d30 
								
							 
						 
						
							
							
								
								sve trsmRN and trsmRT  
							
							
							
						 
						
							2022-01-10 20:42:20 +01:00  
				
					
						
							
							
								 
						
							
								098672b51b 
								
							 
						 
						
							
							
								
								add trsm_kernel_LT_sve  
							
							
							
						 
						
							2022-01-09 20:11:47 +01:00  
				
					
						
							
							
								 
						
							
								be7e55880c 
								
							 
						 
						
							
							
								
								sve trsm_kernel_LN  
							
							
							
						 
						
							2022-01-09 19:40:04 +01:00  
				
					
						
							
							
								 
						
							
								b6b024232d 
								
							 
						 
						
							
							
								
								Merge pull request  #3508  from snadampal/v1_n2  
							
							
							
							
							OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics 
							
						 
						
							2022-01-09 14:50:26 +01:00  
				
					
						
							
							
								 
						
							
								19c8f615dc 
								
							 
						 
						
							
							
								
								OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics  
							
							
							
						 
						
							2022-01-07 00:28:17 +00:00  
				
					
						
							
							
								 
						
							
								bb33446b40 
								
							 
						 
						
							
							
								
								fix makefile.L3  
							
							
							
						 
						
							2022-01-06 10:26:11 +01:00  
				
					
						
							
							
								 
						
							
								f33543d029 
								
							 
						 
						
							
							
								
								combine zchemm into single file  
							
							
							
						 
						
							2022-01-05 14:42:37 +01:00  
				
					
						
							
							
								 
						
							
								0c91d043ae 
								
							 
						 
						
							
							
								
								adapt CMake for SVE  
							
							
							
						 
						
							2022-01-05 14:36:39 +01:00  
				
					
						
							
							
								 
						
							
								39ab219704 
								
							 
						 
						
							
							
								
								sve copy functions for cgemm chemm zsymm  
							
							
							
						 
						
							2022-01-05 09:12:22 +01:00  
				
					
						
							
							
								 
						
							
								18102ae8c3 
								
							 
						 
						
							
							
								
								add cgemm ctrmm sve kernels  
							
							
							
						 
						
							2022-01-05 09:09:18 +01:00  
				
					
						
							
							
								 
						
							
								87537b8c55 
								
							 
						 
						
							
							
								
								modify sve zgemmcopy kernels  
							
							
							
						 
						
							2022-01-05 09:07:28 +01:00  
				
					
						
							
							
								 
						
							
								d30157d891 
								
							 
						 
						
							
							
								
								update configuration of kernels for A64FX and ARMV8SVE  
							
							
							
						 
						
							2022-01-05 09:00:54 +01:00  
				
					
						
							
							
								 
						
							
								07fa6fa3b1 
								
							 
						 
						
							
							
								
								configure Makefile for sve  
							
							
							
						 
						
							2022-01-05 08:57:51 +01:00  
				
					
						
							
							
								 
						
							
								2e2c02b762 
								
							 
						 
						
							
							
								
								fix sve ztrmm kernel  
							
							
							
						 
						
							2022-01-04 14:42:07 +01:00  
				
					
						
							
							
								 
						
							
								68c414d3a6 
								
							 
						 
						
							
							
								
								ztrmm sve copy functions  
							
							
							
						 
						
							2022-01-04 14:40:59 +01:00