|  Nursultan Zarlyk | 1bb7993a97 | Fix MSVC ARM64 build. Add generic kernel for ARM64 | 2022-06-02 16:53:54 +02:00 | 
				
					
						|  Martin Kroeker | dc49edd4e6 | Revert "roll back DGEMM kernel ... for DYNAMIC_ARCH" | 2022-05-20 11:23:30 +02:00 | 
				
					
						|  Rajalakshmi Srinivasaraghavan | b62173c5a0 | POWER10: Changing store instructions for Level1 functions This patch changes 32 bytes stores to two 16 bytes stores to fix a recent degradation due to 32 bytes stores. | 2022-05-12 11:17:33 -05:00 | 
				
					
						|  Martin Kroeker | 84cb58b7fb | Fix generator rules for ?laswp_ncopy and ?neg_tcopy | 2022-04-30 15:28:38 +02:00 | 
				
					
						|  Martin Kroeker | 05dcfa176e | fix undefined prefetchsizes | 2022-04-16 10:04:27 +02:00 | 
				
					
						|  Martin Kroeker | 2bbb9f05c7 | fix undefined prefetchsize | 2022-04-16 10:00:10 +02:00 | 
				
					
						|  Martin Kroeker | 115bc9b98f | CortexX1 is ARMV8 like A7x | 2022-03-28 17:28:29 +02:00 | 
				
					
						|  Martin Kroeker | b3b4672c30 | Add initial support for Phytium FT2000 series and ARMV9 Cortex 510/710/X1/X2 | 2022-03-27 15:29:20 +02:00 | 
				
					
						|  Martin Kroeker | 40302558ed | Remove extraneous (and wrong) definition of sbgemm_r on x86_64 | 2022-03-23 20:05:32 +01:00 | 
				
					
						|  Caroline Newcombe | 5cc1111383 | fix unsafe read of Y in assembly kernel | 2022-03-11 11:56:33 -06:00 | 
				
					
						|  Xianyi Zhang | 45786b05da | Merge branch 'develop' into risc-v | 2022-02-28 11:48:02 +08:00 | 
				
					
						|  Wangyang Guo | 225683218c | Small Matrix: use proper inline asm input constraint for AVX512 mask | 2022-02-28 03:22:31 +00:00 | 
				
					
						|  Martin Kroeker | 9c626e466e | really fix definition of SHUFFLE_MAGIC_NO | 2022-02-25 15:36:02 +01:00 | 
				
					
						|  Martin Kroeker | 0698212c8c | Remove stray $ | 2022-02-25 15:33:02 +01:00 | 
				
					
						|  Martin Kroeker | 9d7429406f | Declare SHUFFLE_MAGIC_NO as const to placate clang | 2022-02-25 10:05:36 +01:00 | 
				
					
						|  Martin Kroeker | d9894f45d3 | Define sbgemm_r to fix DYNAMIC_ARCH builds | 2022-02-25 10:04:00 +01:00 | 
				
					
						|  Martin Kroeker | 522f809825 | Merge pull request #3542 from martin-frbg/issue3540 Fix compilation for CooperLake on Windows/clang | 2022-02-24 00:00:00 +01:00 | 
				
					
						|  Mosè Giordano | abbc947edb | Fix compilation of Skylake AVX512 kernels with GCC 6 | 2022-02-23 22:51:59 +00:00 | 
				
					
						|  Martin Kroeker | c62f8e2c01 | Prevent compiler attempts to use k0 as mask register | 2022-02-23 20:12:20 +01:00 | 
				
					
						|  Martin Kroeker | 80eb581c83 | Fix non-portable u_int64_t | 2022-02-23 20:10:59 +01:00 | 
				
					
						|  Martin Kroeker | 73ffabe6ba | Guard uses of _mm512_reduce_add_p? | 2022-02-23 20:06:14 +01:00 | 
				
					
						|  Martin Kroeker | 7656aba00e | Merge pull request #3493 from martin-frbg/casts+cleanup WIP casts and cleanups | 2022-02-06 23:55:06 +01:00 | 
				
					
						|  Martin Kroeker | addc2a7aaa | Add proper defaults for IMIN/IMAX | 2022-01-27 19:56:32 +01:00 | 
				
					
						|  Martin Kroeker | 299d4d70a3 | Add default KERNEL file for Elbrus E2K arch | 2022-01-22 18:59:36 +01:00 | 
				
					
						|  Martin Kroeker | 3492bea602 | Create Makefile | 2022-01-22 18:57:28 +01:00 | 
				
					
						|  Martin Kroeker | 898cf5faf3 | Add Elbrus e2k architecture support | 2022-01-22 18:55:10 +01:00 | 
				
					
						|  Martin Kroeker | c1c0d5ce1d | Merge pull request #3492 from binebrank/arm_sve_zgemm SVE zgemm&cgemm (and other BLAS 3 complex) | 2022-01-18 21:36:33 +01:00 | 
				
					
						|  Bine Brank | 19d435b1b3 | update armv8sve + contributors | 2022-01-18 08:28:31 +01:00 | 
				
					
						|  Bine Brank | f158d59087 | adapt CMake | 2022-01-17 22:36:48 +01:00 | 
				
					
						|  Bine Brank | b6a445cfd8 | adapt Makefile for SVE trsm | 2022-01-16 21:40:56 +01:00 | 
				
					
						|  Bine Brank | 0fb6cc07bf | fix ztrsm lt/ut copy | 2022-01-16 21:39:57 +01:00 | 
				
					
						|  Bine Brank | f1315288a8 | add sve ztrsm | 2022-01-15 22:27:25 +01:00 | 
				
					
						|  Bine Brank | aaa2b1a861 | fix sve dtrsm kernels | 2022-01-15 21:02:14 +01:00 | 
				
					
						|  Bine Brank | 8071e179f1 | add remaining sve trsm copy kernels | 2022-01-11 21:16:38 +01:00 | 
				
					
						|  Bine Brank | f87468ac91 | trsm_lncopy_sve | 2022-01-10 21:45:37 +01:00 | 
				
					
						|  Bine Brank | e8939b3d30 | sve trsmRN and trsmRT | 2022-01-10 20:42:20 +01:00 | 
				
					
						|  Bine Brank | 098672b51b | add trsm_kernel_LT_sve | 2022-01-09 20:11:47 +01:00 | 
				
					
						|  Bine Brank | be7e55880c | sve trsm_kernel_LN | 2022-01-09 19:40:04 +01:00 | 
				
					
						|  Martin Kroeker | b6b024232d | Merge pull request #3508 from snadampal/v1_n2 OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics | 2022-01-09 14:50:26 +01:00 | 
				
					
						|  Sunita Nadampalli | 19c8f615dc | OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics | 2022-01-07 00:28:17 +00:00 | 
				
					
						|  Bine Brank | bb33446b40 | fix makefile.L3 | 2022-01-06 10:26:11 +01:00 | 
				
					
						|  Bine Brank | f33543d029 | combine zchemm into single file | 2022-01-05 14:42:37 +01:00 | 
				
					
						|  Bine Brank | 0c91d043ae | adapt CMake for SVE | 2022-01-05 14:36:39 +01:00 | 
				
					
						|  Bine Brank | 39ab219704 | sve copy functions for cgemm chemm zsymm | 2022-01-05 09:12:22 +01:00 | 
				
					
						|  Bine Brank | 18102ae8c3 | add cgemm ctrmm sve kernels | 2022-01-05 09:09:18 +01:00 | 
				
					
						|  Bine Brank | 87537b8c55 | modify sve zgemmcopy kernels | 2022-01-05 09:07:28 +01:00 | 
				
					
						|  Bine Brank | d30157d891 | update configuration of kernels for A64FX and ARMV8SVE | 2022-01-05 09:00:54 +01:00 | 
				
					
						|  Bine Brank | 07fa6fa3b1 | configure Makefile for sve | 2022-01-05 08:57:51 +01:00 | 
				
					
						|  Bine Brank | 2e2c02b762 | fix sve ztrmm kernel | 2022-01-04 14:42:07 +01:00 | 
				
					
						|  Bine Brank | 68c414d3a6 | ztrmm sve copy functions | 2022-01-04 14:40:59 +01:00 | 
				
					
						|  Bine Brank | ce329ab686 | add sve zhemm copy routines | 2022-01-03 15:56:05 +01:00 | 
				
					
						|  Bine Brank | 0140373802 | add sve ztrmm | 2022-01-02 19:15:33 +01:00 | 
				
					
						|  Bine Brank | f7b6912868 | ztrmm sve copy kernels | 2021-12-30 21:00:16 +01:00 | 
				
					
						|  Bine Brank | 40b14e4957 | fix zgemm kernel | 2021-12-29 11:42:04 +01:00 | 
				
					
						|  Bine Brank | 6ec4aab875 | zgemm sve copy routines | 2021-12-26 17:05:46 +01:00 | 
				
					
						|  Bine Brank | 878064f394 | sve zgemm kernel | 2021-12-26 08:44:05 +01:00 | 
				
					
						|  Bine Brank | 683a7548bf | added macros for sve zgemm kernels | 2021-12-25 11:46:41 +01:00 | 
				
					
						|  Martin Kroeker | 7b146e590c | fix function typecast | 2021-12-24 20:01:52 +01:00 | 
				
					
						|  Martin Kroeker | e9a0e52201 | fix function typecast | 2021-12-24 20:00:50 +01:00 | 
				
					
						|  Martin Kroeker | d1ee6ff73f | fix function typecasts | 2021-12-21 18:45:28 +01:00 | 
				
					
						|  Bine Brank | e3c9947c0f | prepare kernel for sve zgemm | 2021-12-21 11:19:27 +01:00 | 
				
					
						|  gxw | 8d9b9c6b2a | loongarch64: Optimize dgemm_kernel | 2021-12-21 09:33:06 +08:00 | 
				
					
						|  Wu Zhigang | 92b7b949dd | fix bug in zscal function memset can not be used in zscal because of the stride parameters. Signed-off-by: Wu Zhigang <zhigang.wu@starfivetech.com> | 2021-12-15 01:23:30 -08:00 | 
				
					
						|  Martin Kroeker | b0a590f4fe | Merge pull request #3475 from wjc404/optimize-A53-dgemm optimize cgemm on ARM cortex A53 & cortex A55 | 2021-12-12 19:09:08 +01:00 | 
				
					
						|  Martin Kroeker | f4d1f0333b | Merge pull request #3474 from rafaelcfsousa/rafael/cmake_power Add CMake support for Power | 2021-12-12 19:08:27 +01:00 | 
				
					
						|  Jia-Chen | b610d2de37 | optimize cgemm on ARM cortex A53 & cortex A55 | 2021-12-12 17:22:52 +08:00 | 
				
					
						|  Martin Kroeker | 697e2752d7 | Merge pull request #3464 from binebrank/arm_sve_sgemm Add sgemm part for Arm SVE | 2021-12-11 20:35:22 +01:00 | 
				
					
						|  Bine Brank | a8f62a347b | fix UNROLL_MN and add to targets for SVE | 2021-12-11 16:37:23 +01:00 | 
				
					
						|  Bine Brank | 774267fdac | adjust Makefile.L3 for SVE | 2021-12-11 16:35:08 +01:00 | 
				
					
						|  Rafael Cardoso Fernandes Sousa | 23a7561353 | Fix error cmake (small kernels) | 2021-12-09 09:57:39 -06:00 | 
				
					
						|  Martin Kroeker | 5378046abd | roll back DGEMM kernels to 4x8 when compiling for DYNAMIC_ARCH | 2021-12-06 19:43:54 +01:00 | 
				
					
						|  Bine Brank | a1fea1fe2a | sgemm v2x8 SVE kernel | 2021-12-05 18:47:29 +01:00 | 
				
					
						|  Bine Brank | abe1ce3434 | strmm sve v1x8 kernel | 2021-12-05 14:03:08 +01:00 | 
				
					
						|  Martin Kroeker | 54d321d742 | Merge pull request #3466 from rafaelcfsousa/rafael/small_matrix_p10 [POWER] Add small matrix for sgemm/dgemm on Power10 | 2021-12-03 12:12:20 +01:00 | 
				
					
						|  Martin Kroeker | 0882db30a2 | Merge pull request #3455 from cenewcombe/develop Fix unsafe read during final iteration of zsymv_L_sse2.S | 2021-12-03 10:01:20 +01:00 | 
				
					
						|  Bine Brank | 0de36f7b5c | trmm sve copy fucntions for single precision | 2021-11-29 21:25:05 +01:00 | 
				
					
						|  Rafael Cardoso Fernandes Sousa | c78fdcc80d | [POWER] Add support for SMALL_MATRIX_OPT | 2021-11-28 12:41:16 -06:00 | 
				
					
						|  Bine Brank | 86ae89bf33 | add sgemm kernel and copy functions for sgemm and ssymm | 2021-11-28 18:12:47 +01:00 | 
				
					
						|  Martin Kroeker | 454edd741c | Merge pull request #3425 from binebrank/arm_sve_dgemm Add dgemm kernel for arm64 SVE | 2021-11-26 16:14:55 +01:00 | 
				
					
						|  Martin Kroeker | bcfbdc81b2 | Merge pull request #3459 from rafaelcfsousa/fix_cmake Fix issues when building OpenBLAS with cmake | 2021-11-26 15:19:24 +01:00 | 
				
					
						|  Bine Brank | 1af73ce38e | Adapt CMake for SVE | 2021-11-26 10:35:01 +01:00 | 
				
					
						|  Martin Kroeker | e7fca060db | Merge pull request #3457 from wjc404/optimize-A53-dgemm MOD: optimize zgemm on cortex-A53/cortex-A55 | 2021-11-26 10:30:47 +01:00 | 
				
					
						|  Jia-Chen | 5c1cd5e0c2 | MOD: add comments to a53 zgemm kernel | 2021-11-25 22:48:48 +08:00 | 
				
					
						|  Rafael Cardoso Fernandes Sousa | d5c9353f1b | Modify the order that cmake set the KERNEL variables (generic now is fallback) | 2021-11-24 20:08:35 -06:00 | 
				
					
						|  Jia-Chen | 9f59b19fcd | MOD: optimize zgemm on cortex-A53/cortex-A55 | 2021-11-24 21:51:45 +08:00 | 
				
					
						|  Bine Brank | 531a28b6a0 | removed unused code (compiler warnings) | 2021-11-22 10:12:34 +01:00 | 
				
					
						|  Bine Brank | 9b9cb90bb1 | modify Makefile for SVE copy | 2021-11-22 09:54:20 +01:00 | 
				
					
						|  Bine Brank | 9388f05a3c | configure SVE Makefile | 2021-11-21 18:33:43 +01:00 | 
				
					
						|  Bine Brank | b58d4f31ab | some clean-up & commentary | 2021-11-21 14:56:27 +01:00 | 
				
					
						|  Martin Kroeker | b7df500106 | Add generic mips32 target | 2021-11-20 17:31:51 +01:00 | 
				
					
						|  Bine Brank | e6ed4be02e | symm SVE copy rutines | 2021-11-20 16:35:29 +01:00 | 
				
					
						|  Caroline Newcombe | feeb8283a5 | Fix unsafe read during final iteration of zsymv_L_sse2.S | 2021-11-19 14:29:32 -06:00 | 
				
					
						|  Jia-Chen | 302f22693a | MOD: optimize normal DGEMM on ARMV8 cortex-A53 & cortex-A55 | 2021-11-18 21:14:43 +08:00 | 
				
					
						|  Bine Brank | 3c7eed0e53 | add remaining trmm copy rutines for SVE | 2021-11-14 16:00:10 +01:00 | 
				
					
						|  Bine Brank | 7d996b1c36 | dtrmm_utcopy sve function | 2021-11-13 18:48:53 +01:00 | 
				
					
						|  Bine Brank | ab7917910d | add v2x8 kernel + fix sve dtrmm | 2021-11-07 20:37:51 +01:00 | 
				
					
						|  Bine Brank | 7093372e32 | add ARMV8SVE target | 2021-11-01 22:53:21 +01:00 | 
				
					
						|  Bine Brank | a8fbdbac34 | fix sve dgemm kernel + sve dtrmm | 2021-10-31 10:24:25 +01:00 | 
				
					
						|  Bine Brank | 746b4f0f17 | added SVE ncopy and tcopy | 2021-10-30 12:11:44 +02:00 | 
				
					
						|  Bine Brank | 1a10d3e09d | add sve dgemm prototype | 2021-10-27 16:37:18 +02:00 | 
				
					
						|  Martin Kroeker | 22bf5c27ba | Add basic support for the Fujitsu A64FX (#3415) * Add initial support for Fujitsu A64FX as generic ARMV8 | 2021-10-18 15:00:19 +02:00 | 
				
					
						|  Wangyang Guo | 63a103ba6e | sbgemm: spr: disable small matrix path by default | 2021-10-17 19:08:03 -07:00 | 
				
					
						|  Wangyang Guo | 82194ea9d2 | sbgemm: spr: implement otcopy_16 | 2021-10-17 19:08:03 -07:00 | 
				
					
						|  Wangyang Guo | 8632380a96 | sbgemm: spr: reuse ncopy_16 from cooperlake as incopy | 2021-10-17 19:08:03 -07:00 | 
				
					
						|  Wangyang Guo | 6bc8204ce5 | sbgemm: spr: optimization for tmp_c buffer | 2021-10-17 19:08:03 -07:00 | 
				
					
						|  Wangyang Guo | f018aa342a | sbgemm: spr: kernel handle alpha != 1.0 | 2021-10-17 19:08:03 -07:00 | 
				
					
						|  Wangyang Guo | a52456b168 | sbgemm: spr: oncopy: use tile load/store instead | 2021-10-17 19:08:03 -07:00 | 
				
					
						|  Wangyang Guo | f2485352a6 | sbgemm: spr: only load A once in tail_k handling | 2021-10-17 19:08:03 -07:00 | 
				
					
						|  Wangyang Guo | 9ab33228bb | sbgemm: spr: process k2 and odd k at the same time | 2021-10-17 19:08:03 -07:00 | 
				
					
						|  Wangyang Guo | 10d52646e2 | sbgemm: spr: oncopy: avoid handling too much pointer at a time | 2021-10-17 19:08:03 -07:00 | 
				
					
						|  Wangyang Guo | 88154ed02d | sbgemm: spr: reduce tile conf loading by seperate tail k handling | 2021-10-17 19:08:03 -07:00 | 
				
					
						|  Wangyang Guo | a70bfb52d5 | sbgemm: spr: kernel works for NN case when alpha is 1.0 | 2021-10-17 19:08:03 -07:00 | 
				
					
						|  Wangyang Guo | 6051c86741 | sbgemm: spr: kernel works for m32 in NN case | 2021-10-17 19:08:03 -07:00 | 
				
					
						|  Wangyang Guo | d0b253ac6e | sbgemm: spr: implement oncopy_16 | 2021-10-17 19:08:03 -07:00 | 
				
					
						|  Wangyang Guo | 1d48b7cb16 | sbgemm: spr: add dummy source files | 2021-10-17 19:08:03 -07:00 | 
				
					
						|  Wangyang Guo | 3dc6052c7e | initial support for Sapphire Rapids platform | 2021-10-12 01:30:40 -07:00 | 
				
					
						|  Martin Kroeker | 8c20ca345a | Use Neoverse's current mix of ThunderX2 kernels for Vortex as well | 2021-10-06 11:06:43 +02:00 | 
				
					
						|  Martin Kroeker | 8e4c209002 | Merge pull request #3398 from kavanabhat/aix_p10_gnuas Big Endian Changes for Power10 kernels | 2021-10-05 18:59:47 +02:00 | 
				
					
						|  kavanabhat | 9cc95e5657 | AIX changes for P10 with GNU Compiler | 2021-10-01 05:18:35 -05:00 | 
				
					
						|  kavanabhat | fe3c778c51 | AIX changes for P10 with GNU Compiler | 2021-09-30 06:06:27 -05:00 | 
				
					
						|  Wangyang Guo | ee5ca8a328 | x86_64: BFLOAT16: fix build warning | 2021-09-28 18:30:06 +08:00 | 
				
					
						|  Martin Kroeker | 90cc944625 | Move alphaI to x22 to leave x18 unused (reserved on OSX) | 2021-09-17 09:53:18 +02:00 | 
				
					
						|  Martin Kroeker | 590fbff06e | move alpha to x19/x20 to leave x18 unused for OSX | 2021-09-17 09:42:17 +02:00 | 
				
					
						|  Martin Kroeker | 380940271b | Move temp to x21 to leave x18 unused (reserved on OSX) | 2021-09-17 09:28:19 +02:00 | 
				
					
						|  Martin Kroeker | 7d75177446 | Move temp to x21 to leave x18 unused (reserved on OSX) | 2021-09-17 09:24:11 +02:00 | 
				
					
						|  Martin Kroeker | 0a4ac4b585 | Use x21 for I to leave x18 unused (reserved on OSX) | 2021-09-17 09:19:51 +02:00 | 
				
					
						|  Martin Kroeker | 7d4a221579 | Remove unused TEMP2 and reshuffle to leave x18 unused (reserved on OSX) | 2021-09-17 09:18:25 +02:00 | 
				
					
						|  Martin Kroeker | d3a9c7ef7f | Merge pull request #3382 from rafaelcfsousa/rafael/cwarnings [POWER] Remove unused variable warnings. | 2021-09-17 09:15:16 +02:00 | 
				
					
						|  Martin Kroeker | 8dfa61a61c | Initialize abs_mask1 with itself to silence a gcc warning | 2021-09-15 22:11:35 +02:00 | 
				
					
						|  Martin Kroeker | 99aa10b3ff | Initialize abs_mask1 with itself to silence a gcc warning actual initialization is via the _mm_cmpeq_ep18, which I've seen claimed to be the fastest way to set an xmm register to all 1s | 2021-09-15 22:10:43 +02:00 | 
				
					
						|  Rafael Cardoso Fernandes Sousa | b751edf624 | Fix unused variable warnings on Power | 2021-09-15 13:36:07 -05:00 | 
				
					
						|  Martin Kroeker | 80346b8813 | Merge pull request #3379 from martin-frbg/issue3369-2 Add casts to fix compiler warnings for SkylakeX sasum/dasum | 2021-09-15 07:18:57 +02:00 | 
				
					
						|  Martin Kroeker | ce036a2fc0 | Add casts | 2021-09-14 21:41:53 +02:00 | 
				
					
						|  Martin Kroeker | ddf106f769 | Add dedicated entries for BFLOAT16 kernels | 2021-09-14 16:17:18 +02:00 | 
				
					
						|  Martin Kroeker | af8843875a | Merge pull request #3376 from martin-frbg/issue3370 Fix a few harmless compiler warnings | 2021-09-12 00:01:31 +02:00 | 
				
					
						|  Martin Kroeker | 0925dfe2c9 | One instance of kernel_4x1 is used even on SKX | 2021-09-11 15:30:19 +02:00 | 
				
					
						|  Martin Kroeker | 7d873a329f | Add ifdefs around conditionally used functions | 2021-09-11 14:38:47 +02:00 | 
				
					
						|  Martin Kroeker | ef24712030 | Move a conditionally used variable | 2021-09-11 14:37:44 +02:00 | 
				
					
						|  Martin Kroeker | d17238599b | Add casts | 2021-09-11 13:38:28 +02:00 | 
				
					
						|  Wangyang Guo | 59a1114d03 | sbgemm: cooperlake: tuning for small matrix | 2021-09-07 21:30:46 +08:00 | 
				
					
						|  Wangyang Guo | 682d66555d | sbgemm: cooperlake: implement ncopy_16 | 2021-09-07 21:30:46 +08:00 | 
				
					
						|  Wangyang Guo | beccb83b16 | sbgemm: cooperlake: add n24 kernel for tcopy_4 | 2021-09-07 21:30:46 +08:00 | 
				
					
						|  Wangyang Guo | 5fcacad32b | sbgemm: cooperlake: implement tcopy_4 | 2021-09-07 21:30:46 +08:00 | 
				
					
						|  Wangyang Guo | bb1c4fa5bd | sbgemm: cooperlake: prefetch A & B | 2021-09-07 21:30:46 +08:00 | 
				
					
						|  Wangyang Guo | 7a2d1601ec | sbgemm: cooperlake: unroll core loop by 2 | 2021-09-07 21:30:46 +08:00 | 
				
					
						|  Wangyang Guo | 45fdf951b6 | sbgemm: cooperlake: reorder ptr increase for performance | 2021-09-07 21:30:46 +08:00 | 
				
					
						|  Wangyang Guo | cece3541ab | sbgemm: cooperlake: fix bug in m64n12 | 2021-09-07 21:30:46 +08:00 | 
				
					
						|  Wangyang Guo | 9df0953cde | sbgemm: cooperlake: kernel works for NN | 2021-09-07 21:30:45 +08:00 | 
				
					
						|  Wangyang Guo | 2ec9f3a8aa | sbgemm: cooperlake: change kernel size to 16x4 | 2021-09-07 21:30:45 +08:00 | 
				
					
						|  Wangyang Guo | ef8f5fecc8 | sbgemm: cooperlake: implement sbgemm_tcopy_32 | 2021-09-07 21:30:45 +08:00 |