Commit Graph

108 Commits

Author SHA1 Message Date
gxw bb31bbef52 LoongArch64: Opt somatcopy_ct with LASX 2024-10-17 11:45:13 +00:00
gxw b37129341b LoongArch64: Opt somatcopy_cn with LASX 2024-10-17 11:27:55 +00:00
gxw acf6cab304 LoongArch64: Opt somatcopy_rn with LASX 2024-10-17 09:50:02 +00:00
gxw 15edb441bf LoongArch64: Opt somatcopy_rt with LASX 2024-10-17 09:15:42 +00:00
Martin Kroeker 9783dd07ab Rename KERNEL.LOONGSONGENERIC to KERNEL.LA64_GENERIC 2024-10-06 22:43:11 +02:00
Martin Kroeker de421b7764 Merge pull request #4904 from XiWeiGu/la64_cross_cmake (LoongArch64: Enable cmake cross-compilation) 2024-10-03 15:53:57 +02:00
gxw 30af9278dc LoongArch64: Enable cmake cross-compilation 2024-09-29 10:13:30 +08:00
gxw 48698b2b1d LoongArch64: Rename core 2024-09-29 09:35:21 +08:00
Use microarchitecture name instead of meaningless strings to name the core,
the legacy core is still retained.
1. Rename LOONGSONGENERIC to LA64_GENERIC
2. Rename LOONGSON3R5 to LA464
3. Rename LOONGSON2K1000 to LA264
Martin Kroeker e05d98d00a expressly use fld.d/fst.d for floating point registers instead of LD/ST macros 2024-08-15 22:14:29 +02:00
gxw 3f39c8f94f LoongArch: Fixed numpy CI failure 2024-07-15 11:43:08 +08:00
gxw af73ae6208 LoongArch: Fixed issue 4728 2024-06-06 16:43:09 +08:00
gxw 8ab2e9ec65 LoongArch: DGEMM small matrix opt 2024-06-04 16:52:45 +08:00
Martin Kroeker 8da6f7e5f2 Merge pull request #4686 from XiWeiGu/loongarch64_dgemm_kernel_16x6 (Loongarch64: Improving the Performance and Stability of dgemm) 2024-05-10 11:29:12 +02:00
gxw f9a26240a7 loongarch64: Fixed icamax_lsx 2024-05-10 14:16:40 +08:00
gxw cb0f707409 loongarch64: Fixed utest fork:safety 2024-05-10 14:16:36 +08:00
Martin Kroeker b45d8e1ab2 remove stray comma 2024-05-09 12:33:19 +02:00
gxw 6017ad7146 loongarch64: Update dgemm_kernel_16x4 to dgemm_kernel_16x6 2024-05-08 10:10:26 +08:00
Martin Kroeker 992b71fea2 remove stray comma 2024-04-23 21:52:26 +02:00
gxw 7cd438a5ac loongarch64: Fixed clang compilation issues 2024-04-23 19:19:11 +08:00
gxw 96607cbb98 loongarch: Fixed dzamax 2024-03-25 23:17:53 -04:00
Initialize the registers to prevent sporadic errors.
gxw 50869f6ca8 loongarch: Fixed zrot LSX opt 2024-03-19 10:08:11 +08:00
gxw b5eb9d6bac loongarch: Fixed {sc/dz}amax LSX opt 2024-03-19 09:56:11 +08:00
gxw ad13e04669 loongarch: Fixed {s/d/sc/dz}amin LSX opt 2024-03-19 09:18:44 +08:00
gxw bbf82cb624 loongarch: Fixed {s/d}axpby LSX opt 2024-03-18 17:51:42 +08:00
gxw ac460eb42a loongarch: Fixed i{c/z}amin LSX opt 2024-03-18 17:15:58 +08:00
gxw 60e251a1f8 loongarch: Fixed {sc/dz}amax LASX opt 2024-03-16 14:52:17 +08:00
gxw a10dde5554 loongarch: Fixed {s/d/sc/dz}amin LASX opt 2024-03-16 14:52:14 +08:00
gxw 6534d378b7 loongarch: Fixed {s/d/c/z}sum LASX opt 2024-03-16 14:52:10 +08:00
gxw 6159cffc58 loongarch: Fixed i{s/c/z}amin LASX opt 2024-03-16 14:52:06 +08:00
gxw 7d755912b9 loongarch: Fixed {s/d/c/z}axpby LASX opt 2024-03-16 14:51:56 +08:00
pengxu 680a77fafc Optimized ssymv and dsymv kernel LSX for LoongArch 2024-03-05 20:36:59 +08:00
pengxu 6546600342 Optimized ssymv and dsymv kernel LASX for LoongArch 2024-03-04 16:18:39 +08:00
Martin Kroeker 577d480c62 Merge pull request #4529 from ErnstPeng/feature-branch (Optimized sgemv and dgemv kernel LSX for LoongArch) 2024-02-28 13:49:54 +01:00
pengxu b2db064285 Optimized sgemv and dgemv kernel LSX for LoongArch 2024-02-28 18:07:27 +08:00
gxw 8e05c053be LoongArch64:Fixed the failed test cases test_{c/z}gemv_n in test_extensions 2024-02-27 22:19:26 -05:00
gxw 3f22fc2233 LoongArch64: Add zgemv LSX opt 2024-02-27 22:19:04 -05:00
gxw c508a10cf2 LoongArch64: Add cgemv LSX opt 2024-02-27 22:17:30 -05:00
gxw 8dea25ffff LoongArch64: Fixed utest kernel_regress:skx_avx 2024-02-26 02:04:37 -05:00
gxw 990507e3b8 LoongArch64: Opt zgemv with LASX 2024-02-22 11:58:02 +08:00
gxw d51ffec3a2 LoongArch64: Opt cgemv with LASX 2024-02-22 11:56:04 +08:00
pengxu 4787a55c64 Optimized cgemm kernel 16x4 LASX for LoongArch 2024-02-21 15:28:47 +08:00
pengxu fe3da43b7d Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 2024-02-06 11:49:01 +08:00
Martin Kroeker b537528feb Merge pull request #4480 from XiWeiGu/loongarch64-fixed-{s/d}amin-lsx (LoongArch64: Fixed {s/d}amin LSX optimization) 2024-02-05 06:24:50 +01:00
gxw adde725321 LoongArch64: Fixed {s/d}amin LSX optimization 2024-02-04 14:44:47 +08:00
gxw 7bc93d95a1 LoongArch64: Opt {c/z}axpby 2024-02-04 11:23:31 +08:00
gxw 1e1f487dc7 LoongArch64: Fixed {s/d}axpby 2024-02-04 09:41:37 +08:00
Martin Kroeker 98c9ff3194 Merge pull request #4464 from XiWeiGu/loongarch64-zscal (LoongArch64: Handle NAN and INF) 2024-01-30 22:53:29 +01:00
gxw 83ce97a4ca LoongArch64: Handle NAN and INF 2024-01-30 17:17:30 +08:00
gxw a79d117405 LoogArch64: Fixed bug for {s/d}amin 2024-01-30 11:32:57 +08:00
gxw 276e3ebf9e LoongArch64: Add dzamax and dzamin opt 2024-01-26 10:03:50 +08:00