Commit Graph

78 Commits

Author SHA1 Message Date
pengxu 680a77fafc Optimized ssymv and dsymv kernel LSX for LoongArch 2024-03-05 20:36:59 +08:00
pengxu 6546600342 Optimized ssymv and dsymv kernel LASX for LoongArch 2024-03-04 16:18:39 +08:00
Martin Kroeker 577d480c62
Merge pull request #4529 from ErnstPeng/feature-branch
Optimized sgemv and dgemv kernel LSX for LoongArch
2024-02-28 13:49:54 +01:00
pengxu b2db064285 Optimized sgemv and dgemv kernel LSX for LoongArch 2024-02-28 18:07:27 +08:00
gxw 8e05c053be LoongArch64:Fixed the failed test cases test_{c/z}gemv_n in test_extensions 2024-02-27 22:19:26 -05:00
gxw 3f22fc2233 LoongArch64: Add zgemv LSX opt 2024-02-27 22:19:04 -05:00
gxw c508a10cf2 LoongArch64: Add cgemv LSX opt 2024-02-27 22:17:30 -05:00
gxw 8dea25ffff LoongArch64: Fixed utest kernel_regress:skx_avx 2024-02-26 02:04:37 -05:00
gxw 990507e3b8 LoongArch64: Opt zgemv with LASX 2024-02-22 11:58:02 +08:00
gxw d51ffec3a2 LoongArch64: Opt cgemv with LASX 2024-02-22 11:56:04 +08:00
pengxu 4787a55c64 Optimized cgemm kernel 16x4 LASX for LoongArch 2024-02-21 15:28:47 +08:00
pengxu fe3da43b7d Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 2024-02-06 11:49:01 +08:00
Martin Kroeker b537528feb
Merge pull request #4480 from XiWeiGu/loongarch64-fixed-{s/d}amin-lsx
LoongArch64: Fixed {s/d}amin LSX optimization
2024-02-05 06:24:50 +01:00
gxw adde725321 LoongArch64: Fixed {s/d}amin LSX optimization 2024-02-04 14:44:47 +08:00
gxw 7bc93d95a1 LoongArch64: Opt {c/z}axpby 2024-02-04 11:23:31 +08:00
gxw 1e1f487dc7 LoongArch64: Fixed {s/d}axpby 2024-02-04 09:41:37 +08:00
Martin Kroeker 98c9ff3194
Merge pull request #4464 from XiWeiGu/loongarch64-zscal
LoongArch64: Handle NAN and INF
2024-01-30 22:53:29 +01:00
gxw 83ce97a4ca LoongArch64: Handle NAN and INF 2024-01-30 17:17:30 +08:00
gxw a79d117405 LoogArch64: Fixed bug for {s/d}amin 2024-01-30 11:32:57 +08:00
gxw 276e3ebf9e LoongArch64: Add dzamax and dzamin opt 2024-01-26 10:03:50 +08:00
pengxu a5d0d21378 loongarch64: Add zgemm and cgemm optimization 2023-12-29 18:06:26 +08:00
gxw 546f13558c loongarch64: Add {c/z}swap and {c/z}sum optimization 2023-12-29 17:30:57 +08:00
Hao Chen edabb93668 loongarch64: Refine axpby optimization functions. 2023-12-29 17:30:57 +08:00
Hao Chen 1ec5dded43 loongarch64: Add c/zrot optimization functions.
Signed-off-by: Hao Chen <chenhao@loongson.cn>
2023-12-29 17:30:57 +08:00
Hao Chen 3c53ded315 loongarch64: Add c/znrm2 optimization functions. 2023-12-29 17:30:57 +08:00
Hao Chen fbd612f8c4 loongarch64: Add ic/zamin optimization functions. 2023-12-29 17:30:57 +08:00
Hao Chen d97272cb35 loongarch64: Add c/zdot optimization functions. 2023-12-29 17:30:57 +08:00
Hao Chen 65a0aeb128 loongarch64: Add c/zcopy optimization functions.
Signed-off-by: Hao Chen <chenhao@loongson.cn>
2023-12-29 17:30:57 +08:00
Hao Chen 2a34fb4b80 loongarch64: Add and refine scal optimization functions.
Signed-off-by: Hao Chen <chenhao@loongson.cn>
2023-12-29 17:30:57 +08:00
Hao Chen 8785e948b5 loongarch64: Add camin optimization function. 2023-12-29 17:30:57 +08:00
Hao Chen 0753848e03 loongarch64: Refine and add axpy optimization functions.
Signed-off-by: Hao Chen <chenhao@loongson.cn>
2023-12-29 17:30:57 +08:00
Hao Chen 06fd5b5995 loongarch64: Add and Refine asum optimization functions. 2023-12-29 17:30:57 +08:00
guxiwei e771be185e Optimize copy functions with lsx.
Signed-off-by: Hao Chen <chenhao@loongson.cn>
2023-12-29 17:30:57 +08:00
Hao Chen 179ed51d3b Add dgemm_kernel_8x4.S file. 2023-12-29 17:30:57 +08:00
Hao Chen 173a65d4e6 loongarch64: Add and refine iamax optimization functions. 2023-12-29 17:30:57 +08:00
zhoupeng ea70e165c7 loongarch64: Refine rot optimization. 2023-12-29 17:30:57 +08:00
zhoupeng 116aee7527 loongarch64: Refine imin optimization. 2023-12-29 17:30:57 +08:00
zhoupeng 8be2654193 loongarch64: Refine imax optimization. 2023-12-29 17:30:57 +08:00
zhoupeng 154baad454 loongarch64: Refine iamin optimization. 2023-12-29 17:30:57 +08:00
Shiyou Yin 36c12c4971 loongarch64: Refine copy,swap,nrm2,sum optimization. 2023-12-29 17:30:57 +08:00
Shiyou Yin c6996a80e9 loongarch64: Refine amax,amin,max,min optimization. 2023-12-29 17:30:57 +08:00
yancheng d32f38fb37 loongarch64: Add optimizations for nrm2. 2023-12-07 14:36:26 +08:00
yancheng f9b468990e loongarch64: Add optimizations for rot. 2023-12-07 14:36:26 +08:00
yancheng c80e7e27d1 loongarch64: Add optimizations for sum and asum. 2023-12-07 14:36:26 +08:00
yancheng d4c96a35a8 loongarch64: Add optimizations for axpy and axpby. 2023-12-07 14:36:26 +08:00
yancheng 360acc0a41 loongarch64: Add optimizations for swap. 2023-12-07 14:36:26 +08:00
yancheng 174c25766b loongarch64: Add optimizations for copy. 2023-12-07 14:36:26 +08:00
yancheng 49829b2b7d loongarch64: Add optimizations for iamin. 2023-12-07 14:36:07 +08:00
yancheng be83f5e4e0 loongarch64: Add optimizations for iamax. 2023-12-07 14:36:07 +08:00
yancheng e3fb2b5afa loongarch64: Add optimizations for imin. 2023-12-07 14:36:07 +08:00