OpenBLAS/kernel/loongarch64
pengxu a5d0d21378 loongarch64: Add zgemm and cgemm optimization 2023-12-29 18:06:26 +08:00
KERNEL LoongArch64: Add DYNAMIC_ARCH support 2022-07-28 14:28:45 +08:00
KERNEL.LOONGSON2K1000 loongarch64: Add zgemm and cgemm optimization 2023-12-29 18:06:26 +08:00
KERNEL.LOONGSON3R5 loongarch64: Add zgemm and cgemm optimization 2023-12-29 18:06:26 +08:00
KERNEL.generic LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S V2 2023-08-07 11:20:42 +08:00
Makefile Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
amax.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
amax_lasx.S loongarch64: Refine amax,amin,max,min optimization. 2023-12-29 17:30:57 +08:00
amax_lsx.S loongarch64: Refine amax,amin,max,min optimization. 2023-12-29 17:30:57 +08:00
amin.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
amin_lasx.S loongarch64: Refine amax,amin,max,min optimization. 2023-12-29 17:30:57 +08:00
amin_lsx.S loongarch64: Refine amax,amin,max,min optimization. 2023-12-29 17:30:57 +08:00
asum.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
asum_lasx.S loongarch64: Add and Refine asum optimization functions. 2023-12-29 17:30:57 +08:00
asum_lsx.S loongarch64: Add and Refine asum optimization functions. 2023-12-29 17:30:57 +08:00
axpby_lasx.S loongarch64: Refine axpby optimization functions. 2023-12-29 17:30:57 +08:00
axpby_lsx.S loongarch64: Refine axpby optimization functions. 2023-12-29 17:30:57 +08:00
axpy_lasx.S loongarch64: Refine and add axpy optimization functions. 2023-12-29 17:30:57 +08:00
axpy_lsx.S loongarch64: Refine and add axpy optimization functions. 2023-12-29 17:30:57 +08:00
camax_lasx.S loongarch64: Add and refine iamax optimization functions. 2023-12-29 17:30:57 +08:00
camax_lsx.S loongarch64: Add and refine iamax optimization functions. 2023-12-29 17:30:57 +08:00
camin_lasx.S loongarch64: Add camin optimization function. 2023-12-29 17:30:57 +08:00
camin_lsx.S loongarch64: Add camin optimization function. 2023-12-29 17:30:57 +08:00
casum_lasx.S loongarch64: Add and Refine asum optimization functions. 2023-12-29 17:30:57 +08:00
casum_lsx.S loongarch64: Add and Refine asum optimization functions. 2023-12-29 17:30:57 +08:00
caxpy_lasx.S loongarch64: Refine and add axpy optimization functions. 2023-12-29 17:30:57 +08:00
caxpy_lsx.S loongarch64: Refine and add axpy optimization functions. 2023-12-29 17:30:57 +08:00
ccopy_lasx.S loongarch64: Add c/zcopy optimization functions. 2023-12-29 17:30:57 +08:00
ccopy_lsx.S loongarch64: Add c/zcopy optimization functions. 2023-12-29 17:30:57 +08:00
cdot_lasx.S loongarch64: Add c/zdot optimization functions. 2023-12-29 17:30:57 +08:00
cdot_lsx.S loongarch64: Add c/zdot optimization functions. 2023-12-29 17:30:57 +08:00
cgemm_kernel_2x2_lasx.S loongarch64: Add zgemm and cgemm optimization 2023-12-29 18:06:26 +08:00
cgemm_kernel_2x2_lsx.S loongarch64: Add zgemm and cgemm optimization 2023-12-29 18:06:26 +08:00
cgemm_ncopy_2_lasx.S loongarch64: Add zgemm and cgemm optimization 2023-12-29 18:06:26 +08:00
cgemm_ncopy_2_lsx.S loongarch64: Add zgemm and cgemm optimization 2023-12-29 18:06:26 +08:00
cgemm_tcopy_2_lasx.S loongarch64: Add zgemm and cgemm optimization 2023-12-29 18:06:26 +08:00
cgemm_tcopy_2_lsx.S loongarch64: Add zgemm and cgemm optimization 2023-12-29 18:06:26 +08:00
cnrm2.S Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:50:44 +02:00
cnrm2_lasx.S loongarch64: Add c/znrm2 optimization functions. 2023-12-29 17:30:57 +08:00
cnrm2_lsx.S loongarch64: Add c/znrm2 optimization functions. 2023-12-29 17:30:57 +08:00
copy.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
copy_lasx.S loongarch64: Refine copy,swap,nrm2,sum optimization. 2023-12-29 17:30:57 +08:00
copy_lsx.S loongarch64: Refine copy,swap,nrm2,sum optimization. 2023-12-29 17:30:57 +08:00
crot_lasx.S loongarch64: Add c/zrot optimization functions. 2023-12-29 17:30:57 +08:00
crot_lsx.S loongarch64: Add c/zrot optimization functions. 2023-12-29 17:30:57 +08:00
cscal_lasx.S loongarch64: Add and refine scal optimization functions. 2023-12-29 17:30:57 +08:00
cscal_lsx.S loongarch64: Add and refine scal optimization functions. 2023-12-29 17:30:57 +08:00
csum_lasx.S loongarch64: Add {c/z}swap and {c/z}sum optimization 2023-12-29 17:30:57 +08:00
csum_lsx.S loongarch64: Add {c/z}swap and {c/z}sum optimization 2023-12-29 17:30:57 +08:00
cswap_lasx.S loongarch64: Add {c/z}swap and {c/z}sum optimization 2023-12-29 17:30:57 +08:00
cswap_lsx.S loongarch64: Add {c/z}swap and {c/z}sum optimization 2023-12-29 17:30:57 +08:00
dgemm_kernel_8x4.S Add dgemm_kernel_8x4.S file. 2023-12-29 17:30:57 +08:00
dgemm_kernel_16x4.S LoongArch64: Update dgemm kernel 2023-08-07 11:06:52 +08:00
dgemm_ncopy_4.S loongarch64: Optimize dgemm_kernel 2021-12-21 09:33:06 +08:00
dgemm_ncopy_4_lsx.S Optimize copy functions with lsx. 2023-12-29 17:30:57 +08:00
dgemm_ncopy_8_lsx.S Optimize copy functions with lsx. 2023-12-29 17:30:57 +08:00
dgemm_ncopy_16.S loongarch64: Optimize dgemm_kernel 2021-12-21 09:33:06 +08:00
dgemm_tcopy_4.S loongarch64: Optimize dgemm_kernel 2021-12-21 09:33:06 +08:00
dgemm_tcopy_4_lsx.S Optimize copy functions with lsx. 2023-12-29 17:30:57 +08:00
dgemm_tcopy_8_lsx.S Optimize copy functions with lsx. 2023-12-29 17:30:57 +08:00
dgemm_tcopy_16.S loongarch64: Optimize dgemm_kernel 2021-12-21 09:33:06 +08:00
dgemv_n_8_lasx.S LoongArch64: Fixed compilation issues when enable DYNAMIC_ARCH 2023-09-27 10:05:27 +08:00
dgemv_t_8_lasx.S LoongArch64: Fixed compilation issues when enable DYNAMIC_ARCH 2023-09-27 10:05:27 +08:00
dnrm2.S Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:50:44 +02:00
dnrm2_lasx.S loongarch64: Refine copy,swap,nrm2,sum optimization. 2023-12-29 17:30:57 +08:00
dnrm2_lsx.S loongarch64: Refine copy,swap,nrm2,sum optimization. 2023-12-29 17:30:57 +08:00
dot.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
dot_lasx.S loongarch: Add optimization for dsdot kernel. 2023-11-28 20:24:16 +08:00
dot_lsx.S loongarch: Add LSX optimization for dot. 2023-11-28 20:24:18 +08:00
dscal_lasx.S loongarch64: Add optimizations for scal. 2023-12-07 14:36:07 +08:00
dscal_lsx.S loongarch64: Add optimizations for scal. 2023-12-07 14:36:07 +08:00
dtrsm_kernel_LN_16x4_lasx.S LoongArch64: Add dtrsm kernel 2023-09-26 15:45:14 +08:00
dtrsm_kernel_LT_16x4_lasx.S LoongArch64: Add dtrsm kernel 2023-09-26 15:45:14 +08:00
dtrsm_kernel_RN_16x4_lasx.S LoongArch64: Add dtrsm kernel 2023-09-26 15:45:14 +08:00
dtrsm_kernel_RT_16x4_lasx.S LoongArch64: Add dtrsm kernel 2023-09-26 15:45:14 +08:00
dtrsm_kernel_macro.S LoongArch64: Add dtrsm kernel 2023-09-26 15:45:14 +08:00
gemm_kernel.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
gemv_n.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
gemv_t.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
iamax.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
iamax_lasx.S loongarch64: Add and refine iamax optimization functions. 2023-12-29 17:30:57 +08:00
iamax_lsx.S loongarch64: Add and refine iamax optimization functions. 2023-12-29 17:30:57 +08:00
iamin.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
iamin_lasx.S loongarch64: Refine iamin optimization. 2023-12-29 17:30:57 +08:00
iamin_lsx.S loongarch64: Refine iamin optimization. 2023-12-29 17:30:57 +08:00
icamax_lasx.S loongarch64: Add and refine iamax optimization functions. 2023-12-29 17:30:57 +08:00
icamax_lsx.S loongarch64: Add and refine iamax optimization functions. 2023-12-29 17:30:57 +08:00
icamin_lasx.S loongarch64: Add ic/zamin optimization functions. 2023-12-29 17:30:57 +08:00
icamin_lsx.S loongarch64: Add ic/zamin optimization functions. 2023-12-29 17:30:57 +08:00
imax_lasx.S loongarch64: Refine imax optimization. 2023-12-29 17:30:57 +08:00
imax_lsx.S loongarch64: Refine imax optimization. 2023-12-29 17:30:57 +08:00
imin_lasx.S loongarch64: Refine imin optimization. 2023-12-29 17:30:57 +08:00
imin_lsx.S loongarch64: Refine imin optimization. 2023-12-29 17:30:57 +08:00
izamax.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
izamin.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
loongarch64_asm.S LoongArch64: Compatible with early internal toolchain 2023-08-31 16:55:29 +08:00
max.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
max_lasx.S loongarch64: Refine amax,amin,max,min optimization. 2023-12-29 17:30:57 +08:00
max_lsx.S loongarch64: Refine amax,amin,max,min optimization. 2023-12-29 17:30:57 +08:00
min.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
min_lasx.S loongarch64: Refine amax,amin,max,min optimization. 2023-12-29 17:30:57 +08:00
min_lsx.S loongarch64: Refine amax,amin,max,min optimization. 2023-12-29 17:30:57 +08:00
rot_lasx.S loongarch64: Refine rot optimization. 2023-12-29 17:30:57 +08:00
rot_lsx.S loongarch64: Refine rot optimization. 2023-12-29 17:30:57 +08:00
scal.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
scal_lasx.S loongarch64: Add and refine scal optimization functions. 2023-12-29 17:30:57 +08:00
scal_lsx.S loongarch64: Add and refine scal optimization functions. 2023-12-29 17:30:57 +08:00
sgemm_kernel_16x8_lasx.S LoongArch64: Compatible with early internal toolchain 2023-08-31 16:55:29 +08:00
sgemm_ncopy_8_lasx.S LoongArch64: Add sgemm_kernel 2023-08-23 16:08:43 +08:00
sgemm_ncopy_16_lasx.S LoongArch64: Add sgemm_kernel 2023-08-23 16:08:43 +08:00
sgemm_tcopy_8_lasx.S LoongArch64: Add sgemm_kernel 2023-08-23 16:08:43 +08:00
sgemm_tcopy_16_lasx.S LoongArch64: Add sgemm_kernel 2023-08-23 16:08:43 +08:00
sgemv_n_8_lasx.S LoongArch64: Fixed compilation issues when enable DYNAMIC_ARCH 2023-09-27 10:05:27 +08:00
sgemv_t_8_lasx.S LoongArch64: Fixed compilation issues when enable DYNAMIC_ARCH 2023-09-27 10:05:27 +08:00
snrm2.S Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:50:44 +02:00
snrm2_lasx.S loongarch64: Refine copy,swap,nrm2,sum optimization. 2023-12-29 17:30:57 +08:00
snrm2_lsx.S loongarch64: Refine copy,swap,nrm2,sum optimization. 2023-12-29 17:30:57 +08:00
sum_lasx.S loongarch64: Refine copy,swap,nrm2,sum optimization. 2023-12-29 17:30:57 +08:00
sum_lsx.S loongarch64: Refine copy,swap,nrm2,sum optimization. 2023-12-29 17:30:57 +08:00
swap.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
swap_lasx.S loongarch64: Refine copy,swap,nrm2,sum optimization. 2023-12-29 17:30:57 +08:00
swap_lsx.S loongarch64: Refine copy,swap,nrm2,sum optimization. 2023-12-29 17:30:57 +08:00
trsm_kernel_LN.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
trsm_kernel_LT.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
trsm_kernel_RT.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
zamax.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
zamin.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
zasum.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
zcopy.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
zdot.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
zgemm3m_kernel.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
zgemm_kernel.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
zgemm_kernel_2x2.S loongarch64: Add zgemm and cgemm optimization 2023-12-29 18:06:26 +08:00
zgemm_kernel_2x2_lasx.S loongarch64: Add zgemm and cgemm optimization 2023-12-29 18:06:26 +08:00
zgemm_ncopy_2_lasx.S loongarch64: Add zgemm and cgemm optimization 2023-12-29 18:06:26 +08:00
zgemm_tcopy_2_lasx.S loongarch64: Add zgemm and cgemm optimization 2023-12-29 18:06:26 +08:00
zgemv_n.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
zgemv_t.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
znrm2.S Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:50:44 +02:00
znrm2_lasx.S loongarch64: Add c/znrm2 optimization functions. 2023-12-29 17:30:57 +08:00
znrm2_lsx.S loongarch64: Add c/znrm2 optimization functions. 2023-12-29 17:30:57 +08:00
zscal.S Delete the macro instruction "li" and use "li.d" instead 2021-08-12 17:02:54 +08:00
ztrsm_kernel_LT.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00
ztrsm_kernel_RT.S Add support for LOONGARCH64 2021-07-27 15:29:12 +08:00