64 Commits

Author SHA1 Message Date
gxw
f6d6c14a96 mips: Fixed numpy CI failure 2024-07-17 10:31:49 +08:00
Martin Kroeker
a11f086c17 Update sscal_msa.c 2024-06-23 12:55:19 +02:00
Martin Kroeker
541e1b6959 disable the fast path for inc=1, alpha=0 as it does not handle x=NaN or Inf 2024-06-23 10:37:55 +02:00
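Note on the entry above: under IEEE 754 arithmetic, 0 * NaN and 0 * Inf both evaluate to NaN, so a shortcut that stores zeros whenever alpha is 0 gives a different result from the plain multiply. The sketch below only illustrates that difference. `ref_sscal` and `fastpath_sscal` are hypothetical names, not the code in sscal_msa.c, and which result the callers expect is an assumption based on the commit message.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical reference SCAL: multiply n elements of x, spaced inc apart,
 * by alpha. Every element goes through the multiply, so NaN and Inf inputs
 * follow IEEE 754 rules (0 * NaN = NaN, 0 * Inf = NaN). */
static void ref_sscal(size_t n, float alpha, float *x, size_t inc)
{
    for (size_t i = 0; i < n; i++)
        x[i * inc] *= alpha;
}

/* Hypothetical "fast path": when alpha is 0, store zeros without reading x.
 * This loses any NaN or Inf that was present in the input. */
static void fastpath_sscal(size_t n, float alpha, float *x, size_t inc)
{
    if (alpha == 0.0f) {
        for (size_t i = 0; i < n; i++)
            x[i * inc] = 0.0f;
        return;
    }
    ref_sscal(n, alpha, x, inc);
}
```

Calling both functions with alpha = 0 on a vector containing NAN or INFINITY shows the mismatch: the reference loop leaves NaN in those positions, while the shortcut leaves 0. This assumes the code is compiled without options such as -ffast-math that relax IEEE semantics.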
Martin Kroeker
c08113c279 fix special cases of x= NAN or INF 2024-06-23 01:12:33 +02:00
Martin Kroeker
5ed4f24d6e Handle corner cases with INF and NAN arguments 2024-06-07 09:39:08 +02:00
Martin Kroeker
8c05765a5a fix other corner cases where x=INF 2024-05-31 18:06:36 +02:00
gxw
9c39e969f5 mips64: Fixed MSA optimization bugs for zgemv and cgemv 2024-04-15 15:17:29 +08:00
Martin Kroeker
09e84bd29a fix loop condition for incx < 0 2024-03-12 15:48:00 +01:00
Martin Kroeker
f747aedb52 fix loop condition for incx < 0 2024-03-12 15:47:17 +01:00
Martin Kroeker
e5d2725e5a Merge pull request #4185 from XiWeiGu/mips_enable_msa
MIPS: Enable MSA
2024-02-05 15:50:16 +01:00
Martin Kroeker
7df363e1e2 temporarily disable the MSA C/ZSCAL kernels 2024-01-12 00:08:52 +01:00
Martin Kroeker
25b0c48082 Update zscal.c 2024-01-08 09:49:18 +01:00
Martin Kroeker
5e7f714e93 Update zscal.c 2024-01-08 08:17:40 +01:00
Martin Kroeker
acf17a825d Handle NAN in input 2024-01-07 20:26:16 +01:00
Martin Kroeker
f692178792 Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:52:09 +02:00
gxw
4d0f000db6 MIPS: Enable MSA 2023-08-07 21:00:10 +08:00
gxw
edea1bcfaf MIPS64: Fixed failed utest dsdot:dsdot_n_1 when TARGET=I6500 2022-09-17 16:43:22 +08:00
Martin Kroeker
b7df500106 Add generic mips32 target 2021-11-20 17:31:51 +01:00
gxw
4b548857d6 Add msa support for loongson
1. Using core loongson3r3 and loongson3r4 for loongson
2. Add DYNAMIC_ARCH for loongson

Change-Id: I1c6b54dbeca3a0cc31d1222af36a7e9bd6ab54c1
2020-12-09 10:28:46 +08:00
Martin Kroeker
7f11e33e8d Merge pull request #3025 from TiredNotTear/develop
MIPS: Fix two bugs
2020-12-08 09:39:27 +01:00
Hao Chen
ad38bd0e89 Fix failed cgemv and zgemv test case after using msa optimization
The cgemv and zgemv test case will call cgemv_n/t_msa.c zgemv_n/t_msa.c files in MIPS environment.
When the macro CONJ is defined, the calculation result will be wrong due to the wrong definition of OP2.
This patch updates the value of OP2 and passes the corresponding test.
2020-12-07 10:25:01 +08:00
Hao Chen
47b639cc9b Fix failed sswap and dswap case by using msa optimization
The swap test case will call sswap_msa.c and dswap_msa.c files in MIPS environment.
When inc_x or inc_y is equal to zero, the calculation result of the two functions will be wrong.
This patch adds the processing of inc_x or inc_y equal to zero, and the swap test case has passed.
2020-12-07 10:24:49 +08:00
Jin Bo
65de6f5957 Fix test errors reported by cblas_cgemm & cblas_ctrmm
The file cgemm_kernel_8x4_msa.c holds the MSA optimization
codes of cblas_cgemm and cblas_ctrmm. It defines two
macros: CGEMM_SCALE_1X2 and CGEMM_TRMM_SCALE_1X2. The pc1
array index in the two macros should be 0 and 1.
2020-12-05 15:08:17 +08:00
Martin Kroeker
e55ec82bb9 Delete KERNEL.1004K 2020-04-19 15:44:30 +02:00
Martin Kroeker
7353ea5afc Delete KERNEL.24K 2020-04-19 15:44:19 +02:00
Martin Kroeker
6a04efb122 Rename KERNEL files to include MIPS prefix 2020-04-19 15:43:54 +02:00
Martin Kroeker
d712ea724c Add MIPS24K support 2020-04-18 21:10:18 +02:00
Martin Kroeker
cdbe0f0235 Add MIPS implementation of ?sum
as trivial copy of ?asum with the fabs calls removed
2019-03-30 22:20:14 +01:00
Martin Kroeker
86a824c97f Fix wrong comparison that made IMIN identical to IMAX
as reported by aarnez in #1990
2019-01-31 15:27:21 +01:00
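Note on the entry above: an index-of-minimum routine and an index-of-maximum routine can differ by nothing more than the direction of one comparison, which is how a single wrong operator can make the two return the same index. The following is a minimal sketch of that idea. `ref_imax` and `ref_imin` are hypothetical names, the 1-based return value follows the usual BLAS convention, and none of this is taken from the MIPS kernel sources.

```c
#include <stddef.h>

/* Hypothetical sketch: 1-based index of the largest element, 0 if n == 0. */
static size_t ref_imax(size_t n, const double *x)
{
    if (n == 0) return 0;
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (x[i] > x[best])      /* keep the first occurrence of the maximum */
            best = i;
    return best + 1;
}

/* Hypothetical sketch: 1-based index of the smallest element, 0 if n == 0.
 * Only the comparison direction differs from ref_imax; writing '>' here
 * by mistake would make it behave exactly like ref_imax. */
static size_t ref_imin(size_t n, const double *x)
{
    if (n == 0) return 0;
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (x[i] < x[best])
            best = i;
    return best + 1;
}
```

A vector whose minimum and maximum sit at different positions is enough to tell the two apart in a test.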
Martin Kroeker
8dd3515fa2 Merge pull request #1565 from martin-frbg/mipstypo
Remove extraneous brace from previous commit of mips dsdot fix
2018-05-17 20:22:58 +02:00
Martin Kroeker
95f7f0229c Remove extraneous brace from previous commit 2018-05-17 18:43:59 +02:00
Martin Kroeker
893b535540 Use correct data type for initializers of v2f64, v4f32
Fixes #1561
2018-05-15 14:42:12 +02:00
Martin Kroeker
9d5098dbc9 Add MIPS 1004K target (Mediatek MT7621 SOC) 2018-05-02 20:20:44 +02:00
Martin Kroeker
954f1832de Merge pull request #1540 from martin-frbg/mips32-zasum
Fix typo in MIPS P5600 complex ASUM code selection
2018-04-25 23:23:00 +02:00
Martin Kroeker
941ad280a8 Fix typo in MIPS P5600 complex ASUM code selection 2018-04-25 22:50:10 +02:00
Martin Kroeker
0fe434598b Fix precision of mips dsdot 2018-04-10 23:30:59 +02:00
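Note on the entry above: DSDOT takes single-precision vectors but accumulates the inner product in double precision. The commit message does not say what was changed, so the sketch below only contrasts the two accumulation strategies. `dot_float_acc` and `dot_double_acc` are hypothetical names and are not the MIPS kernel code.

```c
#include <stddef.h>

/* Hypothetical sketch: products and the running sum are kept in float,
 * so small terms can be absorbed by a large intermediate sum. */
static double dot_float_acc(size_t n, const float *x, const float *y)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i] * y[i];
    return (double)acc;
}

/* Hypothetical sketch of DSDOT-style accumulation: each float operand is
 * widened to double before the multiply, and the sum is kept in double. */
static double dot_double_acc(size_t n, const float *x, const float *y)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += (double)x[i] * (double)y[i];
    return acc;
}
```

With x = {1e8f, 1.0f, -1e8f} and y = {1.0f, 1.0f, 1.0f}, the double accumulator returns exactly 1.0, because 1e8 + 1 is representable in double. A float accumulator typically loses the middle term, since adjacent float values near 1e8 are 8 apart.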
Andrew
13e137fbc9 Initialize uninitialized variables (cppcheck) 2018-01-12 22:33:41 +01:00
Shivraj Patil
a4d97d980f Added rot functions.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2017-01-17 12:15:07 +05:30
kaustubh
1480f3df71 Add msa optimization for AXPY, COPY, SCALE, SWAP
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
2017-01-09 18:27:23 +05:30
kaustubh
88afb3bc94 Add msa optimization for AXPY, COPY, SCALE, SWAP
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
2017-01-09 18:22:09 +05:30
Shivraj Patil
a9bf8a781a Added prefetch to CGEMV and ZGEMV.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-12-27 11:33:51 +05:30
kaustubh
5f93aa5f87 Updated data prefetch in TRSM, ASUM, DOT functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
2016-12-14 14:05:11 +05:30
kaustubh
9db451acd0 Updated data prefetch in TRSM, ASUM, DOT functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
2016-12-13 14:02:14 +05:30
kaustubh
3eaff85191 Updated data prefetch in TRSM, ASUM, DOT functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
2016-12-13 11:41:17 +05:30
kaustubh
00abce3b93 Add data prefetch in DOT and ASUM functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
2016-11-22 11:21:03 +05:30
kaustubh
f3419e634c SGEMM, DGEMM, CGEMM, ZGEMM functions data prefetch
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
2016-10-17 18:29:38 +05:30
kaustubh
90e2321ac3 STRSM, DTRSM functions data prefetch
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
2016-10-14 16:41:28 +05:30
Martin Kroeker
91610f3835 Update zdot_msa.c 2016-10-05 18:59:09 +02:00
Martin Kroeker
6e22ecf102 Update zdot.c 2016-10-05 18:58:03 +02:00
Martin Kroeker
3178e4fea0 Remove explicit include of complex.h 2016-09-29 23:41:43 +02:00