Commit Graph

71 Commits

Author SHA1 Message Date
Martin Kroeker
c1019d5832 Handle INF and NAN in inputs 2024-06-27 10:58:59 +02:00
Martin Kroeker
516743f7dc fix other instances of mishandling INF 2024-05-31 16:02:12 +02:00
Martin Kroeker
cf80bd8500 Update nrm2_rvv.c 2024-03-13 13:07:26 +01:00
Martin Kroeker
9baa757905 Update nrm2_vector.c 2024-03-13 11:40:14 +01:00
Martin Kroeker
18a6db6862 Update nrm2_vector.c 2024-03-13 11:10:26 +01:00
Martin Kroeker
3752e73919 handle incx < 0 2024-03-12 20:44:01 +01:00
Martin Kroeker
db70c7f7fb handle incx < 0 2024-03-12 20:42:11 +01:00
Martin Kroeker
dee8557d58 handle incx < 0 2024-03-12 20:40:29 +01:00
Martin Kroeker
d9dff17aec handle incx < 0 2024-03-12 20:38:23 +01:00
Martin Kroeker
6b89e1f1d7 fix loop condition for incx < 0 2024-03-12 15:49:41 +01:00
Martin Kroeker
20016a0096 fix loop condition for incx < 0 2024-03-12 15:48:55 +01:00
Sergei Lewis
ba17758c02 fix axpy implementations where y has a stride of 0 2024-02-16 16:00:38 +00:00
Sergei Lewis
ff1523163f Fix axpy test hangs when n==0. Reenable zaxpy_vector kernel for C910V. 2024-02-09 12:59:14 +00:00
Martin Kroeker
6d8a273cca Handle zero increment(s) in C910V ?AXPBY (#4483)
* Handle zero increment(s)
2024-02-04 22:07:51 +01:00
Martin Kroeker
4d8dee508c temporarily disable the CAXPY/ZAXPY kernels 2024-02-04 01:05:03 +01:00
Sergei Lewis
a3b0ef6596 Restore riscv64 fixes from develop branch: dot product double precision accumulation, zscal NaN handling 2024-02-01 10:32:00 +00:00
Sergei Lewis
1093def0d1 Merge branch 'risc-v' into develop 2024-01-29 11:11:39 +00:00
Martin Kroeker
889c5d026a Merge pull request #4456 from kseniyazaytseva/riscv-rvv10
Fix BLAS and LAPACK tests for RVV 1.0 target, update to 0.12.0 intrinsics
2024-01-26 13:31:09 +01:00
Martin Kroeker
4e2a32ff51 Merge pull request #4454 from kseniyazaytseva/riscv-rvv07
Fix BLAS and LAPACK tests for C910V and RISCV64_ZVL256B targets
2024-01-26 11:40:46 +01:00
Martin Kroeker
a21b2fa5e4 Merge pull request #4452 from kseniyazaytseva/riscv-generic
Fix BLAS, BLAS-like functions and Generic RISC-V kernels
2024-01-24 17:52:25 +01:00
Andrey Sokolov
9c49a81d54 Resolve conflicts 2024-01-23 19:08:53 +03:00
kseniyazaytseva
e1afb23811 Fix BLAS and LAPACK tests for C910V and RISCV64_ZVL256B targets
* Fixed bugs in dgemm, [a]min\max, asum kernels
* Added zero checks for BLAS kernels
* Added dsdot implementation for RVV 0.7.1
* Fixed bugs in _vector files for C910V and RISCV64_ZVL256B targets
* Added additional definitions for RISCV64_ZVL256B target
2024-01-23 19:01:31 +03:00
Octavian Maghiar
deecfb1a39 Merge branch 'risc-v' into img-riscv64-zvl128b 2024-01-19 12:26:38 +00:00
kseniyazaytseva
5222b5fc18 Added axpby kernels for GENERIC RISC-V target 2024-01-18 23:22:26 +03:00
kseniyazaytseva
ff41cf5c49 Fix BLAS, BLAS-like functions and Generic RISC-V kernels
* Fixed gemmt, imatcopy, zimatcopy_cnc functions
* Fixed cblas_cscal testing in ctest
* Removed rotmg unreachable code
* Added zero size checks
2024-01-18 23:19:52 +03:00
kseniyazaytseva
b193ea3d7b Fix BLAS and LAPACK tests for RVV 1.0 target, update to 0.12.0 intrinsics
* Update intrinsics API to 0.12.0 version (Stride Segment Loads/Stores)
* Fixed nrm2, axpby, ncopy, zgemv and scal kernels
* Added zero size checks
2024-01-18 22:14:32 +03:00
Martin Kroeker
88e994116c Merge pull request #4354 from imaginationtech/img-rvv-kernel-generator
[RISC-V] Improve RVV kernel generator LMUL usage
2024-01-17 15:19:37 +01:00
Sergei Lewis
9edb805e64 fix builds with t-head toolchains that use old versions of the intrinsics spec 2024-01-16 14:33:08 +00:00
Martin Kroeker
f637e12713 Handle INF and NAN 2024-01-08 09:52:38 +01:00
Martin Kroeker
f0808d856b Handle NAN in input 2024-01-07 20:27:29 +01:00
Octavian Maghiar
4a12cf53ec [RISC-V] Improve RVV kernel generator LMUL usage
The RVV kernel generation script uses the provided LMUL to increase the number of accumulator registers.
Since the effect of the LMUL is to group together the vector registers into larger ones, it actually should be used as a multiplier in the calculation of vlenmax.
At the moment, no matter what LMUL is provided, the generated kernels would only set the maximum number of vector elements equal to VLEN/SEW.
Commit changes the use of LMUL to properly adjust vlenmax. Note that an increase in LMUL results in a decrease in the number of effective vector registers.
2023-12-04 11:13:35 +00:00
Octavian Maghiar
e4586e81b8 [RISC-V] Add RISC-V Vector 128-bit target
Current RVV x280 target depends on vlen=512-bits for Level 3 operations.
Commit adds generic target that supports vlen=128-bits.
New target uses the same scalable kernels as x280 for Level 1&2 operations, and autogenerated kernels for Level 3 operations.
Functional correctness of Level 3 operations tested on vlen=128-bits using QEMU v8.1.1 for ctests and BLAS-Tester.
2023-12-04 11:02:18 +00:00
Martin Kroeker
a34a0a7abc Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:56:52 +02:00
Octavian Maghiar
826a9d5fa4 Adds tail undisturbed for RVV Level 2 operations
During the last iteration of some RVV operations, accumulators can get overwritten when VL < VLMAX and tail policy is agnostic.
Commit changes intrinsics tail policy to undisturbed.
2023-07-25 11:36:23 +01:00
Octavian Maghiar
8df0289db6 Adds tail undisturbed for RVV Level 1 operations
During the last iteration of some RVV operations, accumulators can get overwritten when VL < VLMAX and tail policy is agnostic.
Commit changes intrinsics tail policy to undisturbed.
2023-07-20 15:28:35 +01:00
Martin Kroeker
76ef1672f8 Override DSDOT with generic code to get rid of qemu precision error 2023-07-19 22:31:07 +02:00
Octavian Maghiar
1e4a3a2b5e Fixes RVV masked intrinsics for izamax/izamin kernels 2023-07-12 12:55:50 +01:00
Octavian Maghiar
e1958eb705 Fixes RVV masked intrinsics for iamax/iamin/imax/imin kernels
Changes masked intrinsics from _m to _mu and reintroduces maskedoff argument.
2023-07-05 11:34:00 +01:00
Xianyi Zhang
e14a025bb1 Temporarily work around zaxpy vector kernel bug. 2023-06-28 11:17:38 +00:00
Martin Kroeker
772b0cc715 Fix early bailout 2023-06-27 16:12:27 +02:00
Martin Kroeker
d6be5036d7 Fix IDAMAX 2023-06-26 21:19:33 +02:00
Martin Kroeker
1fe96f8da7 Fix failures to handle increments of zero 2023-06-25 22:36:57 +02:00
Martin Kroeker
73b30b1dec Fix VLEV_FLOAT/VSEV_FLOAT macros to compile with t-head 2.6.1 2023-06-18 17:46:29 +02:00
ZhengSh
2a8bc38cdc Merge branch 'xianyi:risc-v' into risc-v 2023-06-09 20:01:03 +08:00
Heller Zheng
0954746380 remove argument unused during compilation.
fix wrong vr = VFMVVF_FLOAT(0, vl);
2023-06-04 20:06:58 -07:00
sh-zheng
d3bf5a5401 Combine two reduction operations of zhe/symv into one, with tail undisturbed set. 2023-05-22 22:39:45 +08:00
sh-zheng
18d7afe69d Add rvv support for zsymv and activate rvv support for zhemv 2023-05-20 01:19:44 +08:00
Heller Zheng
1374a2d08b This PR adapts latest spec changes
Add prefix (_riscv) for all riscv intrinsics
Update some intrinsics' parameter, like vfredxxxx, vmerge
2023-03-19 23:59:03 -07:00
Zhang Xianyi
19f17c8bc6 Merge pull request #3893 from HellerZheng/develop
add riscv level3 C,Z kernel functions.
2023-03-15 10:17:13 +08:00
Sergei Lewis
cb0a70e0e2 dot.c early bail fix 2023-03-02 09:51:10 +00:00
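
Note on commit 4a12cf53ec: the message describes LMUL as a multiplier in the vlenmax calculation, with a matching drop in the number of usable register groups. The arithmetic can be sketched as below. This is only an illustration and is not taken from the kernel generator itself; the function names `vlmax` and `register_groups` are hypothetical, and the sketch assumes integer LMUL values and the 32 architectural vector registers of the RVV base ISA.

```python
# Illustrative sketch only: names are hypothetical, not from the generator script.

def vlmax(vlen_bits: int, sew_bits: int, lmul: int = 1) -> int:
    """Maximum element count of one register group: (VLEN / SEW) * LMUL."""
    if lmul not in (1, 2, 4, 8):
        raise ValueError("this sketch only covers integer LMUL of 1, 2, 4 or 8")
    if sew_bits <= 0 or vlen_bits % sew_bits != 0:
        raise ValueError("VLEN must be a positive multiple of SEW")
    return (vlen_bits // sew_bits) * lmul


def register_groups(lmul: int, total_registers: int = 32) -> int:
    """Number of register groups left when registers are grouped LMUL at a time."""
    return total_registers // lmul


if __name__ == "__main__":
    # Compare the element count and group count for double precision at VLEN=128.
    for lmul in (1, 2, 4, 8):
        print(lmul, vlmax(128, 64, lmul), register_groups(lmul))
```

With LMUL ignored, as the commit message says was previously the case, every row would report VLEN/SEW elements regardless of the LMUL passed in; with the multiplier applied, the element count grows with LMUL while the number of register groups available for accumulators shrinks.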