Commit Graph

6882 Commits

Author SHA1 Message Date
Martin Kroeker 889c5d026a
Merge pull request #4456 from kseniyazaytseva/riscv-rvv10
Fix BLAS and LAPACK tests for RVV 1.0 target, update to 0.12.0 intrinsics
2024-01-26 13:31:09 +01:00
Martin Kroeker 4e2a32ff51
Merge pull request #4454 from kseniyazaytseva/riscv-rvv07
Fix BLAS and LAPACK tests for C910V and RISCV64_ZVL256B targets
2024-01-26 11:40:46 +01:00
Martin Kroeker a21b2fa5e4
Merge pull request #4452 from kseniyazaytseva/riscv-generic
Fix BLAS, BLAS-like functions and Generic RISC-V kernels
2024-01-24 17:52:25 +01:00
Andrey Sokolov 73530b03fa remove RISCV64_ZVL256B additional extensions 2024-01-24 11:38:14 +03:00
kseniyazaytseva 86943afa9c Fix x280 target include riscv_vector.h 2024-01-24 10:53:13 +03:00
Andrey Sokolov 9c49a81d54 Resolve conflicts 2024-01-23 19:08:53 +03:00
kseniyazaytseva e1afb23811 Fix BLAS and LAPACK tests for C910V and RISCV64_ZVL256B targets
* Fixed bugs in dgemm, [a]min\max, asum kernels
* Added zero checks for BLAS kernels
* Added dsdot implementation for RVV 0.7.1
* Fixed bugs in _vector files for C910V and RISCV64_ZVL256B targets
* Added additional definitions for RISCV64_ZVL256B target
2024-01-23 19:01:31 +03:00
Martin Kroeker 10c22f4a39
Merge pull request #4355 from imaginationtech/img-riscv64-zvl128b
[RISC-V] Add RISC-V Vector 128-bit target
2024-01-19 13:51:07 +01:00
Octavian Maghiar ccbc3f875b [RISC-V] Add RISCV64_ZVL128B target to common_riscv64.h 2024-01-19 12:40:00 +00:00
Octavian Maghiar deecfb1a39 Merge branch 'risc-v' into img-riscv64-zvl128b 2024-01-19 12:26:38 +00:00
kseniyazaytseva f89e0034a4 Fix LAPACK usage from BLAS 2024-01-18 23:22:26 +03:00
Martin Kroeker f7cf637d7a redo lost edit 2024-01-18 23:22:26 +03:00
Martin Kroeker 85548e66ca Fix build failures seen with the NO_LAPACK option - cspr/csymv/csyr belong on the LAPACK list 2024-01-18 23:22:26 +03:00
Martin Kroeker f129161453 restore C/Z SPMV, SPR, SYR,SYMV 2024-01-18 23:22:26 +03:00
kseniyazaytseva 5222b5fc18 Added axpby kernels for GENERIC RISC-V target 2024-01-18 23:22:26 +03:00
Martin Kroeker 1c04df20bd Re-enable overriding the LAPACK SYMV,SYR,SPMV and SPR implementations 2024-01-18 23:20:15 +03:00
Martin Kroeker 5b4df851d7 fix stray blank on continuation line 2024-01-18 23:20:15 +03:00
kseniyazaytseva ff41cf5c49 Fix BLAS, BLAS-like functions and Generic RISC-V kernels
* Fixed gemmt, imatcopy, zimatcopy_cnc functions
* Fixed cblas_cscal testing in ctest
* Removed rotmg unreachable code
* Added zero size checks
2024-01-18 23:19:52 +03:00
kseniyazaytseva b193ea3d7b Fix BLAS and LAPACK tests for RVV 1.0 target, update to 0.12.0 intrinsics
* Update intrinsics API to 0.12.0 version (Stride Segment Loads/Stores)
* Fixed nrm2, axpby, ncopy, zgemv and scal kernels
* Added zero size checks
2024-01-18 22:14:32 +03:00
Martin Kroeker 88e994116c
Merge pull request #4354 from imaginationtech/img-rvv-kernel-generator
[RISC-V] Improve RVV kernel generator LMUL usage
2024-01-17 15:19:37 +01:00
Martin Kroeker e3508d3713
Merge pull request #4439 from sergei-lewis/risc-v
Fix builds with t-head toolchains that use old intrinsics spec
2024-01-16 20:35:12 +01:00
Sergei Lewis 9edb805e64 fix builds with t-head toolchains that use old versions of the intrinsics spec 2024-01-16 14:33:08 +00:00
Martin Kroeker 1332f8a822
Merge pull request #4159 from OMaghiarIMG/risc-v-tail-policy
Set tail policy to undisturbed for RVV intrinsics accumulators
2023-12-08 10:25:41 +01:00
Martin Kroeker 2d316c2920
Merge pull request #4125 from OMaghiarIMG/risc-v
Fixes RVV masked intrinsics for iamax/iamin/imax/imin kernels
2023-12-07 14:50:58 +01:00
Octavian Maghiar 4a12cf53ec [RISC-V] Improve RVV kernel generator LMUL usage
The RVV kernel generation script uses the provided LMUL to increase the number of accumulator registers.
Since the effect of the LMUL is to group together the vector registers into larger ones, it actually should be used as a multiplier in the calculation of vlenmax.
At the moment, no matter what LMUL is provided, the generated kernels would only set the maximum number of vector elements equal to VLEN/SEW.
Commit changes the use of LMUL to properly adjust vlenmax. Note that an increase in LMUL results in a decrease in the number of effective vector registers.
2023-12-04 11:13:35 +00:00
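The LMUL relationship described in commit 4a12cf53ec can be illustrated with a small calculation. This is only a sketch, not the kernel generator's code: the function names `vlenmax` and `register_groups` are hypothetical, only integer LMUL values are considered, and the count of 32 architectural vector registers is taken from the RVV specification rather than from the commit message.

```python
def vlenmax(vlen_bits, sew_bits, lmul=1):
    """Maximum elements per register group: (VLEN / SEW) * LMUL."""
    return (vlen_bits // sew_bits) * lmul


def register_groups(lmul, total_registers=32):
    """Register groups left when LMUL registers are fused into one group."""
    return total_registers // lmul


if __name__ == "__main__":
    # Assumed example: VLEN=128 bits, SEW=64 bits (double precision).
    for lmul in (1, 2, 4, 8):
        print(f"LMUL={lmul}: vlenmax={vlenmax(128, 64, lmul)}, "
              f"groups={register_groups(lmul)}")
```

Raising LMUL widens each group but leaves fewer groups to use as accumulators, which is the trade-off the commit message notes.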
Octavian Maghiar e4586e81b8 [RISC-V] Add RISC-V Vector 128-bit target
Current RVV x280 target depends on vlen=512-bits for Level 3 operations.
Commit adds generic target that supports vlen=128-bits.
New target uses the same scalable kernels as x280 for Level 1&2 operations, and autogenerated kernels for Level 3 operations.
Functional correctness of Level 3 operations tested on vlen=128-bits using QEMU v8.1.1 for ctests and BLAS-Tester.
2023-12-04 11:02:18 +00:00
Octavian Maghiar 826a9d5fa4 Adds tail undisturbed for RVV Level 2 operations
During the last iteration of some RVV operations, accumulators can get overwritten when VL < VLMAX and tail policy is agnostic.
Commit changes intrinsics tail policy to undisturbed.
2023-07-25 11:36:23 +01:00
Octavian Maghiar 8df0289db6 Adds tail undisturbed for RVV Level 1 operations
During the last iteration of some RVV operations, accumulators can get overwritten when VL < VLMAX and tail policy is agnostic.
Commit changes intrinsics tail policy to undisturbed.
2023-07-20 15:28:35 +01:00
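The accumulator problem described in the two tail-policy commits above can be reproduced with a scalar simulation. This is a sketch under stated assumptions, not OpenBLAS code: `VLMAX`, `vector_fma_acc` and `dot` are hypothetical names, and the tail-agnostic case is modelled at its worst, with tail lanes overwritten by NaN (the RVV specification lets an agnostic tail either keep its old value or be filled with all-ones bits, which is a NaN pattern for floats).

```python
import math

VLMAX = 4  # assumed number of lanes per vector register


def vector_fma_acc(acc, x, y, vl, tail_undisturbed):
    """Lane-wise acc += x * y on the first vl lanes, modelling tail policy."""
    out = list(acc)
    for i in range(vl):
        out[i] = acc[i] + x[i] * y[i]
    if not tail_undisturbed:
        # Tail-agnostic: lanes vl..VLMAX-1 may be clobbered (worst case here).
        for i in range(vl, VLMAX):
            out[i] = math.nan
    return out


def dot(x, y, tail_undisturbed):
    """Strip-mined dot product with a vector accumulator."""
    acc = [0.0] * VLMAX
    for start in range(0, len(x), VLMAX):
        vl = min(VLMAX, len(x) - start)  # last iteration may have vl < VLMAX
        acc = vector_fma_acc(acc, x[start:start + vl], y[start:start + vl],
                             vl, tail_undisturbed)
    return sum(acc)  # final reduction reads all VLMAX lanes


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    y = [1.0] * 6
    print(dot(x, y, tail_undisturbed=True))
    print(dot(x, y, tail_undisturbed=False))
```

With six elements and four lanes, the second iteration runs with vl = 2. The partial sums held in lanes 2 and 3 survive only under the undisturbed policy, so only that variant returns the correct sum.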
Octavian Maghiar 1e4a3a2b5e Fixes RVV masked intrinsics for izamax/izamin kernels 2023-07-12 12:55:50 +01:00
Octavian Maghiar e1958eb705 Fixes RVV masked intrinsics for iamax/iamin/imax/imin kernels
Changes masked intrinsics from _m to _mu and reintroduces maskedoff argument.
2023-07-05 11:34:00 +01:00
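The difference between the `_m` and `_mu` intrinsic variants mentioned in commit e1958eb705 can be sketched with a scalar model of an index-tracking loop. This is an illustration only, not the iamax kernel: `masked_select` and `lane_argmax_abs` are hypothetical names, the input length is assumed to be a multiple of `VLMAX`, and the mask-agnostic case is modelled by writing all-ones into inactive lanes, which the RVV specification permits.

```python
VLMAX = 4              # assumed number of lanes per vector register
ALL_ONES = 0xFFFFFFFF  # value a mask-agnostic inactive lane may receive


def masked_select(mask, new, maskedoff, mask_undisturbed):
    """Active lanes take `new`; inactive lanes keep `maskedoff` (_mu)
    or are clobbered with all-ones (worst case for plain _m)."""
    return [n if m else (old if mask_undisturbed else ALL_ONES)
            for m, n, old in zip(mask, new, maskedoff)]


def lane_argmax_abs(values, mask_undisturbed):
    """Index of the first element with the largest absolute value."""
    best = [-1.0] * VLMAX  # per-lane running maximum of |x|
    best_idx = [0] * VLMAX  # per-lane index of that maximum
    for start in range(0, len(values), VLMAX):
        chunk = [abs(v) for v in values[start:start + VLMAX]]
        idx = list(range(start, start + VLMAX))
        mask = [c > b for c, b in zip(chunk, best)]
        # Only lanes that found a new maximum should update their index.
        best_idx = masked_select(mask, idx, best_idx, mask_undisturbed)
        best = [max(c, b) for c, b in zip(chunk, best)]
    top = max(best)
    return min(i for i, b in zip(best_idx, best) if b == top)


if __name__ == "__main__":
    data = [3.0, -9.0, 2.0, 1.0, 4.0, 5.0, 7.0, 0.0]
    print(lane_argmax_abs(data, mask_undisturbed=True))
    print(lane_argmax_abs(data, mask_undisturbed=False))
```

The lane holding the overall maximum is inactive in the second iteration, so its stored index is preserved only when the inactive lanes are taken from the `maskedoff` operand.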
Martin Kroeker 62f0f506ec
Merge pull request #4049 from sh-zheng/risc-v
Add rvv support for zsymv and activate rvv support for zhemv
2023-06-09 19:08:00 +02:00
ZhengSh 2a8bc38cdc
Merge branch 'xianyi:risc-v' into risc-v 2023-06-09 20:01:03 +08:00
Martin Kroeker 5147831f25
Merge pull request #4074 from HellerZheng/risc-v
fix wrong vr = VFMVVF_FLOAT(0, vl); in symv_L_rvv.c and symv_U_rvv.c
2023-06-06 14:55:32 +02:00
Heller Zheng 0954746380 remove argument unused during compilation.
fix wrong vr = VFMVVF_FLOAT(0, vl);
2023-06-04 20:06:58 -07:00
sh-zheng d3bf5a5401 Combine two reduction operations of zhe/symv into one, with tail undisturbed set. 2023-05-22 22:39:45 +08:00
sh-zheng 18d7afe69d Add rvv support for zsymv and activate rvv support for zhemv 2023-05-20 01:19:44 +08:00
Zhang Xianyi 30222d0832
Merge pull request #3971 from HellerZheng/risc-v
RISC-V for new intrinsic API changes
2023-04-01 12:43:43 +08:00
Heller Zheng 6b74bee2f9 Update TARGET=x280 description. 2023-03-27 18:59:24 -07:00
Heller Zheng 1374a2d08b This PR adapts latest spec changes
Add prefix (_riscv) for all riscv intrinsics
Update some intrinsics' parameter, like vfredxxxx, vmerge
2023-03-19 23:59:03 -07:00
Zhang Xianyi 19f17c8bc6
Merge pull request #3893 from HellerZheng/develop
add riscv level3 C,Z kernel functions.
2023-03-15 10:17:13 +08:00
Zhang Xianyi 20511dfa65
Merge pull request #3919 from sergei-lewis/risc-v-latest-rvv-intrinsics
update riscv intrinsics for latest spec
2023-03-15 10:16:19 +08:00
Sergei Lewis 9b61be4545 factoring riscv64/dot.c fix into separate PR as requested 2023-03-01 17:40:42 +00:00
Sergei Lewis 2406958629 * update intrinsics to match latest spec at https://github.com/riscv-non-isa/rvv-intrinsic-doc (in particular, __riscv_ prefixes for rvv intrinsics)
* fix multiple numerical stability and corner case issues
* add a script to generate arbitrary gemm kernel shapes
* add a generic zvl256b target to demonstrate large gemm kernel unrolls
2023-02-24 10:45:03 +00:00
Heller Zheng 63cf4d0166 add riscv level3 C,Z kernel functions. 2023-02-01 19:13:44 -08:00
Xianyi Zhang c19dff0a31 Fix T-Head RVV intrinsic API changes. 2023-01-25 19:33:32 +08:00
Xianyi Zhang d9993e21a2 Refs #3825 Merge branch 'HellerZheng-develop' into risc-v 2022-12-03 12:01:29 +08:00
Xianyi Zhang e5313f53d5 Merge branch 'develop' of https://github.com/HellerZheng/OpenBLAS_riscv_x280 into HellerZheng-develop 2022-12-03 12:00:52 +08:00
Xianyi Zhang e284c048df Merge branch 'develop' into risc-v 2022-12-03 11:56:55 +08:00
Martin Kroeker 0a24f631e9
Merge pull request #3844 from Mousius/switch-ratio-16
Set SWITCH_RATIO for Arm(R) Neoverse(TM) V1 CPUs
2022-12-02 12:48:43 +01:00
Martin Kroeker 65984fbe68
Merge pull request #3847 from bartoldeman/scal-benchmark
scal benchmark: eliminate y, move init/timing out of loop
2022-12-02 11:51:50 +01:00