gxw
519ea6e87a
utest: Add utest for the {sc/dz}amax and {s/d/sc/dz}amin
2024-01-30 11:32:36 +08:00
Sergei Lewis
1093def0d1
Merge branch 'risc-v' into develop
2024-01-29 11:11:39 +00:00
Martin Kroeker
8892121130
Merge pull request #4462 from martin-frbg/issue4449
...
Use +sve in arch declarations of the fallback paths for SVE targets
2024-01-26 22:41:16 +01:00
Martin Kroeker
48a4c4d454
Use +sve in arch declarations of the fallback paths for SVE targets
2024-01-26 16:30:52 +01:00
Mark Ryan
e0b610d01f
Harmonize riscv64 LIBNAME for forced and non-forced targets
...
The forced values for LIBNAME were either riscv64_generic or c910v
while the non-forced value of LIBNAME was always riscv64.
2024-01-26 15:18:18 +00:00
Mark Ryan
ec2aa32eb0
Fix crash in cpuid_riscv64.c
...
The crash is reproducible when building OpenBLAS without forcing a
target in a riscv64 container running on an X86_64 machine with an
older version of QEMU, e.g., 7.0.0, registered with binfmt_misc to
run riscv64 binaries. With this setup, cat /proc/cpuinfo in the
container returns the cpu information for the host, which contains a
"model name" string, and we execute the buggy code. The code in
question is searching in an uninitialised buffer for the ':' character
and doesn't check to see whether it was found or not. This can result
in pmodel containing the pointer value 1 and a crash when pmodel is
dereferenced. The algorithm to detect the C910V CPU has not been
modified, merely fixed to prevent the crash.
A few additional checks for NULL pointers are added to improve the
robustness of the code and a whitespace error is corrected.
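
The failure mode described above (using the result of a ':' search without checking it) can be sketched as follows. This is only an illustration under stated assumptions, not the code in cpuid_riscv64.c: the helper names `value_after_colon` and `value_contains` are hypothetical, and the sample strings in the note below are made up.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from cpuid_riscv64.c.
 * Returns a pointer to the text after the first ':' in a
 * /proc/cpuinfo-style line, or NULL if the line is NULL or
 * contains no ':' separator. */
static const char *value_after_colon(const char *line)
{
    const char *p;

    if (line == NULL)
        return NULL;

    p = strchr(line, ':');
    if (p == NULL)      /* separator missing: never compute p + 1 */
        return NULL;

    p++;                /* step past ':' */
    while (*p == ' ' || *p == '\t')
        p++;            /* skip whitespace before the value */
    return p;
}

/* Hypothetical check: returns 1 if the value part of the line
 * contains the given substring, 0 otherwise (including on any
 * NULL input or missing separator). */
static int value_contains(const char *line, const char *needle)
{
    const char *value = value_after_colon(line);

    return value != NULL && needle != NULL &&
           strstr(value, needle) != NULL;
}
```

With these helpers, a line without a separator yields NULL rather than a bogus pointer one byte past NULL, so a caller such as `value_contains("model name", "C910")` returns 0 instead of dereferencing an invalid address.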
2024-01-26 15:17:31 +00:00
Martin Kroeker
889c5d026a
Merge pull request #4456 from kseniyazaytseva/riscv-rvv10
...
Fix BLAS and LAPACK tests for RVV 1.0 target, update to 0.12.0 intrinsics
2024-01-26 13:31:09 +01:00
Martin Kroeker
4e2a32ff51
Merge pull request #4454 from kseniyazaytseva/riscv-rvv07
...
Fix BLAS and LAPACK tests for C910V and RISCV64_ZVL256B targets
2024-01-26 11:40:46 +01:00
gxw
276e3ebf9e
LoongArch64: Add dzamax and dzamin opt
2024-01-26 10:03:50 +08:00
Martin Kroeker
a21b2fa5e4
Merge pull request #4452 from kseniyazaytseva/riscv-generic
...
Fix BLAS, BLAS-like functions and Generic RISC-V kernels
2024-01-24 17:52:25 +01:00
Andrey Sokolov
73530b03fa
remove RISCV64_ZVL256B additional extensions
2024-01-24 11:38:14 +03:00
kseniyazaytseva
86943afa9c
Fix x280 target include riscv_vector.h
2024-01-24 10:53:13 +03:00
Martin Kroeker
d938aed7fe
reset "mem structure overflowed" state on shutdown
2024-01-23 17:15:53 +01:00
Andrey Sokolov
9c49a81d54
Resolve conflicts
2024-01-23 19:08:53 +03:00
kseniyazaytseva
e1afb23811
Fix BLAS and LAPACK tests for C910V and RISCV64_ZVL256B targets
...
* Fixed bugs in dgemm, [a]min\max, asum kernels
* Added zero checks for BLAS kernels
* Added dsdot implementation for RVV 0.7.1
* Fixed bugs in _vector files for C910V and RISCV64_ZVL256B targets
* Added additional definitions for RISCV64_ZVL256B target
2024-01-23 19:01:31 +03:00
Martin Kroeker
d6a5174e9c
Merge pull request #4447 from RevySR/update-thead-toolchains
...
Update T-Head toolchains v2.8.0
2024-01-22 08:10:02 +01:00
Han Gao/Revy/Rabenda
304a9b60af
Update T-Head toolchains v2.8.0
...
Signed-off-by: Han Gao/Revy/Rabenda <rabenda.cn@gmail.com>
2024-01-21 14:32:52 +00:00
Martin Kroeker
f5de4fad27
Merge pull request #4444 from Mousius/part-mapping
...
Add dynamic support for Arm(R) Neoverse(TM) V2 processor
2024-01-20 15:55:07 +01:00
Chris Sidebottom
aaf65210cc
Add dynamic support for Arm(R) Neoverse(TM) V2 processor
...
Whilst I figure out how best to map the L2 parameters without
duplicating all of `ARMV8SVE`, let's just map this to `NEOVERSEV1`.
2024-01-19 19:05:50 +00:00
Martin Kroeker
10c22f4a39
Merge pull request #4355 from imaginationtech/img-riscv64-zvl128b
...
[RISC-V] Add RISC-V Vector 128-bit target
2024-01-19 13:51:07 +01:00
Octavian Maghiar
ccbc3f875b
[RISC-V] Add RISCV64_ZVL128B target to common_riscv64.h
2024-01-19 12:40:00 +00:00
Octavian Maghiar
deecfb1a39
Merge branch 'risc-v' into img-riscv64-zvl128b
2024-01-19 12:26:38 +00:00
kseniyazaytseva
f89e0034a4
Fix LAPACK usage from BLAS
2024-01-18 23:22:26 +03:00
Martin Kroeker
f7cf637d7a
redo lost edit
2024-01-18 23:22:26 +03:00
Martin Kroeker
85548e66ca
Fix build failures seen with the NO_LAPACK option - cspr/csymv/csyr belong on the LAPACK list
2024-01-18 23:22:26 +03:00
Martin Kroeker
f129161453
restore C/Z SPMV, SPR, SYR,SYMV
2024-01-18 23:22:26 +03:00
kseniyazaytseva
5222b5fc18
Added axpby kernels for GENERIC RISC-V target
2024-01-18 23:22:26 +03:00
Martin Kroeker
1c04df20bd
Re-enable overriding the LAPACK SYMV,SYR,SPMV and SPR implementations
2024-01-18 23:20:15 +03:00
Martin Kroeker
5b4df851d7
fix stray blank on continuation line
2024-01-18 23:20:15 +03:00
kseniyazaytseva
ff41cf5c49
Fix BLAS, BLAS-like functions and Generic RISC-V kernels
...
* Fixed gemmt, imatcopy, zimatcopy_cnc functions
* Fixed cblas_cscal testing in ctest
* Removed rotmg unreachable code
* Added zero size checks
2024-01-18 23:19:52 +03:00
Martin Kroeker
500442cf96
Merge pull request #4442 from pbo-linaro/fix-utest-compilation
...
Fix utest compilation
2024-01-18 20:59:13 +01:00
kseniyazaytseva
b193ea3d7b
Fix BLAS and LAPACK tests for RVV 1.0 target, update to 0.12.0 intrinsics
...
* Update intrinsics API to 0.12.0 version (Stride Segment Loads/Stores)
* Fixed nrm2, axpby, ncopy, zgemv and scal kernels
* Added zero size checks
2024-01-18 22:14:32 +03:00
Pierrick Bouvier
a4992e09bc
Fix utest compilation
...
Introduced recently when adding new test cases for ZSCAL
- include cblas is needed for cblas_zscal
- ASSERT macro does not exist
- missing closing )
2024-01-18 18:21:30 +04:00
Martin Kroeker
6f0e0e4021
Merge pull request #4438 from Dirreke/csky-support
...
Add CSKY support
2024-01-18 13:04:52 +01:00
Martin Kroeker
43cb266178
Merge pull request #4441 from martin-frbg/gemv-threshold
...
Increase multithreading threshold for S/DGEMV by a factor of 50
2024-01-17 22:25:01 +01:00
Martin Kroeker
d2fc4f3b4d
Increase multithreading threshold by a factor of 50
2024-01-17 20:59:24 +01:00
Martin Kroeker
88e994116c
Merge pull request #4354 from imaginationtech/img-rvv-kernel-generator
...
[RISC-V] Improve RVV kernel generator LMUL usage
2024-01-17 15:19:37 +01:00
Martin Kroeker
ec46ca7a43
Support Arm Compiler for Linux as classic flang (#4436)
...
* Support ArmCompilerforLinux as classic flang
2024-01-17 07:33:10 +01:00
Martin Kroeker
e3508d3713
Merge pull request #4439 from sergei-lewis/risc-v
...
Fix builds with t-head toolchains that use old intrinsics spec
2024-01-16 20:35:12 +01:00
Dirreke
ec89466e14
Add CSKY support
2024-01-16 23:45:06 +08:00
Sergei Lewis
9edb805e64
fix builds with t-head toolchains that use old versions of the intrinsics spec
2024-01-16 14:33:08 +00:00
Martin Kroeker
452741b67f
Merge pull request #4435 from imciner2/im/sapphire
...
Fix Clang sapphire rapids march flag
2024-01-16 13:57:29 +01:00
Ian McInerney
8f4e325ea8
Fix Clang sapphire rapids march flag
2024-01-15 23:42:03 +00:00
Martin Kroeker
13c764eaaa
Merge pull request #4434 from martin-frbg/issue4433
...
Only use mtune=native in ARM64 fallback paths when not cross-compiling
2024-01-15 23:36:07 +01:00
Martin Kroeker
025a1b2c7b
Only use mtune=native when not cross-compiling
2024-01-15 22:40:21 +01:00
Martin Kroeker
2527afaaa2
Merge pull request #4429 from martin-frbg/issue4428
...
Handle NAN and INF in ARM and generic/s390x ZSCAL
2024-01-15 11:26:12 +01:00
Martin Kroeker
0d2e486edf
Handle NAN and INF
2024-01-15 11:18:59 +01:00
Martin Kroeker
a782103b9c
Merge pull request #4425 from martin-frbg/issue2392
...
Add BLAS extension openblas_set_num_threads_local()
2024-01-14 21:57:57 +01:00
Martin Kroeker
152a6c43b6
Add blas_omp_threads_local
2024-01-14 19:59:55 +01:00
Martin Kroeker
8a9d492af7
Add default for blas_omp_threads_local
2024-01-14 19:58:49 +01:00