Martin Kroeker
|
d6a5174e9c
|
Merge pull request #4447 from RevySR/update-thead-toolchains
Update T-Head toolchains v2.8.0
|
2024-01-22 08:10:02 +01:00 |
Han Gao/Revy/Rabenda
|
304a9b60af
|
Update T-Head toolchains v2.8.0
Signed-off-by: Han Gao/Revy/Rabenda <rabenda.cn@gmail.com>
|
2024-01-21 14:32:52 +00:00 |
Martin Kroeker
|
f5de4fad27
|
Merge pull request #4444 from Mousius/part-mapping
Add dynamic support for Arm(R) Neoverse(TM) V2 processor
|
2024-01-20 15:55:07 +01:00 |
Chris Sidebottom
|
aaf65210cc
|
Add dynamic support for Arm(R) Neoverse(TM) V2 processor
Whilst I figure out how best to map the L2 parameters without
duplicating all of `ARMV8SVE`, let's just map this to `NEOVERSEV1`.
|
2024-01-19 19:05:50 +00:00 |
Martin Kroeker
|
10c22f4a39
|
Merge pull request #4355 from imaginationtech/img-riscv64-zvl128b
[RISC-V] Add RISC-V Vector 128-bit target
|
2024-01-19 13:51:07 +01:00 |
Octavian Maghiar
|
ccbc3f875b
|
[RISC-V] Add RISCV64_ZVL128B target to common_riscv64.h
|
2024-01-19 12:40:00 +00:00 |
Octavian Maghiar
|
deecfb1a39
|
Merge branch 'risc-v' into img-riscv64-zvl128b
|
2024-01-19 12:26:38 +00:00 |
kseniyazaytseva
|
f89e0034a4
|
Fix LAPACK usage from BLAS
|
2024-01-18 23:22:26 +03:00 |
Martin Kroeker
|
f7cf637d7a
|
redo lost edit
|
2024-01-18 23:22:26 +03:00 |
Martin Kroeker
|
85548e66ca
|
Fix build failures seen with the NO_LAPACK option - cspr/csymv/csyr belong on the LAPACK list
|
2024-01-18 23:22:26 +03:00 |
Martin Kroeker
|
f129161453
|
restore C/Z SPMV, SPR, SYR,SYMV
|
2024-01-18 23:22:26 +03:00 |
kseniyazaytseva
|
5222b5fc18
|
Added axpby kernels for GENERIC RISC-V target
|
2024-01-18 23:22:26 +03:00 |
Martin Kroeker
|
1c04df20bd
|
Re-enable overriding the LAPACK SYMV,SYR,SPMV and SPR implementations
|
2024-01-18 23:20:15 +03:00 |
Martin Kroeker
|
5b4df851d7
|
fix stray blank on continuation line
|
2024-01-18 23:20:15 +03:00 |
kseniyazaytseva
|
ff41cf5c49
|
Fix BLAS, BLAS-like functions and Generic RISC-V kernels
* Fixed gemmt, imatcopy, zimatcopy_cnc functions
* Fixed cblas_cscal testing in ctest
* Removed unreachable rotmg code
* Added zero size checks
|
2024-01-18 23:19:52 +03:00 |
Martin Kroeker
|
500442cf96
|
Merge pull request #4442 from pbo-linaro/fix-utest-compilation
Fix utest compilation
|
2024-01-18 20:59:13 +01:00 |
kseniyazaytseva
|
b193ea3d7b
|
Fix BLAS and LAPACK tests for RVV 1.0 target, update to 0.12.0 intrinsics
* Update intrinsics API to 0.12.0 version (Stride Segment Loads/Stores)
* Fixed nrm2, axpby, ncopy, zgemv and scal kernels
* Added zero size checks
|
2024-01-18 22:14:32 +03:00 |
Pierrick Bouvier
|
a4992e09bc
|
Fix utest compilation
Introduced recently when adding new test cases for ZSCAL
- include cblas is needed for cblas_zscal
- ASSERT macro does not exist
- missing closing )
|
2024-01-18 18:21:30 +04:00 |
Martin Kroeker
|
6f0e0e4021
|
Merge pull request #4438 from Dirreke/csky-support
Add CSKY support
|
2024-01-18 13:04:52 +01:00 |
Martin Kroeker
|
43cb266178
|
Merge pull request #4441 from martin-frbg/gemv-threshold
Increase multithreading threshold for S/DGEMV by a factor of 50
|
2024-01-17 22:25:01 +01:00 |
Martin Kroeker
|
d2fc4f3b4d
|
Increase multithreading threshold by a factor of 50
|
2024-01-17 20:59:24 +01:00 |
Martin Kroeker
|
88e994116c
|
Merge pull request #4354 from imaginationtech/img-rvv-kernel-generator
[RISC-V] Improve RVV kernel generator LMUL usage
|
2024-01-17 15:19:37 +01:00 |
Martin Kroeker
|
ec46ca7a43
|
Support Arm Compiler for Linux as classic flang (#4436)
* Support ArmCompilerforLinux as classic flang
|
2024-01-17 07:33:10 +01:00 |
Martin Kroeker
|
e3508d3713
|
Merge pull request #4439 from sergei-lewis/risc-v
Fix builds with t-head toolchains that use old intrinsics spec
|
2024-01-16 20:35:12 +01:00 |
Dirreke
|
ec89466e14
|
Add CSKY support
|
2024-01-16 23:45:06 +08:00 |
Sergei Lewis
|
9edb805e64
|
fix builds with t-head toolchains that use old versions of the intrinsics spec
|
2024-01-16 14:33:08 +00:00 |
Martin Kroeker
|
452741b67f
|
Merge pull request #4435 from imciner2/im/sapphire
Fix Clang sapphire rapids march flag
|
2024-01-16 13:57:29 +01:00 |
Ian McInerney
|
8f4e325ea8
|
Fix Clang sapphire rapids march flag
|
2024-01-15 23:42:03 +00:00 |
Martin Kroeker
|
13c764eaaa
|
Merge pull request #4434 from martin-frbg/issue4433
Only use mtune=native in ARM64 fallback paths when not cross-compiling
|
2024-01-15 23:36:07 +01:00 |
Martin Kroeker
|
025a1b2c7b
|
Only use mtune=native when not cross-compiling
|
2024-01-15 22:40:21 +01:00 |
Martin Kroeker
|
2527afaaa2
|
Merge pull request #4429 from martin-frbg/issue4428
Handle NAN and INF in ARM and generic/s390x ZSCAL
|
2024-01-15 11:26:12 +01:00 |
Martin Kroeker
|
0d2e486edf
|
Handle NAN and INF
|
2024-01-15 11:18:59 +01:00 |
Martin Kroeker
|
a782103b9c
|
Merge pull request #4425 from martin-frbg/issue2392
Add BLAS extension openblas_set_num_threads_local()
|
2024-01-14 21:57:57 +01:00 |
Martin Kroeker
|
152a6c43b6
|
Add blas_omp_threads_local
|
2024-01-14 19:59:55 +01:00 |
Martin Kroeker
|
8a9d492af7
|
Add default for blas_omp_threads_local
|
2024-01-14 19:58:49 +01:00 |
Martin Kroeker
|
b3341527ad
|
Merge pull request #4426 from martin-frbg/issue4415
Tweak LAPACK tests for SGS/DGS to avoid spurious errors resulting from FMA-induced inaccuracies
|
2024-01-13 23:27:13 +01:00 |
Martin Kroeker
|
9fab60d32f
|
Remove matrix dimension 6 from SGS to avoid spurious errors from FMA
|
2024-01-13 20:39:05 +01:00 |
Martin Kroeker
|
bf66af3dc0
|
remove matrix dimension 6 from DGS to avoid spurious errors from FMA
|
2024-01-13 20:37:36 +01:00 |
Martin Kroeker
|
87d31af2ae
|
Add openblas_set_num_threads_local()
|
2024-01-13 20:06:24 +01:00 |
Martin Kroeker
|
2e2e538b7c
|
Add openblas_set_num_threads_local() and use of blas_omp_threads_local in OMP parallel regions
|
2024-01-13 20:02:43 +01:00 |
Martin Kroeker
|
f9b2d7f225
|
Merge pull request #3253 from wi24rd/patch-1
Fix typo in common.h
|
2024-01-13 19:55:01 +01:00 |
Martin Kroeker
|
5f5b7c4f45
|
Merge pull request #4423 from martin-frbg/issue4422
Check compiler support for AVX512BF16 and base COL/SPR kernel choice on that
|
2024-01-12 16:30:50 +01:00 |
Martin Kroeker
|
f31bea07dd
|
Merge pull request #4419 from martin-frbg/issue4413
[WIP] Add fixes and utests for ZSCAL with NaN or Inf arguments
|
2024-01-12 14:27:08 +01:00 |
Martin Kroeker
|
20413ee6ec
|
Update zscal.c
|
2024-01-12 13:11:13 +01:00 |
Martin Kroeker
|
b57627c27f
|
Handle NAN and INF
|
2024-01-12 12:03:08 +01:00 |
Martin Kroeker
|
d1ead06bd8
|
define NAN and INFINITY if needed
|
2024-01-12 09:29:13 +01:00 |
Martin Kroeker
|
995a990e24
|
Make AVX512 BFLOAT16 kernels conditional on compiler capability
|
2024-01-12 00:12:46 +01:00 |
Martin Kroeker
|
1dada6d65d
|
Add compiler test and flag for AVX512BF16 capability
|
2024-01-12 00:10:56 +01:00 |
Martin Kroeker
|
7df363e1e2
|
temporarily disable the MSA C/ZSCAL kernels
|
2024-01-12 00:08:52 +01:00 |
Martin Kroeker
|
3599f2de8b
|
Merge pull request #4421 from ChipKerchner/power10Copies_DGEMM
Replace two vector loads with one vector pair load and fix endianness of stores - DGEMM PowerPC versions.
|
2024-01-10 07:49:00 +01:00 |