Commit Graph

7349 Commits

Author SHA1 Message Date
TGY b5ba95a6c0 Modernize obsolete inline order 2023-08-16 00:48:40 +02:00
Martin Kroeker 562ef5fdca
Merge pull request #4169 from felixonmars/patch-1
Use defined variable for riscv64 in arch.cmake
2023-08-12 17:20:56 +02:00
Martin Kroeker 0e5d56ae4a
Merge pull request #4170 from felixonmars/patch-2
Fix 64-bit fortran options for riscv64
2023-08-12 09:21:05 +02:00
Martin Kroeker ebc157fcc9
Merge pull request #4190 from martin-frbg/issue4186-2
Allow negative INCX in the ?NRM2 kernels
2023-08-10 23:12:59 +02:00
Martin Kroeker 34da1a067d Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 17:01:50 +02:00
Martin Kroeker 07e32c4cb8 Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 17:00:18 +02:00
Martin Kroeker c211da0688 Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:58:57 +02:00
Martin Kroeker a34a0a7abc Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:56:52 +02:00
Martin Kroeker 54d3246fc6 Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:55:17 +02:00
Martin Kroeker 7dd441d5db Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:53:33 +02:00
Martin Kroeker f692178792 Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:52:09 +02:00
Martin Kroeker d15ffb7fdf Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:50:44 +02:00
Martin Kroeker a2d867f4d1 Allow negative iNCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:49:05 +02:00
Martin Kroeker 9a0e9c8b69
Merge pull request #4171 from boomanaiden154/clang-libomp-fixes
Fix build with some clang installations when openmp is enabled
2023-08-10 16:32:33 +02:00
Martin Kroeker 7af0f41762
Merge pull request #4189 from martin-frbg/issue4186
Prepare the interface for INCX < 0 in the new NRM2 implementation from BLAS 3.10
2023-08-10 14:11:12 +02:00
Martin Kroeker 4cc804c754 Prepare for INCX < 0 in new NRM2 implementation from BLAS 3.10 2023-08-09 16:13:23 +02:00
Martin Kroeker afdc56a421
Merge pull request #4158 from XiWeiGu/loongarch64_update_dgemm_kernel
LoongArch64: Update dgemm kernel
2023-08-07 12:44:09 +02:00
Martin Kroeker 91e5513f3b
Merge pull request #4184 from XiWeiGu/dgemv
LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S V2
2023-08-07 08:47:19 +02:00
gxw e8b571d245 LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S V2 2023-08-07 11:20:42 +08:00
gxw 71fcee6eef LoongArch64: Update dgemm kernel 2023-08-07 11:06:52 +08:00
Martin Kroeker 0f521ece25
Merge pull request #4183 from martin-frbg/issue4181
Apply USE_TRMM to MIPS64_GENERIC as to GENERIC in gmake builds
2023-08-06 18:59:50 +02:00
Martin Kroeker 232420bdf5
Merge pull request #4182 from xianyi/revert-4153-dgemv
Revert "LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S"
2023-08-06 16:00:32 +02:00
Martin Kroeker 41c31bc1d4 Revert "LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S" 2023-08-06 16:00:03 +02:00
Martin Kroeker 61d803547a Apply USE_TRMM to MIPS64_GENERIC as to GENERIC 2023-08-06 15:17:38 +02:00
Martin Kroeker f8ee309402
Merge pull request #4153 from XiWeiGu/dgemv
LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S
2023-08-06 08:49:16 +02:00
Martin Kroeker 12e98482e9
Merge pull request #4179 from martin-frbg/jenkinsfix
Run "make clean" on Jenkins first to remove stale objects
2023-08-05 22:47:26 +02:00
Martin Kroeker 51c218d17a Update Jenkinsfile 2023-08-05 18:33:15 +02:00
Martin Kroeker df978c90cd Update Jenkinsfile.pwr 2023-08-05 18:32:41 +02:00
Martin Kroeker ef4a7e3fca
Merge pull request #4127 from XiWeiGu/LoongArch64-CI
LoongArch64 CI
2023-08-05 18:19:47 +02:00
Martin Kroeker b63e4581a3
Merge pull request #4016 from mmuetzel/ci-msys2
Add support for LLVM Flang
2023-08-05 15:59:34 +02:00
Markus Mützel 53378296c8 CI: Build with NO_AVX512 for the runners that use Flang 16. 2023-08-05 13:47:38 +02:00
Markus Mützel 1c3fcaaf42 CI (MSYS2): Re-run failed tests verbosely. 2023-08-05 13:16:06 +02:00
Markus Mützel f334bd9041 CI (MSYS2): Use LLVM Flang on CLANG64 runners. Add CLANG32 runner. 2023-08-05 13:16:06 +02:00
Markus Mützel 57256623f4 fc.cmake: Add support for LLVM Flang. 2023-08-05 13:16:06 +02:00
gxw ec1e96aac8 LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S 2023-08-05 10:24:17 +08:00
gxw 96bf226bca gh-actions: Add loongarch64 CI 2023-08-05 10:21:43 +08:00
gxw db9a42f8c3 LoongArch64: using getauxval to do runtime check
Using the getauxval instruction can prevent errors
caused by hardware supporting vector instructions
while the kernel does not support them
2023-08-05 10:21:43 +08:00
gxw d46772e037 LoongArch64: Add compiler feature checks 2023-08-05 10:21:43 +08:00
Martin Kroeker 8a171350db
Merge pull request #4178 from martin-frbg/llvm17
Add (gmake) support for LLVM17's new flang
2023-08-04 20:56:00 +02:00
Martin Kroeker ef23240ab8
Merge pull request #4177 from martin-frbg/issue4176
Fix ZAXPY calls with INCX=0 on pre-AVX x86_64 and add utest
2023-08-04 20:55:22 +02:00
Martin Kroeker e8bc8a0ee7 Add support for the new generation flang that comes with LLVM17 2023-08-04 15:32:19 +02:00
Martin Kroeker f2c9ae9c33 Identify the new generation of flang that comes with LLVM17 2023-08-04 15:31:03 +02:00
Martin Kroeker 862d06ab8a Add INCX=0,INCY=1 test case for CAXPY 2023-08-04 15:28:02 +02:00
Martin Kroeker d64fa286f7 add test case for zaxpy with incx=0 incy=1 2023-08-04 12:26:36 +02:00
Martin Kroeker 4664b57e6e use shortcut only when both incx and incy are zero 2023-08-04 12:25:34 +02:00
Martin Kroeker c2f4bdbbb4
Merge pull request #4163 from martin-frbg/issue4017
Rework OpenMP thread count limit handling
2023-07-31 17:58:51 +02:00
Martin Kroeker 09131f79a6
Merge pull request #4164 from martin-frbg/issue4162
Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3
2023-07-29 15:07:20 +02:00
Martin Kroeker 6a428b5629 Update casum_microk_skylakex-2.c 2023-07-29 12:24:30 +02:00
Martin Kroeker ebb447e32e Update zasum_microk_skylakex-2.c 2023-07-29 12:23:57 +02:00
Martin Kroeker 9f6847583a nvc currently miscompiles this, hopefully fixed in release 23.09 2023-07-29 11:50:16 +02:00