TGY
|
b5ba95a6c0
|
Modernize obsolete inline order
|
2023-08-16 00:48:40 +02:00 |
Martin Kroeker
|
562ef5fdca
|
Merge pull request #4169 from felixonmars/patch-1
Use defined variable for riscv64 in arch.cmake
|
2023-08-12 17:20:56 +02:00 |
Martin Kroeker
|
0e5d56ae4a
|
Merge pull request #4170 from felixonmars/patch-2
Fix 64-bit fortran options for riscv64
|
2023-08-12 09:21:05 +02:00 |
Martin Kroeker
|
ebc157fcc9
|
Merge pull request #4190 from martin-frbg/issue4186-2
Allow negative INCX in the ?NRM2 kernels
|
2023-08-10 23:12:59 +02:00 |
Martin Kroeker
|
34da1a067d
|
Allow negative INCX (API change from version 3.10 of the reference implementation)
|
2023-08-10 17:01:50 +02:00 |
Martin Kroeker
|
07e32c4cb8
|
Allow negative INCX (API change from version 3.10 of the reference implementation)
|
2023-08-10 17:00:18 +02:00 |
Martin Kroeker
|
c211da0688
|
Allow negative INCX (API change from version 3.10 of the reference implementation)
|
2023-08-10 16:58:57 +02:00 |
Martin Kroeker
|
a34a0a7abc
|
Allow negative INCX (API change from version 3.10 of the reference implementation)
|
2023-08-10 16:56:52 +02:00 |
Martin Kroeker
|
54d3246fc6
|
Allow negative INCX (API change from version 3.10 of the reference implementation)
|
2023-08-10 16:55:17 +02:00 |
Martin Kroeker
|
7dd441d5db
|
Allow negative INCX (API change from version 3.10 of the reference implementation)
|
2023-08-10 16:53:33 +02:00 |
Martin Kroeker
|
f692178792
|
Allow negative INCX (API change from version 3.10 of the reference implementation)
|
2023-08-10 16:52:09 +02:00 |
Martin Kroeker
|
d15ffb7fdf
|
Allow negative INCX (API change from version 3.10 of the reference implementation)
|
2023-08-10 16:50:44 +02:00 |
Martin Kroeker
|
a2d867f4d1
|
Allow negative INCX (API change from version 3.10 of the reference implementation)
|
2023-08-10 16:49:05 +02:00 |
Martin Kroeker
|
9a0e9c8b69
|
Merge pull request #4171 from boomanaiden154/clang-libomp-fixes
Fix build with some clang installations when openmp is enabled
|
2023-08-10 16:32:33 +02:00 |
Martin Kroeker
|
7af0f41762
|
Merge pull request #4189 from martin-frbg/issue4186
Prepare the interface for INCX < 0 in the new NRM2 implementation from BLAS 3.10
|
2023-08-10 14:11:12 +02:00 |
Martin Kroeker
|
4cc804c754
|
Prepare for INCX < 0 in new NRM2 implementation from BLAS 3.10
|
2023-08-09 16:13:23 +02:00 |
Martin Kroeker
|
afdc56a421
|
Merge pull request #4158 from XiWeiGu/loongarch64_update_dgemm_kernel
LoongArch64: Update dgemm kernel
|
2023-08-07 12:44:09 +02:00 |
Martin Kroeker
|
91e5513f3b
|
Merge pull request #4184 from XiWeiGu/dgemv
LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S V2
|
2023-08-07 08:47:19 +02:00 |
gxw
|
e8b571d245
|
LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S V2
|
2023-08-07 11:20:42 +08:00 |
gxw
|
71fcee6eef
|
LoongArch64: Update dgemm kernel
|
2023-08-07 11:06:52 +08:00 |
Martin Kroeker
|
0f521ece25
|
Merge pull request #4183 from martin-frbg/issue4181
Apply USE_TRMM to MIPS64_GENERIC as to GENERIC in gmake builds
|
2023-08-06 18:59:50 +02:00 |
Martin Kroeker
|
232420bdf5
|
Merge pull request #4182 from xianyi/revert-4153-dgemv
Revert "LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S"
|
2023-08-06 16:00:32 +02:00 |
Martin Kroeker
|
41c31bc1d4
|
Revert "LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S"
|
2023-08-06 16:00:03 +02:00 |
Martin Kroeker
|
61d803547a
|
Apply USE_TRMM to MIPS64_GENERIC as to GENERIC
|
2023-08-06 15:17:38 +02:00 |
Martin Kroeker
|
f8ee309402
|
Merge pull request #4153 from XiWeiGu/dgemv
LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S
|
2023-08-06 08:49:16 +02:00 |
Martin Kroeker
|
12e98482e9
|
Merge pull request #4179 from martin-frbg/jenkinsfix
Run "make clean" on Jenkins first to remove stale objects
|
2023-08-05 22:47:26 +02:00 |
Martin Kroeker
|
51c218d17a
|
Update Jenkinsfile
|
2023-08-05 18:33:15 +02:00 |
Martin Kroeker
|
df978c90cd
|
Update Jenkinsfile.pwr
|
2023-08-05 18:32:41 +02:00 |
Martin Kroeker
|
ef4a7e3fca
|
Merge pull request #4127 from XiWeiGu/LoongArch64-CI
LoongArch64 CI
|
2023-08-05 18:19:47 +02:00 |
Martin Kroeker
|
b63e4581a3
|
Merge pull request #4016 from mmuetzel/ci-msys2
Add support for LLVM Flang
|
2023-08-05 15:59:34 +02:00 |
Markus Mützel
|
53378296c8
|
CI: Build with NO_AVX512 for the runners that use Flang 16.
|
2023-08-05 13:47:38 +02:00 |
Markus Mützel
|
1c3fcaaf42
|
CI (MSYS2): Re-run failed tests verbosely.
|
2023-08-05 13:16:06 +02:00 |
Markus Mützel
|
f334bd9041
|
CI (MSYS2): Use LLVM Flang on CLANG64 runners. Add CLANG32 runner.
|
2023-08-05 13:16:06 +02:00 |
Markus Mützel
|
57256623f4
|
fc.cmake: Add support for LLVM Flang.
|
2023-08-05 13:16:06 +02:00 |
gxw
|
ec1e96aac8
|
LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S
|
2023-08-05 10:24:17 +08:00 |
gxw
|
96bf226bca
|
gh-actions: Add loongarch64 CI
|
2023-08-05 10:21:43 +08:00 |
gxw
|
db9a42f8c3
|
LoongArch64: using getauxval to do runtime check
Using the getauxval function for the check prevents errors
when the hardware supports vector instructions
but the kernel does not
|
2023-08-05 10:21:43 +08:00 |
gxw
|
d46772e037
|
LoongArch64: Add compiler feature checks
|
2023-08-05 10:21:43 +08:00 |
Martin Kroeker
|
8a171350db
|
Merge pull request #4178 from martin-frbg/llvm17
Add (gmake) support for LLVM17's new flang
|
2023-08-04 20:56:00 +02:00 |
Martin Kroeker
|
ef23240ab8
|
Merge pull request #4177 from martin-frbg/issue4176
Fix ZAXPY calls with INCX=0 on pre-AVX x86_64 and add utest
|
2023-08-04 20:55:22 +02:00 |
Martin Kroeker
|
e8bc8a0ee7
|
Add support for the new generation flang that comes with LLVM17
|
2023-08-04 15:32:19 +02:00 |
Martin Kroeker
|
f2c9ae9c33
|
Identify the new generation of flang that comes with LLVM17
|
2023-08-04 15:31:03 +02:00 |
Martin Kroeker
|
862d06ab8a
|
Add INCX=0,INCY=1 test case for CAXPY
|
2023-08-04 15:28:02 +02:00 |
Martin Kroeker
|
d64fa286f7
|
add test case for zaxpy with incx=0 incy=1
|
2023-08-04 12:26:36 +02:00 |
Martin Kroeker
|
4664b57e6e
|
use shortcut only when both incx and incy are zero
|
2023-08-04 12:25:34 +02:00 |
Martin Kroeker
|
c2f4bdbbb4
|
Merge pull request #4163 from martin-frbg/issue4017
Rework OpenMP thread count limit handling
|
2023-07-31 17:58:51 +02:00 |
Martin Kroeker
|
09131f79a6
|
Merge pull request #4164 from martin-frbg/issue4162
Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3
|
2023-07-29 15:07:20 +02:00 |
Martin Kroeker
|
6a428b5629
|
Update casum_microk_skylakex-2.c
|
2023-07-29 12:24:30 +02:00 |
Martin Kroeker
|
ebb447e32e
|
Update zasum_microk_skylakex-2.c
|
2023-07-29 12:23:57 +02:00 |
Martin Kroeker
|
9f6847583a
|
nvc currently miscompiles this, hopefully fixed in release 23.09
|
2023-07-29 11:50:16 +02:00 |