Commit Graph

7367 Commits

Author SHA1 Message Date
Martin Kroeker
3bb70b8ca4 Merge pull request #4205 from martin-frbg/fixintmain
Fix missing type declaration for main() in converted LAPACK files
2023-08-26 08:38:38 +02:00
Martin Kroeker
88435104c8 Merge pull request #4204 from martin-frbg/llvm17-2
Work around LLVM17 miscompiling the AVX512 microkernels for CASUM/ZASUM
2023-08-26 00:32:18 +02:00
Martin Kroeker
be57c595aa Merge pull request #4203 from martin-frbg/issue4201
Add support for building arm64 SVE kernels with the NVIDIA HPC compiler
2023-08-25 22:55:38 +02:00
Martin Kroeker
7a6203ffa1 restore default Neoverse SVE build instructions for non-NVIDIA compilers 2023-08-25 18:25:51 +02:00
Martin Kroeker
7f7d3896dd Fix missing type declaration for main 2023-08-25 18:07:47 +02:00
Martin Kroeker
2c3034ff7f Disable the C/ZASUM AVX512 microkernels when compiling with LLVM17 as well 2023-08-25 17:22:51 +02:00
Martin Kroeker
49689fbef7 Add support for compiling SVE kernels with the NVIDIA HPC compiler 2023-08-25 17:11:04 +02:00
Martin Kroeker
8794544b43 Add support for compiling the Neoverse SVE kernels with the NVIDIA HPC compiler 2023-08-25 16:47:32 +02:00
Martin Kroeker
e9f1b2d26f Expand the SVE compatibility check for the NVIDIA HPC compiler 2023-08-25 16:45:56 +02:00
Martin Kroeker
d69f57c8c2 Merge pull request #4200 from XiWeiGu/loongarch64_sgemm
LoongArch64: Add sgemm_kernel
2023-08-23 13:05:34 +02:00
gxw
553cc1372f LoongArch64: Add sgemm_kernel 2023-08-23 16:08:43 +08:00
Martin Kroeker
12ede72ab7 Merge pull request #4192 from imciner2/im/clangfix
Fix cooperlake and sapphire rapids march flags on clang
2023-08-21 15:46:35 +02:00
Martin Kroeker
8d9f701fbf Merge pull request #4195 from TiborGY/BF16_ignore
Add junk from BF16 test to .gitignore
2023-08-19 12:16:44 +02:00
Martin Kroeker
7f67ba9147 Merge pull request #4198 from martin-frbg/issue4197
Correct INFO returned for too small lda in non-CBLAS s/dgeadd
2023-08-19 07:51:51 +02:00
Martin Kroeker
214be14c1d Correct INFO returned for lda in non-CBLAS s/dgeadd 2023-08-18 22:48:30 +02:00
Martin Kroeker
1b09f4b2bb Merge pull request #4193 from imciner2/im/ppcgnu
Fix power10 gcc intrinsic check
2023-08-17 22:56:08 +02:00
Ian McInerney
79c15db348 Fix power10 gcc intrinsic check
__builtin_vsx_assemble_pair was only in GCC 10-11.2 and was replaced by
__builtin_vsx_build_pair thereafter.
2023-08-17 15:05:29 +01:00
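A minimal sketch of the version check this commit message describes, written in Python for illustration rather than in the build system's own language. The function name `vsx_pair_builtin` and the tuple-based version comparison are hypothetical. Only the two builtin names and the GCC 10 to 11.2 range come from the message above.

```python
def vsx_pair_builtin(gcc_version):
    """Pick the pair-building builtin for a (major, minor) GCC version.

    Hypothetical helper: the real check lives in the build scripts and may
    be expressed differently.
    """
    if gcc_version < (10, 0):
        # Assumption: neither builtin is available before GCC 10.
        return None
    if gcc_version <= (11, 2):
        # Per the commit message, this name existed only in GCC 10 to 11.2.
        return "__builtin_vsx_assemble_pair"
    # Later releases use the replacement name.
    return "__builtin_vsx_build_pair"
```

For example, `vsx_pair_builtin((11, 2))` selects the older name, while `vsx_pair_builtin((12, 1))` selects the newer one.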
TiborGY
0d30daa772 Add junk from BF16 test to .gitignore 2023-08-16 00:07:17 +02:00
Ian McInerney
8a8a8479be Fix cooperlake and sapphire rapids march flags on clang
The march=cooperlake and march=sapphirerapids flags were never getting
added when building with Clang targeting those architectures. Instead
it was falling back to the skylake AVX512 implementation.

Clang added support for these two architectures in Clang 9 and Clang 12,
so introduce new checks for those versions to enable the appropriate
march flag, and fall back to skylake otherwise.
2023-08-14 16:12:35 +01:00
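The selection logic described in this commit message can be sketched as follows. This is an illustration in Python, not the actual build-system code: the function name `march_flag`, the target strings, and the exact fallback flag `-march=skylake-avx512` are assumptions. The version thresholds (Clang 9 for cooperlake, Clang 12 for sapphirerapids) are taken from the message.

```python
# Minimum Clang major version that accepts each -march value (from the
# commit message). The target spellings used as keys are assumptions.
NATIVE_FLAGS = {
    "COOPERLAKE": (9, "-march=cooperlake"),
    "SAPPHIRERAPIDS": (12, "-march=sapphirerapids"),
}

# Assumption: the "skylake" fallback corresponds to this flag.
FALLBACK_FLAG = "-march=skylake-avx512"


def march_flag(target, clang_major):
    """Return the -march flag to use for a target and Clang major version."""
    min_major, flag = NATIVE_FLAGS[target]
    # Use the native flag only when the compiler is new enough to know it.
    return flag if clang_major >= min_major else FALLBACK_FLAG
```

With this sketch, `march_flag("SAPPHIRERAPIDS", 11)` falls back to the skylake flag, while `march_flag("SAPPHIRERAPIDS", 12)` returns the native one.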
Martin Kroeker
562ef5fdca Merge pull request #4169 from felixonmars/patch-1
Use defined variable for riscv64 in arch.cmake
2023-08-12 17:20:56 +02:00
Martin Kroeker
0e5d56ae4a Merge pull request #4170 from felixonmars/patch-2
Fix 64-bit fortran options for riscv64
2023-08-12 09:21:05 +02:00
Martin Kroeker
ebc157fcc9 Merge pull request #4190 from martin-frbg/issue4186-2
Allow negative INCX in the ?NRM2 kernels
2023-08-10 23:12:59 +02:00
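To illustrate what accepting a negative INCX means for a norm routine, here is a small Python sketch. It assumes the usual BLAS convention for negative increments, where traversal starts at the far end of the strided range, so the same elements are visited in reverse order. The function `nrm2` below is a hypothetical stand-in, not the OpenBLAS kernel: it uses a plain sum of squares and omits the scaling that a production implementation needs to avoid overflow and underflow.

```python
import math


def nrm2(n, x, incx):
    """Euclidean norm of n elements of x taken with stride incx.

    Illustrative only: no overflow/underflow scaling is performed.
    """
    if n < 1:
        return 0.0
    if incx == 0:
        # A zero stride is outside the scope of this sketch.
        raise ValueError("incx == 0 is not handled here")
    # For a negative stride, start at the last element of the strided range.
    start = 0 if incx > 0 else (n - 1) * (-incx)
    values = (x[start + i * incx] for i in range(n))
    return math.sqrt(math.fsum(v * v for v in values))
```

Because the same elements are visited either way, `nrm2(2, [3.0, 0.0, 4.0], -2)` and `nrm2(2, [3.0, 0.0, 4.0], 2)` give the same result.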
Martin Kroeker
34da1a067d Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 17:01:50 +02:00
Martin Kroeker
07e32c4cb8 Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 17:00:18 +02:00
Martin Kroeker
c211da0688 Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:58:57 +02:00
Martin Kroeker
a34a0a7abc Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:56:52 +02:00
Martin Kroeker
54d3246fc6 Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:55:17 +02:00
Martin Kroeker
7dd441d5db Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:53:33 +02:00
Martin Kroeker
f692178792 Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:52:09 +02:00
Martin Kroeker
d15ffb7fdf Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:50:44 +02:00
Martin Kroeker
a2d867f4d1 Allow negative INCX (API change from version 3.10 of the reference implementation) 2023-08-10 16:49:05 +02:00
Martin Kroeker
9a0e9c8b69 Merge pull request #4171 from boomanaiden154/clang-libomp-fixes
Fix build with some clang installations when openmp is enabled
2023-08-10 16:32:33 +02:00
Martin Kroeker
7af0f41762 Merge pull request #4189 from martin-frbg/issue4186
Prepare the interface for INCX < 0 in the new NRM2 implementation from BLAS 3.10
2023-08-10 14:11:12 +02:00
Martin Kroeker
4cc804c754 Prepare for INCX < 0 in new NRM2 implementation from BLAS 3.10 2023-08-09 16:13:23 +02:00
Martin Kroeker
afdc56a421 Merge pull request #4158 from XiWeiGu/loongarch64_update_dgemm_kernel
LoongArch64: Update dgemm kernel
2023-08-07 12:44:09 +02:00
Martin Kroeker
91e5513f3b Merge pull request #4184 from XiWeiGu/dgemv
LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S V2
2023-08-07 08:47:19 +02:00
gxw
e8b571d245 LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S V2 2023-08-07 11:20:42 +08:00
gxw
71fcee6eef LoongArch64: Update dgemm kernel 2023-08-07 11:06:52 +08:00
Martin Kroeker
0f521ece25 Merge pull request #4183 from martin-frbg/issue4181
Apply USE_TRMM to MIPS64_GENERIC as to GENERIC in gmake builds
2023-08-06 18:59:50 +02:00
Martin Kroeker
232420bdf5 Merge pull request #4182 from xianyi/revert-4153-dgemv
Revert "LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S"
2023-08-06 16:00:32 +02:00
Martin Kroeker
41c31bc1d4 Revert "LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S" 2023-08-06 16:00:03 +02:00
Martin Kroeker
61d803547a Apply USE_TRMM to MIPS64_GENERIC as to GENERIC 2023-08-06 15:17:38 +02:00
Martin Kroeker
f8ee309402 Merge pull request #4153 from XiWeiGu/dgemv
LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S
2023-08-06 08:49:16 +02:00
Martin Kroeker
12e98482e9 Merge pull request #4179 from martin-frbg/jenkinsfix
Run "make clean" on Jenkins first to remove stale objects
2023-08-05 22:47:26 +02:00
Martin Kroeker
51c218d17a Update Jenkinsfile 2023-08-05 18:33:15 +02:00
Martin Kroeker
df978c90cd Update Jenkinsfile.pwr 2023-08-05 18:32:41 +02:00
Martin Kroeker
ef4a7e3fca Merge pull request #4127 from XiWeiGu/LoongArch64-CI
LoongArch64 CI
2023-08-05 18:19:47 +02:00
Martin Kroeker
b63e4581a3 Merge pull request #4016 from mmuetzel/ci-msys2
Add support for LLVM Flang
2023-08-05 15:59:34 +02:00
Markus Mützel
53378296c8 CI: Build with NO_AVX512 for the runners that use Flang 16. 2023-08-05 13:47:38 +02:00
Markus Mützel
1c3fcaaf42 CI (MSYS2): Re-run failed tests verbosely. 2023-08-05 13:16:06 +02:00