Commit Graph

  • 9f815cf1bf Update version to 0.3.24 (tag: v0.3.24) Martin Kroeker 2023-09-03 22:58:32 +02:00
  • 3c49711f1e Update version to 0.3.24 Martin Kroeker 2023-09-03 22:57:22 +02:00
  • 2c68822cde Merge pull request #4210 from xianyi/develop Martin Kroeker 2023-09-03 22:55:22 +02:00
  • 3c51bd0fbf Merge pull request #4209 from martin-frbg/changelog0324 Martin Kroeker 2023-09-03 22:51:03 +02:00
  • 5d73041068 Update Changelog for 0.3.24 Martin Kroeker 2023-09-03 19:05:53 +02:00
  • 8e6d93359d Merge pull request #4196 from TiborGY/obsolete_inlines Martin Kroeker 2023-09-03 14:12:42 +02:00
  • 33797c44fc Merge pull request #4143 from martin-frbg/issue4130 Martin Kroeker 2023-09-01 14:20:25 +02:00
  • ee310e3533 Merge pull request #4208 from XiWeiGu/loongarch64_toolchain Martin Kroeker 2023-09-01 10:50:01 +02:00
  • 42909ce57d Merge branch 'xianyi:develop' into issue4130 Martin Kroeker 2023-09-01 09:05:58 +02:00
  • a2a184572c update zrotg Martin Kroeker 2023-08-31 23:42:12 +02:00
  • 394a1fd1bf LoongArch64: Compatible with early internal toolchain gxw 2023-08-31 15:44:22 +08:00
  • 12d8f219d6 Merge pull request #4207 from martin-frbg/issue4174-2 Martin Kroeker 2023-08-26 12:05:37 +02:00
  • 9c4ae4d4fb Merge pull request #4206 from martin-frbg/issue4201-2 Martin Kroeker 2023-08-26 10:17:27 +02:00
  • 3bb70b8ca4 Merge pull request #4205 from martin-frbg/fixintmain Martin Kroeker 2023-08-26 08:38:38 +02:00
  • 3b6050ac04 clarify the comment on the out-of-bounds check from #723 Martin Kroeker 2023-08-26 02:00:00 +02:00
  • 22a402bc2c clarify the comment on the out-of-bounds check from #723 Martin Kroeker 2023-08-26 01:58:08 +02:00
  • 88435104c8 Merge pull request #4204 from martin-frbg/llvm17-2 Martin Kroeker 2023-08-26 00:32:18 +02:00
  • fc8894dd98 Workaround miscompilation by NVIDIA nvc Martin Kroeker 2023-08-26 00:30:17 +02:00
  • be57c595aa Merge pull request #4203 from martin-frbg/issue4201 Martin Kroeker 2023-08-25 22:55:38 +02:00
  • 7a6203ffa1 restore default Neoverse SVE build instructions for non-NVIDIA compilers Martin Kroeker 2023-08-25 18:25:51 +02:00
  • 7f7d3896dd Fix missing type declaration for main Martin Kroeker 2023-08-25 18:07:47 +02:00
  • 2c3034ff7f Disable the C/ZASUM AVX512 microkernels when compiling with LLVM17 as well Martin Kroeker 2023-08-25 17:22:51 +02:00
  • 49689fbef7 Add support for compiling SVE kernels with the NVIDIA HPC compiler Martin Kroeker 2023-08-25 17:11:04 +02:00
  • 8794544b43 Add support for compiling the Neoverse SVE kernels with the NVIDIA HPC compiler Martin Kroeker 2023-08-25 16:47:32 +02:00
  • e9f1b2d26f Expand the SVE compatibility check for the NVIDIA HPC compiler Martin Kroeker 2023-08-25 16:45:56 +02:00
  • d69f57c8c2 Merge pull request #4200 from XiWeiGu/loongarch64_sgemm Martin Kroeker 2023-08-23 13:05:34 +02:00
  • 553cc1372f LoongArch64: Add sgemm_kernel gxw 2023-08-18 17:39:44 +08:00
  • 12ede72ab7 Merge pull request #4192 from imciner2/im/clangfix Martin Kroeker 2023-08-21 15:46:35 +02:00
  • 8d9f701fbf Merge pull request #4195 from TiborGY/BF16_ignore Martin Kroeker 2023-08-19 12:16:44 +02:00
  • 7f67ba9147 Merge pull request #4198 from martin-frbg/issue4197 Martin Kroeker 2023-08-19 07:51:51 +02:00
  • 214be14c1d Correct INFO returned for lda in non-CBLAS s/dgeadd Martin Kroeker 2023-08-18 22:48:30 +02:00
  • 1b09f4b2bb Merge pull request #4193 from imciner2/im/ppcgnu Martin Kroeker 2023-08-17 22:56:08 +02:00
  • 79c15db348 Fix power10 gcc intrinsic check Ian McInerney 2023-08-14 21:36:35 +01:00
  • b5ba95a6c0 Modernize obsolete inline order TGY 2023-08-16 00:48:40 +02:00
  • 0d30daa772 Add junk from BF16 test to .gitignore TiborGY 2023-08-16 00:07:17 +02:00
  • 8a8a8479be Fix cooperlake and sapphire rapids march flags on clang Ian McInerney 2023-08-14 15:41:28 +01:00
  • 562ef5fdca Merge pull request #4169 from felixonmars/patch-1 Martin Kroeker 2023-08-12 17:20:56 +02:00
  • 0e5d56ae4a Merge pull request #4170 from felixonmars/patch-2 Martin Kroeker 2023-08-12 09:21:05 +02:00
  • ebc157fcc9 Merge pull request #4190 from martin-frbg/issue4186-2 Martin Kroeker 2023-08-10 23:12:59 +02:00
  • 34da1a067d Allow negative INCX (API change from version 3.10 of the reference implementation) Martin Kroeker 2023-08-10 17:01:50 +02:00
  • 07e32c4cb8 Allow negative INCX (API change from version 3.10 of the reference implementation) Martin Kroeker 2023-08-10 17:00:18 +02:00
  • c211da0688 Allow negative INCX (API change from version 3.10 of the reference implementation) Martin Kroeker 2023-08-10 16:58:57 +02:00
  • a34a0a7abc Allow negative INCX (API change from version 3.10 of the reference implementation) Martin Kroeker 2023-08-10 16:56:52 +02:00
  • 54d3246fc6 Allow negative INCX (API change from version 3.10 of the reference implementation) Martin Kroeker 2023-08-10 16:55:17 +02:00
  • 7dd441d5db Allow negative INCX (API change from version 3.10 of the reference implementation) Martin Kroeker 2023-08-10 16:53:33 +02:00
  • f692178792 Allow negative INCX (API change from version 3.10 of the reference implementation) Martin Kroeker 2023-08-10 16:52:09 +02:00
  • d15ffb7fdf Allow negative INCX (API change from version 3.10 of the reference implementation) Martin Kroeker 2023-08-10 16:50:44 +02:00
  • a2d867f4d1 Allow negative iNCX (API change from version 3.10 of the reference implementation) Martin Kroeker 2023-08-10 16:49:05 +02:00
  • 9a0e9c8b69 Merge pull request #4171 from boomanaiden154/clang-libomp-fixes Martin Kroeker 2023-08-10 16:32:33 +02:00
  • 7af0f41762 Merge pull request #4189 from martin-frbg/issue4186 Martin Kroeker 2023-08-10 14:11:12 +02:00
  • 4cc804c754 Prepare for INCX < 0 in new NRM2 implementation from BLAS 3.10 Martin Kroeker 2023-08-09 16:13:23 +02:00
  • 4d0f000db6 MIPS: Enable MSA gxw 2023-08-07 16:55:59 +08:00
  • afdc56a421 Merge pull request #4158 from XiWeiGu/loongarch64_update_dgemm_kernel Martin Kroeker 2023-08-07 12:44:09 +02:00
  • 91e5513f3b Merge pull request #4184 from XiWeiGu/dgemv Martin Kroeker 2023-08-07 08:47:19 +02:00
  • e8b571d245 LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S V2 gxw 2023-07-11 10:01:12 +08:00
  • 71fcee6eef LoongArch64: Update dgemm kernel gxw 2023-06-29 11:11:08 +08:00
  • 0f521ece25 Merge pull request #4183 from martin-frbg/issue4181 Martin Kroeker 2023-08-06 18:59:50 +02:00
  • 232420bdf5 Merge pull request #4182 from xianyi/revert-4153-dgemv Martin Kroeker 2023-08-06 16:00:32 +02:00
  • 41c31bc1d4 Revert "LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S" (branch: revert-4153-dgemv) Martin Kroeker 2023-08-06 16:00:03 +02:00
  • 61d803547a Apply USE_TRMM to MIPS64_GENERIC as to GENERIC Martin Kroeker 2023-08-06 15:17:38 +02:00
  • f8ee309402 Merge pull request #4153 from XiWeiGu/dgemv Martin Kroeker 2023-08-06 08:49:16 +02:00
  • 12e98482e9 Merge pull request #4179 from martin-frbg/jenkinsfix Martin Kroeker 2023-08-05 22:47:26 +02:00
  • 51c218d17a Update Jenkinsfile Martin Kroeker 2023-08-05 18:33:15 +02:00
  • df978c90cd Update Jenkinsfile.pwr Martin Kroeker 2023-08-05 18:32:41 +02:00
  • ef4a7e3fca Merge pull request #4127 from XiWeiGu/LoongArch64-CI Martin Kroeker 2023-08-05 18:19:47 +02:00
  • b63e4581a3 Merge pull request #4016 from mmuetzel/ci-msys2 Martin Kroeker 2023-08-05 15:59:34 +02:00
  • 53378296c8 CI: Build with NO_AVX512 for the runners that use Flang 16. Markus Mützel 2023-08-05 13:47:38 +02:00
  • 1c3fcaaf42 CI (MSYS2): Re-run failed tests verbosely. Markus Mützel 2023-04-24 18:32:03 +02:00
  • f334bd9041 CI (MSYS2): Use LLVM Flang on CLANG64 runners. Add CLANG32 runner. Markus Mützel 2023-04-21 10:36:21 +02:00
  • 57256623f4 fc.cmake: Add support for LLVM Flang. Markus Mützel 2023-04-21 10:20:54 +02:00
  • ec1e96aac8 LoongArch64: Add dgemv_t_8_lasx.S and dgemv_n_8_lasx.S gxw 2023-07-11 10:01:12 +08:00
  • 96bf226bca gh-actions: Add loongarch64 CI gxw 2023-06-30 17:34:08 +08:00
  • db9a42f8c3 LoongArch64: using getauxval to do runtime check gxw 2023-06-30 16:31:47 +08:00
  • d46772e037 LoongArch64: Add compiler feature checks gxw 2023-06-30 16:19:38 +08:00
  • 8a171350db Merge pull request #4178 from martin-frbg/llvm17 Martin Kroeker 2023-08-04 20:56:00 +02:00
  • ef23240ab8 Merge pull request #4177 from martin-frbg/issue4176 Martin Kroeker 2023-08-04 20:55:22 +02:00
  • e8bc8a0ee7 Add support for the new generation flang that comes with LLVM17 Martin Kroeker 2023-08-04 15:32:19 +02:00
  • f2c9ae9c33 Identify the new generation of flang that comes with LLVM17 Martin Kroeker 2023-08-04 15:31:03 +02:00
  • 862d06ab8a Add INCX=0,INCY=1 test case for CAXPY Martin Kroeker 2023-08-04 15:28:02 +02:00
  • d64fa286f7 add test case for zaxpy with incx=0 incy=1 Martin Kroeker 2023-08-04 12:26:36 +02:00
  • 4664b57e6e use shortcut only when both incx and incy are zero Martin Kroeker 2023-08-04 12:25:34 +02:00
  • c2f4bdbbb4 Merge pull request #4163 from martin-frbg/issue4017 Martin Kroeker 2023-07-31 17:58:51 +02:00
  • 09131f79a6 Merge pull request #4164 from martin-frbg/issue4162 Martin Kroeker 2023-07-29 15:07:20 +02:00
  • 6a428b5629 Update casum_microk_skylakex-2.c Martin Kroeker 2023-07-29 12:24:30 +02:00
  • ebb447e32e Update zasum_microk_skylakex-2.c Martin Kroeker 2023-07-29 12:23:57 +02:00
  • 9f6847583a nvc currently miscompiles this, hopefully fixed in release 23.09 Martin Kroeker 2023-07-29 11:50:16 +02:00
  • fe54ee3d15 nvc currently miscompiles this, hopefully fixed in release 23.09 Martin Kroeker 2023-07-29 11:48:38 +02:00
  • b209915121 Fix build with clang Aiden Grossman 2023-07-25 15:10:50 -07:00
  • f5506b002c Add 64-bit flag on INTERFACE64 only Felix Yan 2023-07-28 16:19:14 +03:00
  • 4ed6414c17 Fix 64-bit fortran options for riscv64 Felix Yan 2023-07-28 04:53:27 +03:00
  • 007cd834c1 Use defined variable for riscv64 in arch.cmake Felix Yan 2023-07-28 04:50:16 +03:00
  • 5720fa02c5 Merge pull request #4168 from Mousius/sve-zgemm-cgemm Martin Kroeker 2023-07-27 17:41:45 +02:00
  • b3a5144a74 Merge pull request #4167 from Mousius/sve-zhemm-fix Martin Kroeker 2023-07-27 16:20:55 +02:00
  • 84a268b6ca Use SVE zgemm/cgemm on Arm(R) Neoverse(TM) V1 core Chris Sidebottom 2023-07-27 10:55:34 +01:00
  • 730ca04b48 Fix ZHEMM copy for SVE Chris Sidebottom 2023-07-27 13:27:28 +01:00
  • 9ba9c8bdc0 Merge pull request #4165 from rgommers/docs-packaging-and-ilp64 Martin Kroeker 2023-07-27 10:36:24 +02:00
  • ee72575475 Add documentation on redistributing OpenBLAS Ralf Gommers 2023-07-26 13:00:07 +02:00
  • 2a62d2df96 Enable use of AVX512 microkernels with NVIDIA HPC from version 22.3 Martin Kroeker 2023-07-26 19:39:11 +02:00
  • 76aa6bac4d Fix cirun url [skip actions] steppi 2023-07-26 12:01:12 -04:00
  • 849c8806b8 Merge pull request #4161 from Mousius/non-sve-kernels Martin Kroeker 2023-07-26 15:49:40 +02:00