Commit Graph

8564 Commits

Author SHA1 Message Date
Martin Kroeker
d11e734002 Merge pull request #4827 from Mousius/a64fx-gcc11
Fix GCC11 check for A64FX target
2024-07-29 16:36:13 +02:00
Chris Sidebottom
54ce33e851 Fix GCC11 check for A64FX target 2024-07-29 15:28:59 +01:00
Martin Kroeker
6d071f1a1c Merge pull request #4826 from Mousius/a64fx-fallback
Add fallback compile options for A64FX target
2024-07-29 15:33:43 +02:00
Chris Sidebottom
3ed226d3f8 Re-add ISCLANG filter 2024-07-29 11:32:59 +01:00
Chris Sidebottom
85ca003ae7 Add fallback compile options for A64FX target 2024-07-29 11:25:03 +01:00
Martin Kroeker
05bf35f296 Merge pull request #4822 from martin-frbg/issue4821
Fix c_check to tolerate a dashed suffix to the gcc version number
2024-07-27 20:28:06 +02:00
Martin Kroeker
175008caf8 harden against a dashed suffix to the gcc version number 2024-07-27 19:08:02 +02:00
Martin Kroeker
886acfc444 Merge pull request #4819 from martin-frbg/issue4776
Re-enable the SGESDD benchmark after the SCAL fixes
2024-07-26 16:57:35 +02:00
Martin Kroeker
4460d3ee7f re-enable the sgesdd benchmark 2024-07-26 15:07:52 +02:00
Martin Kroeker
092986582f Merge pull request #4818 from martin-frbg/docs_winbuild
[Docs] replace "Preview" in the MSVC vcvarsall path example
2024-07-26 14:57:53 +02:00
Martin Kroeker
25e148ec58 Merge pull request #4817 from martin-frbg/fix4807
Fix SCAL on x86 and RISCV_GENERIC
2024-07-26 14:56:44 +02:00
Martin Kroeker
a090011fbf just use numeric constants in dimensions 2024-07-26 12:56:12 +02:00
Martin Kroeker
7006492863 replace "Preview" in the MSVC vcvarsall path with "Community" 2024-07-26 12:49:57 +02:00
Martin Kroeker
db5328e85b make array dimensions constant 2024-07-26 12:45:39 +02:00
Martin Kroeker
d9ae4609fb remove C99 requirement 2024-07-26 11:15:33 +02:00
Martin Kroeker
a875304eb0 fix inverted conditional for NAN handling 2024-07-26 09:50:20 +02:00
Martin Kroeker
24acdd6bbb correct offset 2024-07-26 09:49:24 +02:00
Martin Kroeker
fb7c53c5e5 Merge pull request #4807 from martin-frbg/scalfixes
[WIP]Make NAN handling in the SCAL kernels depend on the dummy2 parameter
2024-07-25 23:42:50 +02:00
Martin Kroeker
15c53dd2e0 Merge pull request #4794 from XiWeiGu/Fixed_Numpy_CI_Test
Try to fix numpy CI test failures
2024-07-25 23:42:13 +02:00
Martin Kroeker
a4e56e0452 Merge pull request #4806 from Mousius/small-gemm
Small GEMM for AArch64 with SVE
2024-07-25 21:50:04 +02:00
Martin Kroeker
949a7f9393 Merge pull request #4811 from yamazakimitsufumi/add_a64fx_to_dynamic_arch
Add A64FX to the list of CPUs supported by DYNAMIC_ARCH
2024-07-25 19:13:04 +02:00
yamazaki-mitsufumi
88caf02f62 Fix ambiguous error on Mac OS 2024-07-25 22:43:13 +09:00
Martin Kroeker
b613754143 Update scal..c 2024-07-24 14:31:29 +02:00
Martin Kroeker
4140ac45d7 Merge pull request #4813 from martin-frbg/issue4812
Fix incompatible definitions of MAXLOC in f2c-converted LAPACK sources
2024-07-23 21:35:06 +02:00
Martin Kroeker
0096482f03 fix incompatible definitions of MAXLOC 2024-07-23 15:01:26 +02:00
Martin Kroeker
ed82fd24fc Merge pull request #4810 from martin-frbg/issue4805
Work around a gcc14.1 bug that breaks utest on Loongarch
2024-07-23 14:54:55 +02:00
yamazaki-mitsufumi
821ef34635 Add A64FX to the list of CPUs supported by DYNAMIC_ARCH 2024-07-23 20:44:39 +09:00
Martin Kroeker
29f3e759b9 work around a gcc14.1 bug observed on Loongarch 2024-07-23 11:20:48 +02:00
Martin Kroeker
f5d04318e3 Merge branch 'OpenMathLib:develop' into scalfixes 2024-07-21 13:43:43 +02:00
Martin Kroeker
73f8866ffb make NAN handling depend on DUMMY2 parameter 2024-07-21 13:42:47 +02:00
Martin Kroeker
dfbc2348a8 fix NAN handling 2024-07-20 18:27:15 +02:00
Martin Kroeker
c064319ecb fix alpha=NAN case 2024-07-20 17:42:31 +02:00
Martin Kroeker
c2ffd90e8c make NAN handling depend on dummy2 parameter 2024-07-20 17:31:00 +02:00
Chris Sidebottom
ea4ab3b310 Better header guard around bridge 2024-07-20 14:39:57 +01:00
Chris Sidebottom
7311d93016 Unroll TT further 2024-07-19 17:51:20 +01:00
Martin Kroeker
a815594fd1 Merge pull request #4801 from markdryan/markdryan/riscv-dynamic-arch
Add autodetection for riscv64
2024-07-19 17:12:07 +02:00
Martin Kroeker
dd6c33d34d make NAN handling depend on dummy2 parameter 2024-07-19 16:14:55 +02:00
Martin Kroeker
5a845ef1f4 Merge pull request #4809 from penghongbo/reorder_gemm_gemvt
Change computational order in GEMV and GEMM Power6 kernel
2024-07-19 13:06:42 +02:00
Hong Bo Peng
db98f8753f Try to fix LAPACK testing failures on P7.
1. Remove the FADD insn from the GEMV Transpose code.
2. Remove the FADD insn from GEMM and ZGEMM code.
3. Reorder the computation of the Imaginary part in ZGEMM code.
2024-07-19 02:08:19 -04:00
Chris Sidebottom
a9edddb695 Unroll TN further 2024-07-18 20:04:15 +01:00
Chris Sidebottom
9984c5ce9d Clean up k2 removal more and unroll SGEMM more 2024-07-18 18:35:43 +01:00
Chris Sidebottom
b1c9fafabb Remove k2 loop from DGEMM TN and use a more conservative heuristic for SGEMM 2024-07-18 17:37:18 +01:00
Martin Kroeker
2020569705 fix NAN handling and make it depend on dummy2 parameter 2024-07-17 23:55:54 +02:00
Martin Kroeker
3870995f01 make NAN handling depend on dummy2 parameter 2024-07-17 23:54:24 +02:00
Martin Kroeker
7284c533b5 make NAN handling depend on dummy2 parameter 2024-07-17 23:50:40 +02:00
Martin Kroeker
73751218a4 make NAN handling depend on dummy2 parameter 2024-07-17 23:41:26 +02:00
Martin Kroeker
b9bfc8ce09 make NAN handling depend on dummy2 parameter 2024-07-17 23:29:50 +02:00
Martin Kroeker
eb4879e04c make NAN handling depend on the dummy2 parameter 2024-07-17 23:24:19 +02:00
Martin Kroeker
ee87cb90d0 Merge pull request #4803 from iha-taisei/SVESupportSDGEMV
A64FX: Add support for SVE to SGEMV/DGEMV kernels.
2024-07-17 23:14:21 +02:00
gxw
34b80ce03f mips64: Fixed numpy CI failure 2024-07-17 10:32:22 +08:00