8576 Commits

Author SHA1 Message Date
Martin Kroeker
6468dc1142 restore the coarse locking of the pre-4359 version 2024-08-02 16:39:47 +02:00
Martin Kroeker
abff4baa4d re-enable queue struct members related to locking 2024-08-02 16:37:01 +02:00
Martin Kroeker
9afd0c8afd Merge pull request #4814 from Mousius/gemv-proxy
Forward GEMM to GEMV when one argument is actually a vector
2024-07-31 23:18:01 +02:00
Martin Kroeker
edbf093c98 Update zarch SCAL kernels to handle INF and NAN arguments (#4829)
* handle INF and NAN in input (for S/D only if DUMMY2 argument is set)
2024-07-31 19:45:15 +02:00
Chris Sidebottom
ba2e989c67 Add accumulators to AArch64 GEMV Kernels
This helps to reduce values going missing as we accumulate.
2024-07-31 13:09:14 +01:00
Chris Sidebottom
b26424c6a2 Allow opt into GEMM -> GEMV forwarding 2024-07-31 13:09:14 +01:00
Chris Sidebottom
90eb863d4b Re-add accidental removal 2024-07-31 13:09:14 +01:00
Chris Sidebottom
28b5334f22 Complete implementation of GEMV forwarding 2024-07-31 13:09:14 +01:00
Martin Kroeker
3db5dbc88e forward to GEMV when one argument is actually a vector 2024-07-31 13:09:14 +01:00
Martin Kroeker
136a4edc5f Merge pull request #4830 from martin-frbg/jenk
Continue requesting ubuntu18 instead of latest on OSUOSL powerCI
2024-07-30 22:19:14 +02:00
Martin Kroeker
86c15f028b Update Jenkinsfile.pwr 2024-07-30 21:21:34 +02:00
Martin Kroeker
a13015b656 try requesting ubuntu22 instead of latest 2024-07-30 19:10:18 +02:00
Martin Kroeker
d11e734002 Merge pull request #4827 from Mousius/a64fx-gcc11
Fix GCC11 check for A64FX target
2024-07-29 16:36:13 +02:00
Chris Sidebottom
54ce33e851 Fix GCC11 check for A64FX target 2024-07-29 15:28:59 +01:00
Martin Kroeker
6d071f1a1c Merge pull request #4826 from Mousius/a64fx-fallback
Add fallback compile options for A64FX target
2024-07-29 15:33:43 +02:00
Chris Sidebottom
3ed226d3f8 Re-add ISCLANG filter 2024-07-29 11:32:59 +01:00
Chris Sidebottom
85ca003ae7 Add fallback compile options for A64FX target 2024-07-29 11:25:03 +01:00
Martin Kroeker
05bf35f296 Merge pull request #4822 from martin-frbg/issue4821
Fix c_check to tolerate a dashed suffix to the gcc version number
2024-07-27 20:28:06 +02:00
Martin Kroeker
175008caf8 harden against a dashed suffix to the gcc version number 2024-07-27 19:08:02 +02:00
Martin Kroeker
886acfc444 Merge pull request #4819 from martin-frbg/issue4776
Re-enable the SGESDD benchmark after the SCAL fixes
2024-07-26 16:57:35 +02:00
Martin Kroeker
4460d3ee7f re-enable the sgesdd benchmark 2024-07-26 15:07:52 +02:00
Martin Kroeker
092986582f Merge pull request #4818 from martin-frbg/docs_winbuild
[Docs] replace "Preview" in the MSVC vcvarsall path example
2024-07-26 14:57:53 +02:00
Martin Kroeker
25e148ec58 Merge pull request #4817 from martin-frbg/fix4807
Fix SCAL on x86 and RISCV_GENERIC
2024-07-26 14:56:44 +02:00
Martin Kroeker
a090011fbf just use numeric constants in dimensions 2024-07-26 12:56:12 +02:00
Martin Kroeker
7006492863 replace "Preview" in the MSVC vcvarsall path with "Community" 2024-07-26 12:49:57 +02:00
Martin Kroeker
db5328e85b make array dimensions constant 2024-07-26 12:45:39 +02:00
Martin Kroeker
d9ae4609fb remove C99 requirement 2024-07-26 11:15:33 +02:00
Martin Kroeker
a875304eb0 fix inverted conditional for NAN handling 2024-07-26 09:50:20 +02:00
Martin Kroeker
24acdd6bbb correct offset 2024-07-26 09:49:24 +02:00
Martin Kroeker
fb7c53c5e5 Merge pull request #4807 from martin-frbg/scalfixes
[WIP]Make NAN handling in the SCAL kernels depend on the dummy2 parameter
2024-07-25 23:42:50 +02:00
Martin Kroeker
15c53dd2e0 Merge pull request #4794 from XiWeiGu/Fixed_Numpy_CI_Test
Try to fixed numpy ci test failures
2024-07-25 23:42:13 +02:00
Martin Kroeker
a4e56e0452 Merge pull request #4806 from Mousius/small-gemm
Small GEMM for AArch64 with SVE
2024-07-25 21:50:04 +02:00
Martin Kroeker
949a7f9393 Merge pull request #4811 from yamazakimitsufumi/add_a64fx_to_dynamic_arch
Add A64FX to the list of CPUs supported by DYNAMIC_ARCH
2024-07-25 19:13:04 +02:00
yamazaki-mitsufumi
88caf02f62 Fix ambiguous error on Mac OS 2024-07-25 22:43:13 +09:00
Martin Kroeker
b613754143 Update scal..c 2024-07-24 14:31:29 +02:00
Martin Kroeker
4140ac45d7 Merge pull request #4813 from martin-frbg/issue4812
Fix incompatible definitions of MAXLOC in f2c-converted LAPACK sources
2024-07-23 21:35:06 +02:00
Martin Kroeker
0096482f03 fix incompatible definitions of MAXLOC 2024-07-23 15:01:26 +02:00
Martin Kroeker
ed82fd24fc Merge pull request #4810 from martin-frbg/issue4805
Work around a gcc14.1 bug that breaks utest on Loongarch
2024-07-23 14:54:55 +02:00
yamazaki-mitsufumi
821ef34635 Add A64FX to the list of CPUs supported by DYNAMIC_ARCH 2024-07-23 20:44:39 +09:00
Martin Kroeker
29f3e759b9 work around a gcc14.1 bug observed on Loongarch 2024-07-23 11:20:48 +02:00
Martin Kroeker
f5d04318e3 Merge branch 'OpenMathLib:develop' into scalfixes 2024-07-21 13:43:43 +02:00
Martin Kroeker
73f8866ffb make NAN handling depend on DUMMY2 parameter 2024-07-21 13:42:47 +02:00
Martin Kroeker
dfbc2348a8 fix NAN handling 2024-07-20 18:27:15 +02:00
Martin Kroeker
c064319ecb fix alpha=NAN case 2024-07-20 17:42:31 +02:00
Martin Kroeker
c2ffd90e8c make NAN handling depend on dummy2 parameter 2024-07-20 17:31:00 +02:00
Chris Sidebottom
ea4ab3b310 Better header guard around bridge 2024-07-20 14:39:57 +01:00
Chris Sidebottom
7311d93016 Unroll TT further 2024-07-19 17:51:20 +01:00
Martin Kroeker
a815594fd1 Merge pull request #4801 from markdryan/markdryan/riscv-dynamic-arch
Add autodetection for riscv64
2024-07-19 17:12:07 +02:00
Martin Kroeker
dd6c33d34d make NAN handling depend on dummy2 parameter 2024-07-19 16:14:55 +02:00
Martin Kroeker
5a845ef1f4 Merge pull request #4809 from penghongbo/reorder_gemm_gemvt
Change computational order in GEMV and GEMM Power6 kernel
2024-07-19 13:06:42 +02:00