Martin Kroeker
9afd0c8afd
Merge pull request #4814 from Mousius/gemv-proxy
...
Forward GEMM to GEMV when one argument is actually a vector
2024-07-31 23:18:01 +02:00
Martin Kroeker
edbf093c98
Update zarch SCAL kernels to handle INF and NAN arguments (#4829)
...
* handle INF and NAN in input (for S/D only if DUMMY2 argument is set)
2024-07-31 19:45:15 +02:00
Chris Sidebottom
ba2e989c67
Add accumulators to AArch64 GEMV Kernels
...
This helps to reduce values going missing as we accumulate.
2024-07-31 13:09:14 +01:00
Chris Sidebottom
b26424c6a2
Allow opt into GEMM -> GEMV forwarding
2024-07-31 13:09:14 +01:00
Chris Sidebottom
90eb863d4b
Re-add accidental removal
2024-07-31 13:09:14 +01:00
Chris Sidebottom
28b5334f22
Complete implementation of GEMV forwarding
2024-07-31 13:09:14 +01:00
Martin Kroeker
3db5dbc88e
forward to GEMV when one argument is actually a vector
2024-07-31 13:09:14 +01:00
Martin Kroeker
136a4edc5f
Merge pull request #4830 from martin-frbg/jenk
...
Continue requesting ubuntu18 instead of latest on OSUOSL powerCI
2024-07-30 22:19:14 +02:00
Martin Kroeker
86c15f028b
Update Jenkinsfile.pwr
2024-07-30 21:21:34 +02:00
Martin Kroeker
a13015b656
try requesting ubuntu22 instead of latest
2024-07-30 19:10:18 +02:00
Martin Kroeker
d11e734002
Merge pull request #4827 from Mousius/a64fx-gcc11
...
Fix GCC11 check for A64FX target
2024-07-29 16:36:13 +02:00
Chris Sidebottom
54ce33e851
Fix GCC11 check for A64FX target
2024-07-29 15:28:59 +01:00
Martin Kroeker
6d071f1a1c
Merge pull request #4826 from Mousius/a64fx-fallback
...
Add fallback compile options for A64FX target
2024-07-29 15:33:43 +02:00
Chris Sidebottom
3ed226d3f8
Re-add ISCLANG filter
2024-07-29 11:32:59 +01:00
Chris Sidebottom
85ca003ae7
Add fallback compile options for A64FX target
2024-07-29 11:25:03 +01:00
Martin Kroeker
05bf35f296
Merge pull request #4822 from martin-frbg/issue4821
...
Fix c_check to tolerate a dashed suffix to the gcc version number
2024-07-27 20:28:06 +02:00
Martin Kroeker
175008caf8
harden against a dashed suffix to the gcc version number
2024-07-27 19:08:02 +02:00
Martin Kroeker
886acfc444
Merge pull request #4819 from martin-frbg/issue4776
...
Re-enable the SGESDD benchmark after the SCAL fixes
2024-07-26 16:57:35 +02:00
Martin Kroeker
4460d3ee7f
re-enable the sgesdd benchmark
2024-07-26 15:07:52 +02:00
Martin Kroeker
092986582f
Merge pull request #4818 from martin-frbg/docs_winbuild
...
[Docs] replace "Preview" in the MSVC vcvarsall path example
2024-07-26 14:57:53 +02:00
Martin Kroeker
25e148ec58
Merge pull request #4817 from martin-frbg/fix4807
...
Fix SCAL on x86 and RISCV_GENERIC
2024-07-26 14:56:44 +02:00
Martin Kroeker
a090011fbf
just use numeric constants in dimensions
2024-07-26 12:56:12 +02:00
Martin Kroeker
7006492863
replace "Preview" in the MSVC vcvarsall path with "Community"
2024-07-26 12:49:57 +02:00
Martin Kroeker
db5328e85b
make array dimensions constant
2024-07-26 12:45:39 +02:00
Martin Kroeker
d9ae4609fb
remove C99 requirement
2024-07-26 11:15:33 +02:00
Martin Kroeker
a875304eb0
fix inverted conditional for NAN handling
2024-07-26 09:50:20 +02:00
Martin Kroeker
24acdd6bbb
correct offset
2024-07-26 09:49:24 +02:00
Martin Kroeker
fb7c53c5e5
Merge pull request #4807 from martin-frbg/scalfixes
...
[WIP] Make NAN handling in the SCAL kernels depend on the dummy2 parameter
2024-07-25 23:42:50 +02:00
Martin Kroeker
15c53dd2e0
Merge pull request #4794 from XiWeiGu/Fixed_Numpy_CI_Test
...
Try to fix numpy CI test failures
2024-07-25 23:42:13 +02:00
Martin Kroeker
a4e56e0452
Merge pull request #4806 from Mousius/small-gemm
...
Small GEMM for AArch64 with SVE
2024-07-25 21:50:04 +02:00
Martin Kroeker
949a7f9393
Merge pull request #4811 from yamazakimitsufumi/add_a64fx_to_dynamic_arch
...
Add A64FX to the list of CPUs supported by DYNAMIC_ARCH
2024-07-25 19:13:04 +02:00
yamazaki-mitsufumi
88caf02f62
Fix ambiguous error on Mac OS
2024-07-25 22:43:13 +09:00
Martin Kroeker
b613754143
Update scal..c
2024-07-24 14:31:29 +02:00
Martin Kroeker
4140ac45d7
Merge pull request #4813 from martin-frbg/issue4812
...
Fix incompatible definitions of MAXLOC in f2c-converted LAPACK sources
2024-07-23 21:35:06 +02:00
Martin Kroeker
0096482f03
fix incompatible definitions of MAXLOC
2024-07-23 15:01:26 +02:00
Martin Kroeker
ed82fd24fc
Merge pull request #4810 from martin-frbg/issue4805
...
Work around a gcc14.1 bug that breaks utest on Loongarch
2024-07-23 14:54:55 +02:00
yamazaki-mitsufumi
821ef34635
Add A64FX to the list of CPUs supported by DYNAMIC_ARCH
2024-07-23 20:44:39 +09:00
Martin Kroeker
29f3e759b9
work around a gcc14.1 bug observed on Loongarch
2024-07-23 11:20:48 +02:00
Martin Kroeker
f5d04318e3
Merge branch 'OpenMathLib:develop' into scalfixes
2024-07-21 13:43:43 +02:00
Martin Kroeker
73f8866ffb
make NAN handling depend on DUMMY2 parameter
2024-07-21 13:42:47 +02:00
Martin Kroeker
dfbc2348a8
fix NAN handling
2024-07-20 18:27:15 +02:00
Martin Kroeker
c064319ecb
fix alpha=NAN case
2024-07-20 17:42:31 +02:00
Martin Kroeker
c2ffd90e8c
make NAN handling depend on dummy2 parameter
2024-07-20 17:31:00 +02:00
Chris Sidebottom
ea4ab3b310
Better header guard around bridge
2024-07-20 14:39:57 +01:00
Chris Sidebottom
7311d93016
Unroll TT further
2024-07-19 17:51:20 +01:00
Martin Kroeker
a815594fd1
Merge pull request #4801 from markdryan/markdryan/riscv-dynamic-arch
...
Add autodetection for riscv64
2024-07-19 17:12:07 +02:00
Martin Kroeker
dd6c33d34d
make NAN handling depend on dummy2 parameter
2024-07-19 16:14:55 +02:00
Martin Kroeker
5a845ef1f4
Merge pull request #4809 from penghongbo/reorder_gemm_gemvt
...
Change computational order in GEMV and GEMM Power6 kernel
2024-07-19 13:06:42 +02:00
Hong Bo Peng
db98f8753f
Try to fix LAPACK testing failures on P7.
...
1. Remove the FADD insn from the GEMV Transpose code.
2. Remove the FADD insn from GEMM and ZGEMM code.
3. Reorder the computation of the Imaginary part in ZGEMM code.
2024-07-19 02:08:19 -04:00
Chris Sidebottom
a9edddb695
Unroll TN further
2024-07-18 20:04:15 +01:00