Martin Kroeker
dd6c33d34d
make NAN handling depend on dummy2 parameter
2024-07-19 16:14:55 +02:00
Martin Kroeker
2020569705
fix NAN handling and make it depend on dummy2 parameter
2024-07-17 23:55:54 +02:00
Martin Kroeker
3870995f01
make NAN handling depend on dummy2 parameter
2024-07-17 23:54:24 +02:00
Martin Kroeker
7284c533b5
make NAN handling depend on dummy2 parameter
2024-07-17 23:50:40 +02:00
Martin Kroeker
73751218a4
make NAN handling depend on dummy2 parameter
2024-07-17 23:41:26 +02:00
Martin Kroeker
b9bfc8ce09
make NAN handling depend on dummy2 parameter
2024-07-17 23:29:50 +02:00
Martin Kroeker
eb4879e04c
make NAN handling depend on the dummy2 parameter
2024-07-17 23:24:19 +02:00
Martin Kroeker
ee87cb90d0
Merge pull request #4803 from iha-taisei/SVESupportSDGEMV
A64FX: Add support for SVE to SGEMV/DGEMV kernels.
2024-07-17 23:14:21 +02:00
iha fujitsu
0985fdc82b
A64FX: Add support for SVE to SGEMV/DGEMV kernels.
2024-07-16 17:31:33 +09:00
Mark Ryan
67bf4b6998
Fix axpby_rvv kernels for cases where inc_y = 0
The following openblas_utest tests fail when the RISCV64_ZVL128B target is
enabled.
TEST 89/103 axpby:zaxpby_inc_0 [FAIL]
TEST 92/103 axpby:caxpby_inc_0 [FAIL]
TEST 95/103 axpby:daxpby_inc_0 [FAIL]
TEST 98/103 axpby:saxpby_inc_0 [FAIL]
The issue is that the vectorized kernels do not work when inc_y == 0.
This patch updates the kernels to fall back to the scalar algorithms
when inc_y == 0, fixing the failing tests.
Signed-off-by: Mark Ryan <markdryan@rivosinc.com>
2024-07-15 14:24:47 +00:00
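A minimal sketch of why inc_y == 0 needs a scalar path, assuming the usual axpby definition y = alpha*x + beta*y. The function name axpby_ref is hypothetical and this is not the OpenBLAS kernel: with a zero stride every iteration reads and writes the same element of y, so each step depends on the result of the previous one, and a kernel that loads several y values before storing any of them breaks that dependency.

```c
#include <stddef.h>

/* Hypothetical scalar reference for y = alpha*x + beta*y.
 * Not the OpenBLAS kernel; illustration only.
 * With inc_y == 0, iy never advances, so y[0] is updated n times in
 * sequence and each update sees the value written by the previous one. */
static void axpby_ref(size_t n, double alpha, const double *x, ptrdiff_t inc_x,
                      double beta, double *y, ptrdiff_t inc_y)
{
    ptrdiff_t ix = 0;
    ptrdiff_t iy = 0;

    for (size_t i = 0; i < n; i++) {
        y[iy] = alpha * x[ix] + beta * y[iy];
        ix += inc_x;
        iy += inc_y;
    }
}
```

For x = {1, 2, 3}, y[0] = 1, alpha = 1, beta = 2 and inc_y = 0, the sequential updates give 3, then 8, then 19. A loop that read y[0] three times up front would instead end with 1*3 + 2*1 = 5.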
Martin Kroeker
5d08ec7ff3
Merge pull request #4782 from martin-frbg/azurewincl
Fix NAN handling in ARM/generic SCAL; have AzureCI Windows show errors on failure
2024-07-11 23:55:15 +02:00
Chip Kerchner
cb154832f8
Vectorize SBGEMM incopy - 4x faster.
2024-07-09 13:10:03 -05:00
Martin Kroeker
a5c04e326a
Update scal.c
2024-07-04 22:28:01 +02:00
Martin Kroeker
536200bc9e
fix handling of INF or NAN
2024-07-04 17:47:19 +02:00
Martin Kroeker
3677b3886c
Merge pull request #4702 from bashimao/detect-nv-grace
Correctly detect ARM Neoverse V2 CPUs.
2024-06-30 22:48:48 +02:00
Martin Kroeker
f3c364c2cc
temporarily(?) disable the alpha=0 branch as it fails to handle INF,NAN
2024-06-27 22:18:27 +02:00
Martin Kroeker
2a5fe97e3b
temporarily(?) disable the alpha=0 branch as it does not handle INF,NAN
2024-06-27 16:21:57 +02:00
Martin Kroeker
c1019d5832
Handle INF and NAN in inputs
2024-06-27 10:58:59 +02:00
Martin Kroeker
9e24121e7e
temporarily(?) disable da=0 shortcut to handle x=Inf or NAN
2024-06-23 17:48:18 +02:00
Martin Kroeker
a11f086c17
Update sscal_msa.c
2024-06-23 12:55:19 +02:00
Martin Kroeker
541e1b6959
disable the fast path for inc=1, alpha=0 as it does not handle x=NaN or Inf
2024-06-23 10:37:55 +02:00
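A small sketch of the behaviour these alpha=0 commits are about, assuming IEEE 754 arithmetic (no -ffast-math). The names scal_shortcut and scal_multiply are hypothetical and do not come from the OpenBLAS sources. Under IEEE 754, 0 * NaN and 0 * Inf are both NaN, so a fast path that stores zeros without reading x silently drops NaN and Inf inputs.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical fast path: when alpha == 0, store zeros without reading x.
 * NaN and Inf entries in x are overwritten with 0. */
static void scal_shortcut(size_t n, double alpha, double *x)
{
    for (size_t i = 0; i < n; i++)
        x[i] = (alpha == 0.0) ? 0.0 : alpha * x[i];
}

/* Hypothetical variant without the shortcut: always multiply, so
 * 0 * NaN and 0 * Inf yield NaN as IEEE 754 requires. */
static void scal_multiply(size_t n, double alpha, double *x)
{
    for (size_t i = 0; i < n; i++)
        x[i] = alpha * x[i];
}
```

Calling both with alpha = 0 on {1.0, NAN, INFINITY} leaves three zeros after scal_shortcut, while scal_multiply leaves 0 followed by two NaNs.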
Martin Kroeker
c08113c279
fix special cases of x= NAN or INF
2024-06-23 01:12:33 +02:00
Martin Kroeker
bd47630bcf
exclude the alpha=0 branch as it does not handle NaN or Inf in x
2024-06-23 00:54:39 +02:00
Martin Kroeker
68f2501958
temporarily(?) disable the alpha=0 branch to handle Inf/NaN in x
2024-06-22 21:08:57 +02:00
Martin Kroeker
0a744a939a
temporarily(?) disable the alpha=0 branch to handle NaN/Inf in x
2024-06-22 21:07:43 +02:00
Martin Kroeker
7f8f037a36
handle INF and NAN in input
2024-06-22 16:03:30 +02:00
Martin Kroeker
f1248b849d
handle INF and NAN in input
2024-06-22 15:55:29 +02:00
Martin Kroeker
a2ee4b1966
Merge branch 'OpenMathLib:develop' into issue4728
2024-06-21 09:35:56 +02:00
Martin Kroeker
3ec59922b6
Add a clobber list to fix utest errors seen with gcc13 on Apple M
2024-06-20 16:19:32 +02:00
Martin Kroeker
3d8054fb16
add clobber list
2024-06-14 22:07:44 +02:00
Martin Kroeker
dd7efcf9ef
Avoid exceeding the configured thread count in x86_64 TOBF16 ( #4748 )
* avoid setting nthreads higher than available
2024-06-14 14:21:13 +02:00
Martin Kroeker
6ffaf99817
disable da=0 shortcut to handle NAN and INF correctly
2024-06-07 14:46:58 +02:00
Martin Kroeker
c7cacd9b38
disable the shortcut for da=0 to ensure proper handling of INF and NAN
2024-06-07 13:48:56 +02:00
Martin Kroeker
5ed4f24d6e
Handle corner cases with INF and NAN arguments
2024-06-07 09:39:08 +02:00
Martin Kroeker
2bd43ad0eb
Merge branch 'OpenMathLib:develop' into issue4728
2024-06-07 00:37:25 +02:00
Martin Kroeker
1abafcd9b2
handle corner cases involving NAN and/or INF
2024-06-06 23:59:43 +02:00
Martin Kroeker
442dec28df
Merge pull request #4738 from martin-frbg/issue4737
Disable GEMM3M for generic targets (not implemented)
2024-06-06 17:22:38 +02:00
Martin Kroeker
2787c9f8e4
Disable GEMM3M for generic targets (not implemented)
2024-06-06 14:39:50 +02:00
gxw
af73ae6208
LoongArch: Fixed issue 4728
2024-06-06 16:43:09 +08:00
gxw
8ab2e9ec65
LoongArch: DGEMM small matrix opt
2024-06-04 16:52:45 +08:00
Martin Kroeker
83bc8d5dd8
Merge pull request #4712 from RajalakshmiSR/zscalp10
POWER: Fix issues in zscal to address lapack failures
2024-06-01 11:22:08 +02:00
Martin Kroeker
020b3e1682
fix handling of INF arguments
2024-06-01 00:51:18 +02:00
Martin Kroeker
8c05765a5a
fix other corner cases where x=INF
2024-05-31 18:06:36 +02:00
Martin Kroeker
516743f7dc
fix other instances of mishandling INF
2024-05-31 16:02:12 +02:00
Martin Kroeker
9ff4e9714e
additional fixes for handling INF arguments
2024-05-31 15:44:07 +02:00
Martin Kroeker
ce130f11d2
Update zscal.c
2024-05-31 15:09:03 +02:00
Martin Kroeker
ab13cfef93
more fixes for infinite x
2024-05-31 14:34:49 +02:00
Martin Kroeker
ad2b5c67c8
fix another corner case involving infinity
2024-05-31 01:06:58 +02:00
Bart Oldeman
62f7b244ff
Replace use of FLT_MAX in x86_64 zscal.c by isinf()
Commit def4996 fixed issues with inf and nan values in zscal,
but used FLT_MAX, where DBL_MAX or isinf() is more appropriate,
as FLT_MAX is for single precision only.
Using FLT_MAX caused test case failures in the LAPACK tests.
isinf() is consistent with the later fix 969601a1
2024-05-24 17:20:27 +00:00
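To illustrate the point of this commit message, here is a hedged sketch of the difference between an FLT_MAX threshold and isinf() on double values. The exact comparison used in zscal.c is not shown in this log, so the helper names and the form of the check are assumptions made for illustration only.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical check in the style the commit message criticises:
 * FLT_MAX (about 3.4e38) is the single-precision limit, so any finite
 * double larger than that is wrongly treated as infinite. */
static bool exceeds_flt_max(double v)
{
    return fabs(v) > FLT_MAX;
}

/* isinf() is true only for +Inf and -Inf, whatever the magnitude of
 * finite inputs. */
static bool is_infinite(double v)
{
    return isinf(v) != 0;
}
```

A finite double such as 1e300 passes the FLT_MAX test but is not infinite, which is the kind of misclassification that can surface as threshold failures in double-precision tests.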
Rajalakshmi Srinivasaraghavan
e112191b54
POWER: Fix issues in zscal to address lapack failures
This patch fixes the following LAPACK failures with the clang compiler on POWER.
zed.out: ZVX: 18 out of 5190 tests failed to pass the threshold
zgd.out: ZGV drivers: 25 out of 1092 tests failed to pass the threshold
zgd.out: ZGV drivers: 6 out of 1092 tests failed to pass the threshold
2024-05-22 08:00:06 -05:00