8681 Commits

Author SHA1 Message Date
Martin Kroeker
076766df4e Update CMakeLists.txt 2024-05-31 18:23:18 +02:00
Martin Kroeker
8c05765a5a fix other corner cases where x=INF 2024-05-31 18:06:36 +02:00
Martin Kroeker
516743f7dc fix other instances of mishandling INF 2024-05-31 16:02:12 +02:00
Martin Kroeker
9ff4e9714e additional fixes for handling INF arguments 2024-05-31 15:44:07 +02:00
Martin Kroeker
ce130f11d2 Update zscal.c 2024-05-31 15:09:03 +02:00
Martin Kroeker
ab13cfef93 more fixes for infinite x 2024-05-31 14:34:49 +02:00
Martin Kroeker
a16f8249ba add tests with the imaginary part of the array infinite 2024-05-31 01:08:17 +02:00
Martin Kroeker
ad2b5c67c8 fix another corner case involving infinity 2024-05-31 01:06:58 +02:00
Martin Kroeker
0d007adb18 fix clang_cl-flang job to use flang-new after the llvm update 2024-05-30 23:30:16 +02:00
Martin Kroeker
b9a1c9a06c Merge pull request #4725 from Neumann-A/patch-1
Fix CMake warning
2024-05-30 21:32:32 +02:00
Martin Kroeker
ff6670cb83 don't generate non-cblas files for gemm_batch 2024-05-30 18:26:02 +02:00
Alexander Neumann
dd4505c5dd Fix CMake warning 2024-05-30 09:04:23 +02:00
Martin Kroeker
362a063396 remove return value 2024-05-29 23:16:58 +02:00
Martin Kroeker
d0794f88dc add gemm_batch driver 2024-05-29 15:49:20 +02:00
Martin Kroeker
833a8880c6 add cblas_?gemm_batch 2024-05-29 15:47:50 +02:00
Martin Kroeker
89c7bbcba6 add cblas_?gemm_batch 2024-05-29 15:47:02 +02:00
Martin Kroeker
103637887e add cblas_?gemm_batch 2024-05-29 15:46:10 +02:00
Martin Kroeker
0073affe63 Merge pull request #4693 from goplanid/locks-improvement
Lock Management Improvements for Memory Allocation Efficiency
2024-05-26 23:14:52 +02:00
Martin Kroeker
834e633d79 Merge pull request #4718 from martin-frbg/issue4713
Override Intel icx's default fp-model to ensure correct handling on NaNs
2024-05-26 16:38:18 +02:00
Martin Kroeker
3833190454 Merge pull request #4716 from martin-frbg/lapack1018
Fix a potential bounds error in ?UNHR_COL/?ORHR_COL (Reference-LAPACK PR 1018)
2024-05-26 14:01:31 +02:00
Martin Kroeker
cf7e668fe8 Merge pull request #4709 from martin-frbg/docsbuildbranch
Don't try to deploy docs when PR-building in a fork
2024-05-26 14:01:05 +02:00
Martin Kroeker
8b4996a2d5 Override icx's default fast math mode to ensure correct NaN handling 2024-05-26 13:16:03 +02:00
Martin Kroeker
616cc28d82 Override icx's default fast math mode to ensure correct NaN handling 2024-05-26 12:59:11 +02:00
Martin Kroeker
772116879d Merge pull request #4717 from bartoldeman/zscal-float-inf-fix
Replace use of FLT_MAX in x86_64 zscal.c by isinf()
2024-05-26 12:26:41 +02:00
Bart Oldeman
62f7b244ff Replace use of FLT_MAX in x86_64 zscal.c by isinf()
Commit def4996 fixed issues with inf and nan values in zscal,
but used FLT_MAX, where DBL_MAX or isinf() is more appropriate,
as FLT_MAX is for single precision only.
Using FLT_MAX caused test case failures in the LAPACK tests.

isinf() is consistent with the later fix 969601a1
2024-05-24 17:20:27 +00:00
Martin Kroeker
7ebbe3cc72 Fix potential bounds error (Reference-LAPACK PR 1018) 2024-05-23 23:12:19 +02:00
Martin Kroeker
791e015024 Fix potential bounds error (Reference-LAPACK PR 1018) 2024-05-23 23:11:14 +02:00
Martin Kroeker
4dd715d220 Fix potential bounds error (Reference-LAPACK PR 1018) 2024-05-23 23:09:55 +02:00
Martin Kroeker
e2c1a1e269 Fix potential bounds error (Reference-LAPACK PR 1018) 2024-05-23 23:08:27 +02:00
Rajalakshmi Srinivasaraghavan
e112191b54 POWER: Fix issues in zscal to address lapack failures
This patch fixes following lapack failures with clang compiler on POWER.
zed.out: ZVX:   18 out of  5190 tests failed to pass the threshold
zgd.out: ZGV drivers:     25 out of   1092 tests failed to pass the threshold
zgd.out: ZGV drivers:      6 out of   1092 tests failed to pass the threshold
2024-05-22 08:00:06 -05:00
Martin Kroeker
172d91846f Don't try to deploy docs in a fork 2024-05-20 22:53:43 +02:00
Martin Kroeker
700ea74a37 Merge pull request #4705 from martin-frbg/issue4703
Fix INTERFACE64 builds on Loongarch64
2024-05-18 21:38:22 +02:00
Martin Kroeker
aa259b141d Merge pull request #4704 from amritahs-ibm/saxpy_perf_fix
Fix SAXPY regression when compiling with the OpenXL compiler.
2024-05-18 19:11:25 +02:00
Martin Kroeker
25b34e67f9 Merge pull request #4678 from ev-br/codspeed
WIP: add codspeed benchmarks [skip cirrus]
2024-05-18 16:51:02 +02:00
Martin Kroeker
6494f432df Fix INTERFACE64 builds on Loongarch64 2024-05-18 16:49:03 +02:00
Evgeni Burovski
81cf0db047 DOC: add a readme for benchmarks/pybench 2024-05-18 15:30:00 +03:00
Evgeni Burovski
9f28161837 BENCH: add benchmarks using codspeed.io 2024-05-18 15:25:16 +03:00
Martin Kroeker
5015548d18 Merge pull request #4700 from martin-frbg/fix4698
Remove spurious brace in cmake/system.cmake
2024-05-16 15:38:01 +02:00
Matthias Langer
0050a9660b Correctly detect ARM Neoverse V2 CPUs. 2024-05-16 09:59:52 +00:00
Martin Kroeker
ce96e0e50f Merge pull request #4699 from ChipKerchner/fixSwapVectorOrder
POWER: Fixing endianness issue in cswap/zswap kernel for AIX
2024-05-16 09:28:20 +02:00
Martin Kroeker
a3f6b13bc9 remove spurious brace 2024-05-16 09:25:53 +02:00
Chip Kerchner
3a1417671a POWER: Fixing endianness issue in cswap/zswap kernel for AIX 2024-05-15 19:36:46 -05:00
Martin Kroeker
668f48f4fc Use CMAKE_C_COMPILER_VERSION instead of dumpversion calls (#4698)
* Use CMAKE_C_COMPILER_VERSION throughout
2024-05-15 23:58:14 +02:00
Martin Kroeker
39c96063fb Merge pull request #4694 from martin-frbg/issue3660
Add a minimum problem size for multithreading in GBMV
2024-05-15 22:14:41 +02:00
Martin Kroeker
f5c080f083 Fix CMAKE syntax in kernel file parsing of IFNEQ conditionals (#4695)
* Fix syntax in parsing of IFNEQ
2024-05-15 20:58:31 +02:00
Martin Kroeker
9a2a6a2e52 Merge pull request #4696 from frjohnst/restore_second
Revert PRs 4515 and 4520 (restore second, dsecnd)
2024-05-15 18:35:20 +02:00
frjohnst
87026ac1b1 Revert "fix conlict between PR 4515 and AIX shared obj support"
This reverts commit bdaa6705ca.

It turns out that PRs 4515 and 4520 break the tests under
lapack-netlib/TESTING which require SECOND and DSECND. IBM
has decided this is a bigger problem than the conflict
between lapack second_ and the xlf run time.
2024-05-15 09:45:17 -04:00
frjohnst
56d3d1039c Revert "resolve second_ conflict which breaks xlf timef"
This reverts commit 9b24b31419.

It turns out that PRs 4515 and 4520 break the tests under
lapack-netlib/TESTING which require SECOND and DSECND. IBM
has decided this is a bigger problem than the conflict
between lapack second_ and the xlf run time.
2024-05-15 09:44:29 -04:00
Martin Kroeker
2957281275 Introduce a lower limit for multithreading 2024-05-14 18:59:21 +02:00
Martin Kroeker
5fd871d7ea Introduce a lower limit for multithreading 2024-05-14 18:48:03 +02:00