Commit Graph

8796 Commits

Author SHA1 Message Date
Martin Kroeker
6452f7b46d Merge pull request #4873 from ChipKerchner/fixSBGEMMDefaults
[POWER] Problem with multi-threaded SBGEMM
2024-08-14 19:22:03 +02:00
Chip Kerchner
75472b830a Merge branch 'develop' into betterPowerGEMVTail 2024-08-14 10:52:46 -05:00
Martin Kroeker
ca7777de18 Merge pull request #4870 from chenx97/fix-recursive-make-var
Fix recursive variable expansion in Makefiles for LOONGSON3A
2024-08-14 16:03:50 +02:00
Martin Kroeker
f6469e21bc move gelqs and geqrs to lapack-deprecated 2024-08-14 16:00:43 +02:00
Chip Kerchner
31226740d6 Cleanup of SBGEMM unit test. 2024-08-14 08:10:25 -05:00
Henry Chen
ef94b96530 Use ldc1 and sdc1 for the prologue and epilogue on LOONGSON3A
This fix is similar to 2d8064174c.
2024-08-14 18:05:11 +08:00
Martin Kroeker
23b5d66a86 Ensure a memory buffer has been allocated for each thread before invoking it 2024-08-14 10:35:44 +02:00
Henry Chen
20bdb65882 Fix recursive variable expansion in Makefiles for LOONGSON3A 2024-08-14 15:08:32 +08:00
Chip Kerchner
b1737698db Fix DEFAULTS in SBGEMM for POWER10. Also comparisons for SBGEMM unit test can be exactly due to epsilon differences. 2024-08-13 07:01:21 -05:00
Martin Kroeker
e5525036e7 Merge pull request #4865 from martin-frbg/issue4856
Tweak LAPACK STFSM test threshold a little more to cover POWER10 fma
2024-08-13 07:20:06 +02:00
Martin Kroeker
fd52d09490 Merge pull request #4864 from martin-frbg/issue4862
Spell out function prototypes in the SYRK calls of potrf_parallel
2024-08-13 00:16:45 +02:00
Martin Kroeker
35dd625adf Merge pull request #4859 from martin-frbg/cooper_sb
Address clang array overflow warning in the SBGEMV microkernel for Cooper Lake
2024-08-12 22:05:43 +02:00
Martin Kroeker
d8f740791a tweak threshold a little more to cover POWER10 fma 2024-08-12 14:50:49 +02:00
Martin Kroeker
73e13b0273 flesh out HERK prototype 2024-08-12 14:45:40 +02:00
Martin Kroeker
824306baab flesh out HERK prototype 2024-08-12 14:44:13 +02:00
Martin Kroeker
7ca835a82c address clang array overflow warning 2024-08-10 13:44:56 +02:00
Martin Kroeker
a87c4d26dd Merge pull request #4857 from nekopsykose/ppc
fix cmake typo for power10 cc version check
2024-08-10 00:15:28 +02:00
psykose
1265eee85c fix cmake typo for power10 cc version check
fixes 668f48f4fc
2024-08-09 20:38:58 +02:00
Martin Kroeker
cd3945b998 Update version to 0.3.28.dev 2024-08-08 23:09:45 +02:00
Martin Kroeker
cbd321aecb Update version to 0.3.28.dev 2024-08-08 23:08:52 +02:00
Martin Kroeker
cb38d666da Merge pull request #4855 from OpenMathLib/release-0.3.0
Merge release branch back into develop to copy tag
2024-08-08 23:08:07 +02:00
Martin Kroeker
5ef8b19646 Merge pull request #4854 from OpenMathLib/develop
merge develop in preparation of the 0.3.28 release
v0.3.28
2024-08-08 22:41:46 +02:00
Martin Kroeker
884a949a0d Merge branch 'release-0.3.0' into develop 2024-08-08 22:41:26 +02:00
Martin Kroeker
116bc767d8 Update version to 0.3.28 2024-08-08 22:23:02 +02:00
Martin Kroeker
91d6722a3d Update version to 0.3.28 2024-08-08 22:22:24 +02:00
Martin Kroeker
2c8e001efe Merge pull request #4853 from martin-frbg/changelog0328
Update Changelog.txt for 0.3.28
2024-08-08 21:14:40 +02:00
Martin Kroeker
1c2bfea1bb Merge pull request #4852 from martin-frbg/fix4814
Disable forwarding from SBGEMM to SBGEMV for now
2024-08-08 19:16:48 +02:00
Martin Kroeker
1df95bb23a Update Changelog.txt for 0.3.28 2024-08-08 18:51:25 +02:00
Martin Kroeker
7878976236 disable forwarding from SBGEMM to SBGEMV for now 2024-08-08 18:03:38 +02:00
Martin Kroeker
d92cc96978 Merge pull request #4851 from martin-frbg/test3m
Fix invocation of GEMM3M tests in gmake builds
2024-08-08 00:07:17 +02:00
Martin Kroeker
76db713e79 fix invocation of GEMM3M tests 2024-08-07 21:37:20 +02:00
Martin Kroeker
deae7cf1ec Merge pull request #4850 from martin-frbg/generic_3m
Make the dummy GEMM3M kernel for GENERIC targets forward to regular GEMM for now
2024-08-07 21:35:38 +02:00
Martin Kroeker
46e331a917 remove the unworkable GEMM3M restriction from GENERIC again 2024-08-07 19:41:10 +02:00
Martin Kroeker
ccc23338d7 have the dummy GEMM3M kernel at least forward to regular GEMM 2024-08-07 19:39:02 +02:00
Harmen Stoppels
fe0a69e308 even less invasive 2024-08-07 16:43:45 +02:00
Harmen Stoppels
f49371c1ba Set CMake 3.0 policies to NEW 2024-08-07 16:40:11 +02:00
Harmen Stoppels
1ef9f24b39 Revert "require consistent minimal cmake version"
This reverts commit 5b07ec643c.
2024-08-07 16:37:02 +02:00
Martin Kroeker
753c7ebe17 Merge pull request #4835 from martin-frbg/revertwin4359
Temporarily revert to the coarse-grained locking in the Windows thread server
2024-08-07 14:09:32 +02:00
Harmen Stoppels
5b07ec643c require consistent minimal cmake version 2024-08-07 09:43:47 +02:00
Martin Kroeker
3b8d7dfdca Merge pull request #4846 from martin-frbg/lapack1025
Make the type used for the "hidden" string length argument configurable (adapted from Reference-LAPACK PR 1025)
2024-08-07 00:04:37 +02:00
Martin Kroeker
797ae08dbe Add explanation of LAPACK_STRLEN 2024-08-06 21:38:00 +02:00
Martin Kroeker
923b79de47 make the type of the hidden arguments configurable via LAPACK_STRLEN (Reference-LAPACK PR 1025) 2024-08-06 17:55:14 +02:00
Martin Kroeker
cc36db643e Support new LAPACK build option LAPACK_STRLEN 2024-08-06 17:31:03 +02:00
Martin Kroeker
7e8118d94e Support new build option LAPACK_STRLEN 2024-08-06 17:30:17 +02:00
Martin Kroeker
5bdd3a05f0 Merge pull request #4841 from martin-frbg/lapack1033
Prevent compilers from using FMA that could increase error in ?GEEVX (Reference-LAPACK PR 1033)
2024-08-05 23:50:40 +02:00
Martin Kroeker
ae9e0e36c3 Merge pull request #4842 from martin-frbg/lapack1030
Fix typos and sytrd boundary workspace (Reference-LAPACK PR 1030)
2024-08-05 22:23:44 +02:00
Martin Kroeker
bce48d4a13 Fix typos and sytrd boundary workspace (Reference-LAPACK PR 1030) 2024-08-05 17:37:07 +02:00
Martin Kroeker
c8b4ceca85 prevent compilers from using FMA (Reference-LAPACK PR 1033) 2024-08-05 16:45:05 +02:00
Martin Kroeker
14a8a9a43c Merge pull request #4840 from martin-frbg/issue4823
set MACOSX_RPATH to true on Apple
2024-08-05 15:35:25 +02:00
Martin Kroeker
a4845fa12d set MACOSX_RPATH to true on Apple 2024-08-04 23:38:22 +02:00