Martin Kroeker
|
7129a64d87
|
Merge pull request #4881 from martin-frbg/issue4805-2
Use fld.d/fst.d in PROLOGUE/EPILOGUE in LOONGSON3R5 GEMM
|
2024-08-16 08:47:12 +02:00 |
Martin Kroeker
|
49080b631e
|
remove optimizer pragma again
|
2024-08-15 22:15:27 +02:00 |
Martin Kroeker
|
e05d98d00a
|
expressly use fld.d/fst.d for floating point registers instead of LD/ST macros
|
2024-08-15 22:14:29 +02:00 |
Martin Kroeker
|
3ee9e9d8d0
|
Merge pull request #4879 from martin-frbg/issue4868-2
Ensure a memory buffer has been allocated for each thread before invoking it (take 2)
|
2024-08-15 22:06:54 +02:00 |
Martin Kroeker
|
dd71df8fab
|
Merge pull request #4880 from ChipKerchner/betterPowerGEMVTail
[POWER] Vectorize SGEMV transpose reduce stage
|
2024-08-15 20:36:22 +02:00 |
Martin Kroeker
|
a8d6b0219a
|
Merge pull request #4877 from XiWeiGu/fixed_undefined_blas_set_parameter
Fixed the undefined reference to blas_set_parameter
|
2024-08-15 15:35:26 +02:00 |
Martin Kroeker
|
d24b3cf393
|
properly fix buffer allocation and assignment
|
2024-08-15 15:32:58 +02:00 |
Chip Kerchner
|
a0aeba631d
|
Merge branch 'develop' into betterPowerGEMVTail
|
2024-08-15 08:00:00 -05:00 |
Martin Kroeker
|
eba8615c11
|
Merge pull request #4876 from martin-frbg/granite
Add autodetection support for Intel Granite Rapids as Sapphire Rapids
|
2024-08-15 13:50:54 +02:00 |
Martin Kroeker
|
bc80e7f02d
|
Merge pull request #4878 from martin-frbg/cirrus-androidndk
Cirrus CI: fix installation of NDK in armv7 crossbuild
|
2024-08-15 13:50:09 +02:00 |
Martin Kroeker
|
94c9e0b7ad
|
Update ndk version number
|
2024-08-15 11:30:23 +02:00 |
Martin Kroeker
|
ed0321563a
|
fix installation of NDK in armv7 crossbuild
|
2024-08-15 11:11:07 +02:00 |
gxw
|
fd033467ac
|
Fixed the undefined reference to blas_set_parameter
Fixed the undefined reference to blas_set_parameter when
enabling USE_OPENMP and DYNAMIC_ARCH.
|
2024-08-15 16:48:48 +08:00 |
Martin Kroeker
|
1b8e40874e
|
Add autodetection support for Intel Granite Rapids as Sapphire Rapids
|
2024-08-15 09:33:42 +02:00 |
Martin Kroeker
|
4944148e66
|
Merge pull request #4875 from ChipKerchner/addGEMVtoBF16Test
Add GEMV to SBGEMx vs SGEMx testing
|
2024-08-15 09:32:11 +02:00 |
Martin Kroeker
|
a388c4b834
|
Merge pull request #4872 from chenx97/ls3a-fix-stack-fpr-len
Use ldc1 and sdc1 for the prologue and epilogue on LOONGSON3A
|
2024-08-15 00:10:16 +02:00 |
Martin Kroeker
|
f24b521709
|
Merge pull request #4787 from vlad0x00/patch-1
Update cross compile info
|
2024-08-15 00:09:53 +02:00 |
Vladimir Nikolić
|
2d84ed7e76
|
Update README.md
|
2024-08-14 14:31:35 -07:00 |
Chip Kerchner
|
083faf7556
|
Merge branch 'develop' into betterPowerGEMVTail
|
2024-08-14 15:56:03 -05:00 |
Chip Kerchner
|
c23897f585
|
Add GEMV testing to SBGEMx vs SGEMx testing.
|
2024-08-14 15:55:23 -05:00 |
Martin Kroeker
|
0d8ee96f1e
|
Merge pull request #4874 from martin-frbg/issue4869
Fix handling of deprecated ?GELQS/?GEQRS in building the shared library
|
2024-08-14 22:49:12 +02:00 |
Martin Kroeker
|
b80671d896
|
Merge pull request #4871 from martin-frbg/issue4868
Ensure a buffer has been allocated for each thread before invoking it
|
2024-08-14 20:53:39 +02:00 |
Martin Kroeker
|
6452f7b46d
|
Merge pull request #4873 from ChipKerchner/fixSBGEMMDefaults
[POWER] Problem with multi-threaded SBGEMM
|
2024-08-14 19:22:03 +02:00 |
Chip Kerchner
|
75472b830a
|
Merge branch 'develop' into betterPowerGEMVTail
|
2024-08-14 10:52:46 -05:00 |
Martin Kroeker
|
ca7777de18
|
Merge pull request #4870 from chenx97/fix-recursive-make-var
Fix recursive variable expansion in Makefiles for LOONGSON3A
|
2024-08-14 16:03:50 +02:00 |
Martin Kroeker
|
f6469e21bc
|
move gelqs and geqrs to lapack-deprecated
|
2024-08-14 16:00:43 +02:00 |
Chip Kerchner
|
31226740d6
|
Cleanup of SBGEMM unit test.
|
2024-08-14 08:10:25 -05:00 |
Henry Chen
|
ef94b96530
|
Use ldc1 and sdc1 for the prologue and epilogue on LOONGSON3A
This fix is similar to
2d8064174c.
|
2024-08-14 18:05:11 +08:00 |
Martin Kroeker
|
23b5d66a86
|
Ensure a memory buffer has been allocated for each thread before invoking it
|
2024-08-14 10:35:44 +02:00 |
Henry Chen
|
20bdb65882
|
Fix recursive variable expansion in Makefiles for LOONGSON3A
|
2024-08-14 15:08:32 +08:00 |
Chip Kerchner
|
b1737698db
|
Fix DEFAULTS in SBGEMM for POWER10. Also comparisons for SBGEMM unit test can be exactly due to epsilon differences.
|
2024-08-13 07:01:21 -05:00 |
Martin Kroeker
|
e5525036e7
|
Merge pull request #4865 from martin-frbg/issue4856
Tweak LAPACK STFSM test threshold a little more to cover POWER10 fma
|
2024-08-13 07:20:06 +02:00 |
Martin Kroeker
|
fd52d09490
|
Merge pull request #4864 from martin-frbg/issue4862
Spell out function prototypes in the SYRK calls of potrf_parallel
|
2024-08-13 00:16:45 +02:00 |
Martin Kroeker
|
35dd625adf
|
Merge pull request #4859 from martin-frbg/cooper_sb
Address clang array overflow warning in the SBGEMV microkernel for Cooper Lake
|
2024-08-12 22:05:43 +02:00 |
Martin Kroeker
|
d8f740791a
|
tweak threshold a little more to cover POWER10 fma
|
2024-08-12 14:50:49 +02:00 |
Martin Kroeker
|
73e13b0273
|
flesh out HERK prototype
|
2024-08-12 14:45:40 +02:00 |
Martin Kroeker
|
824306baab
|
flesh out HERK prototype
|
2024-08-12 14:44:13 +02:00 |
Martin Kroeker
|
7ca835a82c
|
address clang array overflow warning
|
2024-08-10 13:44:56 +02:00 |
Martin Kroeker
|
a87c4d26dd
|
Merge pull request #4857 from nekopsykose/ppc
fix cmake typo for power10 cc version check
|
2024-08-10 00:15:28 +02:00 |
psykose
|
1265eee85c
|
fix cmake typo for power10 cc version check
fixes 668f48f4fc
|
2024-08-09 20:38:58 +02:00 |
Martin Kroeker
|
cd3945b998
|
Update version to 0.3.28.dev
|
2024-08-08 23:09:45 +02:00 |
Martin Kroeker
|
cbd321aecb
|
Update version to 0.3.28.dev
|
2024-08-08 23:08:52 +02:00 |
Martin Kroeker
|
cb38d666da
|
Merge pull request #4855 from OpenMathLib/release-0.3.0
Merge release branch back into develop to copy tag
|
2024-08-08 23:08:07 +02:00 |
Martin Kroeker
|
5ef8b19646
|
Merge pull request #4854 from OpenMathLib/develop
merge develop in preparation of the 0.3.28 release
|
2024-08-08 22:41:46 +02:00 |
Martin Kroeker
|
884a949a0d
|
Merge branch 'release-0.3.0' into develop
|
2024-08-08 22:41:26 +02:00 |
Martin Kroeker
|
116bc767d8
|
Update version to 0.3.28
|
2024-08-08 22:23:02 +02:00 |
Martin Kroeker
|
91d6722a3d
|
Update version to 0.3.28
|
2024-08-08 22:22:24 +02:00 |
Martin Kroeker
|
2c8e001efe
|
Merge pull request #4853 from martin-frbg/changelog0328
Update Changelog.txt for 0.3.28
|
2024-08-08 21:14:40 +02:00 |
Martin Kroeker
|
1c2bfea1bb
|
Merge pull request #4852 from martin-frbg/fix4814
Disable forwarding from SBGEMM to SBGEMV for now
|
2024-08-08 19:16:48 +02:00 |
Martin Kroeker
|
1df95bb23a
|
Update Changelog.txt for 0.3.28
|
2024-08-08 18:51:25 +02:00 |