Martin Kroeker | 76db713e79 | fix invocation of GEMM3M tests | 2024-08-07 21:37:20 +02:00
Martin Kroeker | deae7cf1ec | Merge pull request #4850 from martin-frbg/generic_3m: Make the dummy GEMM3M kernel for GENERIC targets forward to regular GEMM for now | 2024-08-07 21:35:38 +02:00
Martin Kroeker | 46e331a917 | remove the unworkable GEMM3M restriction from GENERIC again | 2024-08-07 19:41:10 +02:00
Martin Kroeker | ccc23338d7 | have the dummy GEMM3M kernel at least forward to regular GEMM | 2024-08-07 19:39:02 +02:00
Martin Kroeker | 753c7ebe17 | Merge pull request #4835 from martin-frbg/revertwin4359: Temporarily revert to the coarse-grained locking in the Windows thread server | 2024-08-07 14:09:32 +02:00
Martin Kroeker | 3b8d7dfdca | Merge pull request #4846 from martin-frbg/lapack1025: Make the type used for the "hidden" string length argument configurable (adapted from Reference-LAPACK PR 1025) | 2024-08-07 00:04:37 +02:00
Martin Kroeker | 797ae08dbe | Add explanation of LAPACK_STRLEN | 2024-08-06 21:38:00 +02:00
Martin Kroeker | 923b79de47 | make the type of the hidden arguments configurable via LAPACK_STRLEN (Reference-LAPACK PR 1025) | 2024-08-06 17:55:14 +02:00
Martin Kroeker | cc36db643e | Support new LAPACK build option LAPACK_STRLEN | 2024-08-06 17:31:03 +02:00
Martin Kroeker | 7e8118d94e | Support new build option LAPACK_STRLEN | 2024-08-06 17:30:17 +02:00
Martin Kroeker | 5bdd3a05f0 | Merge pull request #4841 from martin-frbg/lapack1033: Prevent compilers from using FMA that could increase error in ?GEEVX (Reference-LAPACK PR 1033) | 2024-08-05 23:50:40 +02:00
Martin Kroeker | ae9e0e36c3 | Merge pull request #4842 from martin-frbg/lapack1030: Fix typos and sytrd boundary workspace (Reference-LAPACK PR 1030) | 2024-08-05 22:23:44 +02:00
Martin Kroeker | bce48d4a13 | Fix typos and sytrd boundary workspace (Reference-LAPACK PR 1030) | 2024-08-05 17:37:07 +02:00
Martin Kroeker | c8b4ceca85 | prevent compilers from using FMA (Reference-LAPACK PR 1033) | 2024-08-05 16:45:05 +02:00
Martin Kroeker | 14a8a9a43c | Merge pull request #4840 from martin-frbg/issue4823: set MACOSX_RPATH to true on Apple | 2024-08-05 15:35:25 +02:00
Martin Kroeker | a4845fa12d | set MACOSX_RPATH to true on Apple | 2024-08-04 23:38:22 +02:00
Martin Kroeker | 19f8a8d61c | Merge pull request #4839 from martin-frbg/fix4794: Add proper returns in x86_64 s/dscal kernels | 2024-08-04 21:38:58 +02:00
Martin Kroeker | cf483d9f64 | Merge pull request #4836 from martin-frbg/issue4275-3: use TARGET rather than CORE from Makefile.conf_last to fill in pkgconfig | 2024-08-04 12:27:23 +02:00
Martin Kroeker | 50397e017a | Merge pull request #4838 from martin-frbg/fix4662-3: fix invalid ifdef syntax in HUGETLB handling | 2024-08-04 11:32:10 +02:00
Martin Kroeker | ae27b02213 | Merge pull request #4837 from martin-frbg/dyn_riscv_cmake: Add CMAKE support for RISCV64 DYNAMIC_ARCH | 2024-08-04 10:11:40 +02:00
Martin Kroeker | f1c9803f9a | add proper return statement | 2024-08-04 00:14:31 +02:00
Martin Kroeker | 60abcc3991 | add proper return statement | 2024-08-04 00:13:31 +02:00
Martin Kroeker | 5257f807a9 | fix invalid ifdef syntax in HUGETLB handling | 2024-08-04 00:03:17 +02:00
Martin Kroeker | 2aed90171a | Add riscv sources for DYNAMIC_ARCH | 2024-08-03 23:58:10 +02:00
Martin Kroeker | e8bd97ab4b | add RISCV64 entries for DYNAMIC_ARCH | 2024-08-03 23:56:59 +02:00
Martin Kroeker | f40819476c | mention RISCV64 as a permitted architecture for DYNAMIC_ARCH | 2024-08-03 23:54:35 +02:00
Martin Kroeker | 7af3c552d3 | use TARGET rather than CORE from Makefile.conf_last to fill in pkgconfig | 2024-08-03 23:51:57 +02:00
Martin Kroeker | 2c2b6bcf96 | Merge pull request #4831 from martin-frbg/gemmforward: Enable forwarding from GEMM to GEMV for RISCV and PPC in addition to ARM64 | 2024-08-03 18:52:11 +02:00
Martin Kroeker | 6468dc1142 | restore the coarse locking of the pre-4359 version | 2024-08-02 16:39:47 +02:00
Martin Kroeker | abff4baa4d | re-enable queue struct members related to locking | 2024-08-02 16:37:01 +02:00
Martin Kroeker | 42d8865234 | fix typo | 2024-08-01 12:24:45 +02:00
Martin Kroeker | 9eecd0d33b | enable GEMM/GEMV forwarding for riscv and ppc | 2024-07-31 23:29:12 +02:00
Martin Kroeker | fcb88b9d52 | enable GEMM/GEMV forwarding for riscv and ppc | 2024-07-31 23:21:35 +02:00
Martin Kroeker | 9afd0c8afd | Merge pull request #4814 from Mousius/gemv-proxy: Forward GEMM to GEMV when one argument is actually a vector | 2024-07-31 23:18:01 +02:00
Martin Kroeker | edbf093c98 | Update zarch SCAL kernels to handle INF and NAN arguments (#4829): handle INF and NAN in input (for S/D only if DUMMY2 argument is set) | 2024-07-31 19:45:15 +02:00
Chris Sidebottom | ba2e989c67 | Add accumulators to AArch64 GEMV Kernels: This helps to reduce values going missing as we accumulate. | 2024-07-31 13:09:14 +01:00
Chris Sidebottom | b26424c6a2 | Allow opt into GEMM -> GEMV forwarding | 2024-07-31 13:09:14 +01:00
Chris Sidebottom | 90eb863d4b | Re-add accidental removal | 2024-07-31 13:09:14 +01:00
Chris Sidebottom | 28b5334f22 | Complete implementation of GEMV forwarding | 2024-07-31 13:09:14 +01:00
Martin Kroeker | 3db5dbc88e | forward to GEMV when one argument is actually a vector | 2024-07-31 13:09:14 +01:00
Martin Kroeker | 136a4edc5f | Merge pull request #4830 from martin-frbg/jenk: Continue requesting ubuntu18 instead of latest on OSUOSL powerCI | 2024-07-30 22:19:14 +02:00
Martin Kroeker | 86c15f028b | Update Jenkinsfile.pwr | 2024-07-30 21:21:34 +02:00
Martin Kroeker | a13015b656 | try requesting ubuntu22 instead of latest | 2024-07-30 19:10:18 +02:00
Martin Kroeker | d11e734002 | Merge pull request #4827 from Mousius/a64fx-gcc11: Fix GCC11 check for A64FX target | 2024-07-29 16:36:13 +02:00
Chris Sidebottom | 54ce33e851 | Fix GCC11 check for A64FX target | 2024-07-29 15:28:59 +01:00
Martin Kroeker | 6d071f1a1c | Merge pull request #4826 from Mousius/a64fx-fallback: Add fallback compile options for A64FX target | 2024-07-29 15:33:43 +02:00
Chris Sidebottom | 3ed226d3f8 | Re-add ISCLANG filter | 2024-07-29 11:32:59 +01:00
Chris Sidebottom | 85ca003ae7 | Add fallback compile options for A64FX target | 2024-07-29 11:25:03 +01:00
Martin Kroeker | 05bf35f296 | Merge pull request #4822 from martin-frbg/issue4821: Fix c_check to tolerate a dashed suffix to the gcc version number | 2024-07-27 20:28:06 +02:00
Martin Kroeker | 175008caf8 | harden against a dashed suffix to the gcc version number | 2024-07-27 19:08:02 +02:00