Martin Kroeker
94adf98bb8
remove unused status variable
2023-07-26 08:31:37 +02:00
Martin Kroeker
3326b924b3
remove status variable blas_num_threads_set; initialize openmp thread maximum on startup
2023-07-26 00:31:24 +02:00
Martin Kroeker
ea669c8ae9
simplify openmp thread limit handling
2023-07-26 00:27:14 +02:00
Chris Sidebottom
24586bc4ff
Disambiguate whilelt
2023-07-25 20:15:44 +01:00
Chris Sidebottom
f971ef55f2
Add ARMV8SVE to AArch64 Dynamic Dispatch
...
In order to enable support for future cores which have similar tunings (in this case I'm doing this for the Arm(R) Neoverse(TM) V2 core), this generically detects SVE support and enables it. This should better manage the size and complexity of dynamic dispatch rather than just copy-pasting the same parameters.
To make `ARMV8SVE` more representative of the common 128-bit SVE case, I've split it and similar parameters from A64FX, which has the wider 512-bit SVE.
2023-07-25 18:35:15 +01:00
Chris Sidebottom
aea2a4622b
Use latest non-SVE kernels in ARMV8SVE
...
These are generally better and, in some cases, include threading, which helps on the cores we're targeting here.
2023-07-25 14:12:26 +01:00
martin-frbg
7976deff80
Fix file permissions (issue 4095)
2023-07-23 20:37:07 +02:00
martin-frbg
fec4867748
Fix file permissions (issue 4095)
2023-07-23 20:31:55 +02:00
Martin Kroeker
25037ae875
Fix actual arguments in some LAPACK procedure calls (Reference-LAPACK PR 885) (#4155)
...
* Fix actual arguments (Reference-LAPACK PR 885)
2023-07-22 23:14:25 +02:00
Martin Kroeker
bd01dc354b
Merge pull request #4151 from martin-frbg/issue4101
...
Ensure that early calls to blas_set_num_threads will not overwrite unrelated memory
2023-07-20 13:21:07 +02:00
Martin Kroeker
3bdcf3259d
Merge branch 'xianyi:develop' into issue4101
2023-07-20 08:23:20 +02:00
Martin Kroeker
5cb4f5940d
Merge pull request #4152 from martin-frbg/shutup-4098
...
Override the C910V DSDOT with generic code to get rid of the qemu precision error in CI
2023-07-20 08:22:57 +02:00
Martin Kroeker
76ef1672f8
Override DSDOT with generic code to get rid of qemu precision error
2023-07-19 22:31:07 +02:00
Martin Kroeker
8a27a274a1
Merge pull request #4150 from martin-frbg/armsve
...
Fix runtime detection in ARMV8 DYNAMIC_ARCH to check SVE capability
2023-07-19 22:25:55 +02:00
Martin Kroeker
b34f19a365
Ensure that a premature call to set_num_threads will not overwrite unrelated memory
2023-07-19 22:19:22 +02:00
Martin Kroeker
66904f8148
Ensure that a premature call will not overwrite unrelated memory
2023-07-19 22:14:34 +02:00
Martin Kroeker
5c58994eb2
Add fallback warning
2023-07-19 18:27:41 +02:00
Martin Kroeker
ca7199f249
Treat newer Neoverse as N1 if SVE unavailable (may be disabled in container/cloud env)
2023-07-19 14:48:42 +02:00
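The fallback described in the commit above can be pictured with a minimal sketch. The function `select_core`, the returned target names and the mapping itself are hypothetical illustrations, not the project's actual dispatch table; the part numbers are assumed to be the MIDR values Arm publishes for these cores.

```c
#include <stdbool.h>

/* Assumed MIDR part numbers for the cores named below. */
enum {
    PART_NEOVERSE_N1 = 0xd0c,
    PART_NEOVERSE_V1 = 0xd40,
    PART_NEOVERSE_N2 = 0xd49
};

/* Hypothetical helper: pick a kernel set from the CPU part number and a
   flag saying whether SVE is actually usable at run time. Newer Neoverse
   parts drop back to the N1 (NEON-only) kernels when SVE is unavailable,
   e.g. because a container or cloud host has disabled it. */
const char *select_core(unsigned part, bool has_sve) {
    switch (part) {
    case PART_NEOVERSE_N1:
        return "NEOVERSEN1";
    case PART_NEOVERSE_V1:
        return has_sve ? "NEOVERSEV1" : "NEOVERSEN1";
    case PART_NEOVERSE_N2:
        return has_sve ? "NEOVERSEN2" : "NEOVERSEN1";
    default:
        return "ARMV8"; /* generic fallback for unknown parts */
    }
}
```

The point of passing `has_sve` separately is that the part number alone says what the silicon supports, not what the environment exposes.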
Martin Kroeker
9e81a3a0a2
Merge pull request #4100 from martin-frbg/cirrusm1gccmake
...
Cirrus CI: Add Apple M1 build using gcc,gmake and OpenMP
2023-07-18 08:04:29 +02:00
Martin Kroeker
ada9e442eb
Add Apple M1 build using gcc,gmake and OpenMP
2023-07-17 23:13:56 +02:00
Martin Kroeker
81228fc586
Merge pull request #4147 from martin-frbg/aldern
...
Support Alder Lake N (family 6 exmodel 11 model 14) as Haswell
2023-07-17 09:11:23 +02:00
Martin Kroeker
8da6aca2ec
Support Alder Lake N (fam 6 exmodel 11 model 14) as Haswell
2023-07-16 22:15:15 +02:00
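For readers unfamiliar with the "family / exmodel / model" triple in the commit above, here is a rough sketch of how those fields are extracted from the x86 CPUID leaf 1 EAX signature. The names `cpu_sig`, `decode_signature` and `is_alder_lake_n` are hypothetical and are not taken from the project's detection code.

```c
#include <stdint.h>

/* Fields of the CPUID leaf 1 EAX signature relevant to the commit. */
typedef struct {
    unsigned family;  /* bits 11..8  */
    unsigned exmodel; /* bits 19..16 */
    unsigned model;   /* bits 7..4   */
} cpu_sig;

/* Hypothetical helper: split a raw EAX value into its fields. */
cpu_sig decode_signature(uint32_t eax) {
    cpu_sig s;
    s.family  = (eax >> 8)  & 0xf;
    s.model   = (eax >> 4)  & 0xf;
    s.exmodel = (eax >> 16) & 0xf;
    return s;
}

/* Hypothetical check matching the triple quoted in the commit subject:
   family 6, extended model 11, model 14. */
int is_alder_lake_n(uint32_t eax) {
    cpu_sig s = decode_signature(eax);
    return s.family == 6 && s.exmodel == 11 && s.model == 14;
}
```

A signature such as `0x000B06E0` satisfies the check, whereas `0x000306C3` (family 6, exmodel 3, model 12) does not.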
Martin Kroeker
b61e64da6f
Merge pull request #4142 from exyntech/armv8-as-arm64
...
Fix armv8 detection in system_check.cmake
2023-07-15 23:15:49 +02:00
Martin Kroeker
f82a197143
Merge pull request #4137 from felixonmars/patch-1
...
Fix riscv64 detection in system_check.cmake
2023-07-15 19:41:06 +02:00
Martin Kroeker
0a637cc403
Fix workspace query corner cases to always return at least 1 (Reference-LAPACK PR 883) (#4146)
...
* Fix workspace query corner cases to always return at least 1
2023-07-15 16:37:42 +02:00
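The "at least 1" rule in the commit above amounts to clamping the reported optimal workspace size. A minimal sketch follows, assuming a hypothetical routine whose workspace need is simply `n * nb`; the real LAPACK routines are Fortran and compute their sizes differently.

```c
/* Hypothetical workspace query: report the optimal workspace length for
   a problem of order n with block size nb, never returning less than 1
   so that callers can always allocate a valid, non-empty array. */
int query_workspace(int n, int nb) {
    int needed = n * nb; /* assumed size formula, for illustration only */
    return needed > 1 ? needed : 1;
}
```

With `n == 0` the raw formula gives 0, which is the corner case the clamp covers.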
Martin Kroeker
4c43d1eeba
Fix C prototypes and LAPACKE headers for ?GEDMD/?GEDMDQ (#4134)
...
* Fix prototypes for ?GEDMD/?GEDMDQ and their LAPACKE interfaces
2023-07-15 07:47:19 +02:00
Martin Kroeker
49077e7bde
Merge pull request #4145 from martin-frbg/issue4144
...
Restore zero-initialization of variables in generic ztrsm_utcopy
2023-07-14 12:44:05 +02:00
Martin Kroeker
3d31191b0f
Work around Clang failing to disambiguate SVE intrinsics and add AppleClang crossbuild to MacOS/arm64 DYNAMIC_ARCH in AzureCI (#4140)
...
* Add AppleClang crossbuild to MacOS/arm64 DYNAMIC_ARCH
* add casts to disambiguate svwhilelt for clang
2023-07-14 11:06:48 +02:00
Martin Kroeker
04cdf5efb4
fix typo and missing declaration
2023-07-14 00:05:00 +02:00
Martin Kroeker
5e1103b8d7
Update rotg.c
2023-07-13 23:35:38 +02:00
Martin Kroeker
cfa0a80664
Restore initialization of data variables
2023-07-13 23:23:12 +02:00
Martin Kroeker
9567305e4c
Restore initialization of data01,data02
2023-07-13 23:21:18 +02:00
Martin Kroeker
4cc232bb07
Merge branch 'xianyi:develop' into issue4130
2023-07-13 21:40:22 +02:00
Martin Kroeker
7c75c8b2fe
fix truncated edit
2023-07-13 21:40:12 +02:00
Martin Kroeker
0f2ce93904
typo fix
2023-07-13 10:56:59 +02:00
Martin Kroeker
affeef0b9c
Fix gmake build not always picking the right ARM64 arch options for clang (#4136)
...
* Fix gcc version checks erroneously excluding clang
* Avoid some mtune names not supported by (Apple)Clang
2023-07-13 08:38:03 +02:00
Martin Kroeker
e08743d977
Update to use safe scaling algorithm from Reference-LAPACK PR 527
2023-07-12 23:02:36 +02:00
Andy Mroczkowski
45b2cd2fb2
treat armv8 CMAKE_SYSTEM_PROCESSOR as arm64
...
The cmake scripts incorrectly treated armv8 as 32-bit arm, causing
compilation issues. This just adds 'armv8' to the arm64 condition check.
2023-07-12 09:37:45 -04:00
Martin Kroeker
494313e75e
Merge pull request #4138 from martin-frbg/fix4126
...
Add converted C versions of C/ZRSCL to fix build errors introduced by PR4126
2023-07-11 20:41:02 +02:00
Martin Kroeker
afef854863
Add C versions of C/ZRSCL
2023-07-11 17:08:27 +02:00
Martin Kroeker
35dedb68ce
Add C versions of C/ZRSCL
2023-07-11 17:07:30 +02:00
Felix Yan
a721fccfdc
Fix riscv64 detection in system_check.cmake
2023-07-11 16:34:20 +03:00
Martin Kroeker
2edebc5fb9
Merge pull request #4133 from martin-frbg/issue4132
...
Fix info code returned for invalid ldb by IMATCOPY
2023-07-10 01:50:38 +02:00
Martin Kroeker
bcebe9b4c9
Merge pull request #4131 from martin-frbg/lapack878
...
Fix computation of UPLO in LAPACKE_?larfb (Reference-LAPACK PR 878)
2023-07-10 01:50:16 +02:00
Martin Kroeker
26fd4b9c8c
Merge pull request #4129 from martin-frbg/lapack876
...
Fix segfault in ?GELSS when NRHS is zero (Reference-LAPACK PR 876)
2023-07-10 01:49:55 +02:00
Martin Kroeker
22ad23abb1
Merge pull request #4126 from martin-frbg/lapack839
...
Add C/ZRSCL for reciprocal scaling of a complex vector (Reference-LAPACK PR 839)
2023-07-10 01:49:33 +02:00
Martin Kroeker
351645b8af
Merge pull request #4123 from martin-frbg/lapack867
...
Correct order of eigenvals/vecs for 2x2 matrices in ?STEMR (Reference-LAPACK PR 867)
2023-07-10 01:48:18 +02:00
Martin Kroeker
f5413447aa
Merge pull request #4122 from martin-frbg/issue4121
...
Fix CMAKE builds of SVE-capable targets in arm64 DYNAMIC_ARCH
2023-07-09 22:57:44 +02:00
Martin Kroeker
5dd1d9cacd
Merge pull request #4120 from martin-frbg/jenkinsbadge
...
Add status badges for OSUOSL's POWERCI and IBMZ-CI services to README.MD
2023-07-09 22:57:11 +02:00
Martin Kroeker
15dfb2f2cf
Merge pull request #4118 from XiWeiGu/develop
...
LoongArch64: Add WhereAmI()
2023-07-09 22:56:47 +02:00