Chris Sidebottom
ec334e69dc
Use SVE kernel for SGEMM/DGEMM on Arm(R) Neoverse(TM) V1
This re-spins #3869 with some additional copy unrolling, which helps maintain SYRK performance.
After #3868, the SVE kernels represent a pretty good boost.
This re-uses ARMV8SVE as a base, and I'm going to incrementally move everything to use ARMV8SVE in additional patches (as well as fix up anything that's not already in ARMV8SVE).
2023-04-17 17:38:42 +01:00
Martin Kroeker
a5e1fdd525
Merge pull request #4007 from Mousius/update-contributors
Add Chris Sidebottom to CONTRIBUTORS.md
2023-04-17 15:45:39 +02:00
Chris Sidebottom
bfc20c2e97
Add Chris Sidebottom to CONTRIBUTORS.md
2023-04-17 11:53:31 +01:00
Martin Kroeker
a44422f0d5
Merge pull request #3983 from thrasibule/makeflags
parallel build fixes
2023-04-16 13:49:05 +02:00
Martin Kroeker
73e6fcb925
Merge pull request #4006 from martin-frbg/issue4005
Fix ?GEMMT implementation
2023-04-16 13:30:17 +02:00
Martin Kroeker
38d7a7b562
Fix ?GEMMT
2023-04-16 00:07:58 +02:00
Martin Kroeker
4eac244c9a
Merge pull request #4004 from martin-frbg/ccheckif
fix missing blank in c_check
2023-04-14 22:57:18 +02:00
Martin Kroeker
970e611e00
fix missing blank in test
2023-04-14 19:42:34 +02:00
Martin Kroeker
6f759a9ce9
Merge pull request #4002 from imzhuhl/spr_detect
Fix x86 detection error
2023-04-13 13:18:39 +02:00
Honglin Zhu
ac650225c1
Fix x86 detection error
2023-04-13 00:08:27 +08:00
Martin Kroeker
58de28f332
Merge pull request #3999 from martin-frbg/issue3998
Convert CMAKE booleans to 0/1 values for gensymbol
2023-04-12 10:38:27 +02:00
Martin Kroeker
2ea00788c2
Add ?GEMMT
2023-04-11 22:46:51 +02:00
Martin Kroeker
6c45c98083
Add (only) the GEMMT functions
2023-04-11 22:41:18 +02:00
Martin Kroeker
cd8eb33a9c
Expose BUILD_LAPACK_DEPRECATED
2023-04-11 22:39:53 +02:00
Martin Kroeker
57bdc36c84
add conditionals for BUILD_LAPACK_DEPRECATED
2023-04-11 22:38:38 +02:00
Martin Kroeker
e0f8b4fef4
Merge pull request #4000 from martin-frbg/applem2
Support Apple A15/M2 cpus through the existing VORTEX target
2023-04-11 08:28:44 +02:00
Martin Kroeker
caa2945138
Support Apple A15/M2 cpus through the existing VORTEX target
2023-04-11 00:04:09 +02:00
Martin Kroeker
d5fbec7c20
Export ?MIN/?MAX, ?AMIN/?AMAX, CDOT/ZDOT and ?GEMMT
2023-04-10 23:49:35 +02:00
Martin Kroeker
fd20a2e8c6
Convert CMAKE booleans to 0/1 values for gensymbol
2023-04-10 22:28:00 +02:00
Martin Kroeker
326b200b08
Merge pull request #3996 from martin-frbg/issue3989
Protect CROSS_SUFFIX against spurious linebreaks from isolated dashes
2023-04-07 23:31:51 +02:00
Martin Kroeker
3effdc1505
Protect CROSS_PATH against spurious addition of linebreaks from isolated dashes
fix for #3989
2023-04-07 19:32:22 +02:00
Martin Kroeker
654d87d73a
Merge pull request #3994 from rgommers/fix-ssyconvf-export
Export `ssyconvf` symbol
2023-04-07 18:15:14 +02:00
Martin Kroeker
d677214570
Remove the badge for the dead drone.io service and add Cirrus CI in its place
2023-04-07 14:11:16 +02:00
Ralf Gommers
a4ee1c84f0
Export ssyconvf symbol
This was apparently missed in commit a836fe8ec when adding the
LAPACK 3.7.0 symbols. We noticed when adding wrappers for 3.7.0
routines in SciPy. For more details, see
https://github.com/rgommers/scipy/issues/143
2023-04-07 12:50:36 +01:00
Martin Kroeker
ca8544be6d
Merge pull request #3991 from martin-frbg/lapack808
Refactor ?GEBAL for readability (Reference-LAPACK PR 808)
2023-04-04 15:27:17 +02:00
Martin Kroeker
d175b8f56f
Refactor ?GEBAL (Reference-LAPACK PR 808)
2023-04-03 15:02:10 +02:00
Martin Kroeker
5f1fb27c40
Rename cirrus.yml to .cirrus.yml
2023-04-03 11:00:17 +02:00
Zhang Xianyi
ab0755590f
Merge pull request #3990 from martin-frbg/cirrus
Add Apple M1 testing via Cirrus CI
2023-04-03 16:54:40 +08:00
Martin Kroeker
65b7bf9f3e
Add Apple M1 testing via Cirrus CI
2023-04-03 10:51:38 +02:00
Martin Kroeker
516f22b8ca
Update version to 0.3.23.dev
2023-04-01 22:25:55 +02:00
Martin Kroeker
3e8f51e7cf
Update version to 0.3.23.dev
2023-04-01 22:25:07 +02:00
Martin Kroeker
f9a701b6dd
Merge pull request #3988 from xianyi/release-0.3.0
Merge back from release branch into develop to copy tag
2023-04-01 22:24:26 +02:00
Martin Kroeker
394a9fbafe
Increment version to 0.3.23
v0.3.23
2023-04-01 22:18:01 +02:00
Martin Kroeker
8f32384633
Increment version to 0.3.23
2023-04-01 22:17:27 +02:00
Martin Kroeker
af3606d9fb
Merge pull request #3987 from xianyi/develop
Merge from develop branch for 0.3.23
2023-04-01 22:16:24 +02:00
Martin Kroeker
cd2e80ca2e
Merge branch 'release-0.3.0' into develop
2023-04-01 22:15:52 +02:00
Martin Kroeker
e2614eb6ce
Merge pull request #3986 from martin-frbg/changelog0323
Update with 0.3.23 changes
2023-04-01 22:08:43 +02:00
Martin Kroeker
1f70481384
Update with 0.3.23 changes
2023-04-01 20:33:31 +02:00
Martin Kroeker
eb0793bfd0
Merge pull request #3984 from martin-frbg/develop
Fix logic bug in single-threaded C/Z SPR
2023-04-01 11:35:52 +02:00
Martin Kroeker
36fcb52094
Fix logic - we want real OR imaginary part of X to be nonzero here
2023-04-01 00:02:54 +02:00
Guillaume Horel
397108fba2
serialize shared prerequisites
2023-03-31 09:25:51 -04:00
Guillaume Horel
281e834566
do not pass -j flag to the MAKE variable
2023-03-31 09:25:51 -04:00
Martin Kroeker
d708951375
Merge pull request #3980 from martin-frbg/fix3941-2
Split and improve test criteria in LU computation (?GETF2)
2023-03-30 06:56:05 +02:00
Martin Kroeker
6c431239da
Split test condition in LU computation - non-denormal for computation, exact zero for reporting singularity
2023-03-29 22:14:21 +02:00
Martin Kroeker
23f2c4ca5b
Merge pull request #3978 from martin-frbg/fix3941
fix division-by-zero guard in zgetf2
2023-03-29 16:22:27 +02:00
Martin Kroeker
12aabb9f9b
fix conditional
2023-03-29 09:44:33 +02:00
Martin Kroeker
fd0614cbc0
Merge pull request #3975 from martin-frbg/issue3974
Fix build failures with NO_LAPACK
2023-03-28 22:57:27 +02:00
Martin Kroeker
912d713b52
redo lost edit
2023-03-28 18:31:04 +02:00
Martin Kroeker
dc15c18efc
Fix build failures seen with the NO_LAPACK option - cspr/csymv/csyr belong on the LAPACK list
2023-03-28 16:33:09 +02:00
Martin Kroeker
5d9d382e36
Merge pull request #3970 from linouxis9/develop
Improve Intel Raptor Lake detection
2023-03-28 16:22:27 +02:00