Martin Kroeker
d18efaed20
Disable gcc's tree-vectorizer pass on all operating systems
2023-04-19 23:43:43 +02:00
Martin Kroeker
99f6d31ed5
Disable gcc's tree-vectorizer pass on all operating systems
2023-04-19 23:42:55 +02:00
Martin Kroeker
7de9335c56
Disable gcc's tree-vectorizer pass on all operating systems
2023-04-19 23:42:09 +02:00
Martin Kroeker
437c0bf2b4
Merge pull request #3843 from Mousius/switch-ratio
...
Propagate SWITCH_RATIO to DYNAMIC_ARCH builds
2023-04-19 11:51:54 +02:00
Martin Kroeker
c628030669
Merge pull request #3855 from Mousius/more-switch-ratio-tuning
...
SWITCH_RATIO for Arm(R) Neoverse(TM) architecture
2023-04-18 22:45:51 +02:00
Martin Kroeker
efcf71255a
Merge pull request #4003 from martin-frbg/issue3995
...
Fix instabilities in CGEMM/CTRMM/DNRM2 on Apple M1/M2 under OSX
2023-04-18 14:55:23 +02:00
Martin Kroeker
51dd1339e7
Merge pull request #4010 from martin-frbg/issue3989-2
...
Remove any stray trailing dash from CROSS_SUFFIX
2023-04-18 14:55:02 +02:00
Martin Kroeker
479509bb37
Remove any stray trailing dash from CROSS_SUFFIX (as would result from clang -arch)
2023-04-17 21:57:25 +02:00
Chris Sidebottom
5b165420b5
SWITCH_RATIO for Arm(R) Neoverse(TM) architecture
...
This seems like a good balance of values for reasonably sized matrices. With `SWITCH_RATIO=16`, DGEMM scales better to bigger sizes, but the better solution
would be some kind of thread throttling, so I've gone with `SWITCH_RATIO=8`.
2023-04-17 15:42:55 +01:00
Chris Sidebottom
32f2fafde7
Propagate SWITCH_RATIO to DYNAMIC_ARCH builds
...
Previously dynamic builds were either using the default SWITCH_RATIO
or one from the higher level architecture; this patch ensures the
dynamic builds can use this parameter as well.
2023-04-17 15:34:12 +01:00
Martin Kroeker
a5e1fdd525
Merge pull request #4007 from Mousius/update-contributors
...
Add Chris Sidebottom to CONTRIBUTORS.md
2023-04-17 15:45:39 +02:00
Martin Kroeker
44164e3a3d
revert "move alpha out of register 18" (out of PR scope, no SVE on Apple hw)
2023-04-17 14:23:13 +02:00
Chris Sidebottom
bfc20c2e97
Add Chris Sidebottom to CONTRIBUTORS.md
2023-04-17 11:53:31 +01:00
Martin Kroeker
a44422f0d5
Merge pull request #3983 from thrasibule/makeflags
...
parallel build fixes
2023-04-16 13:49:05 +02:00
Martin Kroeker
73e6fcb925
Merge pull request #4006 from martin-frbg/issue4005
...
Fix ?GEMMT implementation
2023-04-16 13:30:17 +02:00
Martin Kroeker
38d7a7b562
Fix ?GEMMT
2023-04-16 00:07:58 +02:00
Martin Kroeker
8be68fa7f4
move declaration of sca to really keep the compiler from throwing it out (for now)
2023-04-15 12:02:39 +02:00
Martin Kroeker
4eac244c9a
Merge pull request #4004 from martin-frbg/ccheckif
...
fix missing blank in c_check
2023-04-14 22:57:18 +02:00
Martin Kroeker
970e611e00
fix missing blank in test
2023-04-14 19:42:34 +02:00
Martin Kroeker
f096a339e4
Use long value fields for cpu ident on OSX
2023-04-13 18:16:09 +02:00
Martin Kroeker
3727672a74
Improve workaround and keep compilers from optimizing it out
2023-04-13 18:07:52 +02:00
Martin Kroeker
108a21e47a
Move ALPHA out of register 18 (reserved on OSX)
2023-04-13 18:05:14 +02:00
Martin Kroeker
0b1acb0ba3
Move ALPHA_I out of register 18 (reserved on OSX)
2023-04-13 18:03:35 +02:00
Martin Kroeker
c7bbad09ad
Move ALPHA_I out of register 18 (reserved on OSX)
2023-04-13 18:00:47 +02:00
Martin Kroeker
cda29633a3
move ALPHA_I out of register 18 (reserved on OSX)
2023-04-13 17:59:48 +02:00
Martin Kroeker
6f759a9ce9
Merge pull request #4002 from imzhuhl/spr_detect
...
Fix x86 detection error
2023-04-13 13:18:39 +02:00
Honglin Zhu
ac650225c1
Fix x86 detection error
2023-04-13 00:08:27 +08:00
Martin Kroeker
58de28f332
Merge pull request #3999 from martin-frbg/issue3998
...
Convert CMAKE booleans to 0/1 values for gensymbol
2023-04-12 10:38:27 +02:00
Martin Kroeker
2ea00788c2
Add ?GEMMT
2023-04-11 22:46:51 +02:00
Martin Kroeker
6c45c98083
Add (only) the GEMMT functions
2023-04-11 22:41:18 +02:00
Martin Kroeker
cd8eb33a9c
Expose BUILD_LAPACK_DEPRECATED
2023-04-11 22:39:53 +02:00
Martin Kroeker
57bdc36c84
add conditionals for BUILD_LAPACK_DEPRECATED
2023-04-11 22:38:38 +02:00
Martin Kroeker
e0f8b4fef4
Merge pull request #4000 from martin-frbg/applem2
...
Support Apple A15/M2 cpus through the existing VORTEX target
2023-04-11 08:28:44 +02:00
Martin Kroeker
caa2945138
Support Apple A15/M2 cpus through the existing VORTEX target
2023-04-11 00:04:09 +02:00
Martin Kroeker
d5fbec7c20
Export ?MIN/?MAX, ?AMIN/?AMAX, CDOT/ZDOT and ?GEMMT
2023-04-10 23:49:35 +02:00
Martin Kroeker
fd20a2e8c6
Convert CMAKE booleans to 0/1 values for gensymbol
2023-04-10 22:28:00 +02:00
Martin Kroeker
326b200b08
Merge pull request #3996 from martin-frbg/issue3989
...
Protect CROSS_SUFFIX against spurious linebreaks from isolated dashes
2023-04-07 23:31:51 +02:00
Martin Kroeker
3effdc1505
Protect CROSS_PATH against spurious addition of linebreaks from isolated dashes
...
fix for #3989
2023-04-07 19:32:22 +02:00
Martin Kroeker
654d87d73a
Merge pull request #3994 from rgommers/fix-ssyconvf-export
...
Export `ssyconvf` symbol
2023-04-07 18:15:14 +02:00
Martin Kroeker
d677214570
Remove the badge for the dead drone.io service and add Cirrus CI in its place
2023-04-07 14:11:16 +02:00
Ralf Gommers
a4ee1c84f0
Export ssyconvf symbol
...
This was apparently missed in commit a836fe8ec when adding the
LAPACK 3.7.0 symbols. We noticed when adding wrappers for 3.7.0
routines in SciPy. For more details, see
https://github.com/rgommers/scipy/issues/143
2023-04-07 12:50:36 +01:00
Martin Kroeker
ca8544be6d
Merge pull request #3991 from martin-frbg/lapack808
...
Refactor ?GEBAL for readability (Reference-LAPACK PR 808)
2023-04-04 15:27:17 +02:00
Martin Kroeker
d175b8f56f
Refactor ?GEBAL (Reference-LAPACK PR 808)
2023-04-03 15:02:10 +02:00
Martin Kroeker
5f1fb27c40
Rename cirrus.yml to .cirrus.yml
2023-04-03 11:00:17 +02:00
Zhang Xianyi
ab0755590f
Merge pull request #3990 from martin-frbg/cirrus
...
Add Apple M1 testing via Cirrus CI
2023-04-03 16:54:40 +08:00
Martin Kroeker
65b7bf9f3e
Add Apple M1 testing via Cirrus CI
2023-04-03 10:51:38 +02:00
Martin Kroeker
516f22b8ca
Update version to 0.3.23.dev
2023-04-01 22:25:55 +02:00
Martin Kroeker
3e8f51e7cf
Update version to 0.3.23.dev
2023-04-01 22:25:07 +02:00
Martin Kroeker
f9a701b6dd
Merge pull request #3988 from xianyi/release-0.3.0
...
Merge back from release branch into develop to copy tag
2023-04-01 22:24:26 +02:00
Martin Kroeker
394a9fbafe
Increment version to 0.3.23
v0.3.23
2023-04-01 22:18:01 +02:00