Commit Graph

7099 Commits

Author SHA1 Message Date
Martin Kroeker
ca1791cfeb Extend tests for error exit sysv/sytd2/gehd2 (Reference-LAPACK PR 795) 2023-05-18 17:35:00 +02:00
Martin Kroeker
0c38ebd599 Extend tests for error exit sysv/sytd2/gehd2 (Reference-LAPACK PR 795) 2023-05-18 17:31:55 +02:00
Martin Kroeker
be05ba4374 Fix typos in comments and documentation of LAPACK (Reference-LAPACK PR 820) (#4045)
* Fix typos in comments and documentation (Reference-LAPACK PR 820)
2023-05-18 16:28:20 +02:00
Martin Kroeker
9f2233bfdf Merge pull request #4044 from martin-frbg/lapack814
Fix typos in comments of LAPACK sources (Reference-LAPACK PR 814)
2023-05-17 16:16:21 +02:00
Martin Kroeker
47715b5726 Fix typos in comments (Reference-LAPACK PR 814) 2023-05-17 14:36:21 +02:00
Martin Kroeker
b01894adcb Fix typos in comments (Reference-LAPACK PR 814) 2023-05-17 14:35:13 +02:00
Martin Kroeker
a82c1443db Fix typos in comments (Reference-LAPACK PR 814) 2023-05-17 14:33:46 +02:00
Martin Kroeker
617e8bcfe7 Merge pull request #4043 from martin-frbg/lapack809-811-812
Fix typos in LAPACK comments (Reference-LAPACK PRs 809,811,812)
2023-05-17 06:38:11 +02:00
Martin Kroeker
5fbd5f531b Fix typo in description of VR argument (Reference-LAPACK 812) 2023-05-16 20:05:05 +02:00
Martin Kroeker
02efa8d6be Fix typos in comments (Reference-LAPACK 811) 2023-05-16 20:01:47 +02:00
Martin Kroeker
c5f7e46526 Fix typos and errors in comments (Reference-LAPACK 809) 2023-05-16 19:54:42 +02:00
Martin Kroeker
86f48997c7 CirrusCI: Add Neoverse build with OpenMP (#4042)
* Add Neoverse build with OpenMP
2023-05-16 12:01:50 +02:00
Martin Kroeker
e2779c852f Do not build the tests when only the CBLAS interface is selected (#4041)
* Do not build the tests when only the CBLAS interface is selected
2023-05-15 20:49:56 +02:00
Martin Kroeker
ccad94162a Merge pull request #4039 from klho/develop
Bug fix and improvements for [z]imatcopy interface.
2023-05-14 10:51:24 +02:00
Ken Ho
df1b1f6a91 More detailed error message in [z]imatcopy.c. 2023-05-12 09:41:52 -07:00
Ken Ho
7a86c437b5 Change some "if" statements to "else if" following suggestion by @mmuetzel. 2023-05-10 09:13:04 -07:00
Ken Ho
33ab415f68 Bug fix and improvements for [z]imatcopy interface. 2023-05-08 14:43:56 -07:00
Martin Kroeker
c74ee11376 Add an M1-based OSX crossbuild and a NeoverseN1 build to CIRRUS CI (#3997)
* Add an M1-based OSX crossbuild and a NeoverseN1 build (plus Windows/LLVM commented out for now)
2023-05-08 14:24:38 +02:00
Martin Kroeker
65a7941aa5 Merge pull request #4036 from martin-frbg/issue4020
Mark cblas_xerbla's arguments as const in cblas.h
2023-05-08 12:54:30 +02:00
Martin Kroeker
c2078b2356 Mark xerbla's arguments as const 2023-05-07 20:15:13 +02:00
Martin Kroeker
d6a42ed574 Merge pull request #4035 from martin-frbg/issue4034
Fix (redundant) lapack-runtest target in toplevel Makefile
2023-05-06 15:51:07 +02:00
Martin Kroeker
60226b35e1 Fix (redundant) lapack-runtest target 2023-05-06 12:44:38 +02:00
Martin Kroeker
4e597ae00b Merge pull request #4031 from martin-frbg/issue4026
Add suggestions to NUM_THREADS/auxiliary buffer message
2023-05-05 09:32:32 +02:00
Martin Kroeker
e5538a62cb Add suggestions to NUM_THREADS/auxiliary buffer message 2023-05-04 22:56:39 +02:00
Martin Kroeker
6f38a946e8 Merge pull request #4028 from catap/mktemp-fix
Do not require GNU mktemp
2023-05-03 11:25:25 +02:00
Martin Kroeker
29c717050f Merge pull request #4022 from martin-frbg/gemmtm
fix cblas_?gemmt
2023-05-03 11:24:54 +02:00
Kirill A. Korinsky
b1781ad338 Do not require GNU mktemp
Historically, GNU mktemp was the first implementation that does not require
`-t` to create a directory.

This adds a fallback for when `-t` is required.

For example, MacPorts carries a similar patch: bbe8abfe26/math/OpenBLAS/files/patch-MacOSX-mktemp.diff
2023-04-29 11:13:26 +02:00
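The commit message above describes the fallback but does not show the patch itself. As a rough illustration only, and not the actual OpenBLAS change, such a fallback might look like the sketch below. The function name `make_tmpdir` and the `openblas` prefix are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper; the name and prefix are illustrative assumptions,
# not taken from the OpenBLAS sources.
# First try the form that needs no -t (accepted by GNU mktemp). If this
# mktemp implementation rejects it, retry with an explicit -t template.
make_tmpdir() {
    prefix=${1:-tmp}
    mktemp -d 2>/dev/null || mktemp -d -t "$prefix.XXXXXX"
}

# Usage: create a scratch directory, report it, then clean it up.
tmpdir=$(make_tmpdir openblas) || exit 1
echo "created $tmpdir"
rmdir "$tmpdir"
```

Trying the plain form first keeps the common case unchanged, and the `-t` variant only runs where the first call fails.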
Martin Kroeker
1f6f7328eb remove redundant declaration 2023-04-27 09:14:12 +02:00
Martin Kroeker
7152d6b06d fix cblas_gemmt 2023-04-27 08:36:20 +02:00
Martin Kroeker
e9a8d5b45f Merge pull request #4015 from martin-frbg/issue4013-2
[WIP] Disable gcc's tree-vectorizer for x86_64 CGEMV
2023-04-23 18:51:12 +02:00
Martin Kroeker
72caceb324 Merge pull request #4009 from Mousius/sve-gemm
Use SVE kernel for SGEMM/DGEMM on Arm(R) Neoverse(TM) V1
2023-04-22 13:56:45 +02:00
Martin Kroeker
d1b631899b Merge pull request #4018 from mmuetzel/ci
Adapt CI rules for MSYS2 for updated ccache
2023-04-21 23:52:13 +02:00
Markus Mützel
e27e9a50b1 CI (MSYS2): Save ccache before running tests. 2023-04-21 14:10:40 +02:00
Markus Mützel
67d33e5b98 CI (MSYS2): Update location of compiler cache. 2023-04-21 13:02:23 +02:00
Martin Kroeker
84bcf6639f Disable gcc's tree-vectorizer pass on all operating systems 2023-04-20 23:24:52 +02:00
Martin Kroeker
30a0ccbd14 Merge pull request #4014 from martin-frbg/issue4013
Generally disable gcc's tree-vectorizer in x86_64 SGEMV,SSYMV,ZGEMV,C/ZDOT
2023-04-20 10:45:15 +02:00
Martin Kroeker
c9174ae8d7 Disable gcc's tree-vectorizer pass on all operating systems 2023-04-19 23:45:44 +02:00
Martin Kroeker
c2fe9cb91f Disable gcc's tree-vectorizer pass on all operating systems 2023-04-19 23:45:14 +02:00
Martin Kroeker
66b39b835c Disable gcc's tree-vectorizer pass on all operating systems 2023-04-19 23:44:45 +02:00
Martin Kroeker
bb6d6735bf Disable gcc's tree-vectorizer pass on all operating systems 2023-04-19 23:44:15 +02:00
Martin Kroeker
d18efaed20 Disable gcc's tree-vectorizer pass on all operating systems 2023-04-19 23:43:43 +02:00
Martin Kroeker
99f6d31ed5 Disable gcc's tree-vectorizer pass on all operating systems 2023-04-19 23:42:55 +02:00
Martin Kroeker
7de9335c56 Disable gcc's tree-vectorizer pass on all operating systems 2023-04-19 23:42:09 +02:00
Martin Kroeker
437c0bf2b4 Merge pull request #3843 from Mousius/switch-ratio
Propagate SWITCH_RATIO to DYNAMIC_ARCH builds
2023-04-19 11:51:54 +02:00
Martin Kroeker
c628030669 Merge pull request #3855 from Mousius/more-switch-ratio-tuning
SWITCH_RATIO for Arm(R) Neoverse(TM) architecture
2023-04-18 22:45:51 +02:00
Martin Kroeker
efcf71255a Merge pull request #4003 from martin-frbg/issue3995
Fix instabilities in CGEMM/CTRMM/DNRM2 on Apple M1/M2 under OSX
2023-04-18 14:55:23 +02:00
Martin Kroeker
51dd1339e7 Merge pull request #4010 from martin-frbg/issue3989-2
Remove any stray trailing dash from CROSS_SUFFIX
2023-04-18 14:55:02 +02:00
Martin Kroeker
479509bb37 Remove any stray trailing dash from CROSS_SUFFIX (as would result from clang -arch) 2023-04-17 21:57:25 +02:00
Chris Sidebottom
ec334e69dc Use SVE kernel for SGEMM/DGEMM on Arm(R) Neoverse(TM) V1
This re-spins #3869 with some additional copy unrolling which helps maintain SYRK performance.

After #3868, the SVE kernels represent a pretty good boost.

This re-uses ARMV8SVE as a base and I'm going to incrementally move everything to use ARMV8SVE in additional patches (as well as fix up anything that's not already in ARMV8SVE).
2023-04-17 17:38:42 +01:00
Chris Sidebottom
5b165420b5 SWITCH_RATIO for Arm(R) Neoverse(TM) architecture
This seems like a good balance of values for reasonably sized matrices. With `SWITCH_RATIO=16` the DGEMM scales better to bigger sizes, but the better solution would be some kind of thread throttling, so I've gone with `SWITCH_RATIO=8`.
2023-04-17 15:42:55 +01:00