Martin Kroeker
b6d74b7fff
Add f2c-converted files for the BLAS3-based Sylvester solver
2022-11-15 16:26:44 +01:00
Martin Kroeker
b2cc310470
Add f2c-converted versions of the new BLAS3-based Sylvester solver
2022-11-15 14:23:46 +01:00
Martin Kroeker
379efbe5af
Fix typos
2022-11-15 11:03:12 +01:00
Martin Kroeker
95da5141f0
Add a BLAS3-based triangular Sylvester equation solver (Reference-LAPACK PR 651)
2022-11-14 22:21:29 +01:00
Martin Kroeker
2592853fc7
Restore OpenBLAS-specific changes
2022-11-14 21:47:37 +01:00
Martin Kroeker
52c2a0397b
Restore OpenBLAS modifications to link line
2022-11-14 17:13:08 +01:00
Martin Kroeker
bb652f65a3
Typo fix
2022-11-14 16:35:13 +01:00
Martin Kroeker
fb42a0cf8b
Add a BLAS3-based triangular Sylvester equation solver (Reference-LAPACK PR 651)
2022-11-14 14:06:50 +01:00
Martin Kroeker
13f3bbece1
Add a BLAS3-based triangular Sylvester equation solver (Reference-LAPACK PR 651)
2022-11-13 23:18:09 +01:00
Martin Kroeker
92174725d9
Add a BLAS3-based triangular Sylvester equation solver (Reference-LAPACK PR 651)
2022-11-13 23:16:12 +01:00
Martin Kroeker
6eb707d941
Add a BLAS3-based triangular Sylvester equation solver (Reference-LAPACK PR 651)
2022-11-13 23:10:13 +01:00
Martin Kroeker
7eb2653268
Add a BLAS3-based triangular Sylvester equation solver (Reference-LAPACK PR 651)
2022-11-13 23:07:10 +01:00
Martin Kroeker
4bc918a791
Add a BLAS3-based triangular Sylvester equation solver (Reference-LAPACK PR 651)
2022-11-13 23:03:31 +01:00
Martin Kroeker
35dac5677a
Merge pull request #3816 from martin-frbg/lapack638
...
Fix workspace calculation in GEQRF/GERQF (Reference-LAPACK PR 638)
2022-11-13 20:38:42 +01:00
Martin Kroeker
ee6643bc6b
Merge pull request #3815 from martin-frbg/lapack690
...
Fix workspace calculation in the left-looking variant of GEQRF (Reference-LAPACK PR690)
2022-11-13 16:26:31 +01:00
Martin Kroeker
3e2d52c502
Fix workspace calculation in GEQRF/GERQF (Reference-LAPACK PR 638)
2022-11-13 13:00:52 +01:00
Martin Kroeker
cb48c29b6f
Fix workspace calculation (Reference-LAPACK PR690)
2022-11-13 12:49:59 +01:00
Martin Kroeker
8c99d5d1b6
Merge pull request #3796 from martin-frbg/gemmt
...
Add a trivial GEMMT implementation based on a looped GEMV
2022-11-12 19:06:05 +01:00
Martin Kroeker
b53b0f6bb6
Merge pull request #3802 from martin-frbg/relafix
...
Fix cmake compilation of ReLAPACK and expose its INCLUDE_ALL option
2022-11-12 15:11:31 +01:00
Martin Kroeker
9a31faf420
Merge pull request #3811 from martin-frbg/issue3805
...
Improve gcc arch option selecting for Neoverse cpus
2022-11-10 10:57:33 +01:00
Martin Kroeker
e326ef9f0f
Merge pull request #3812 from bartoldeman/cscal-zscal-skylakex
...
Add [cz]scal microkernels for SKYLAKEX
2022-11-10 08:00:27 +01:00
Martin Kroeker
827a9c6079
Merge pull request #3814 from martin-frbg/traviswait-3
...
Travis CI: Increase the wait time for ppc jobs again
2022-11-10 08:00:02 +01:00
Martin Kroeker
d141cf341f
Increase the wait time for ppc jobs again
2022-11-09 20:31:30 +01:00
Martin Kroeker
aad79ab516
Merge pull request #3813 from martin-frbg/azuredynosx
...
Azure CI: Limit cpu models in OSX_dynarch_cmake to keep it from running out of time
2022-11-09 20:29:17 +01:00
Martin Kroeker
09dd90ca09
Limit cpu models in OSX_dynarch_cmake
2022-11-09 15:35:57 +01:00
Martin Kroeker
f14435cb4b
Merge pull request #3810 from martin-frbg/fix3800
...
Add fallbacks to RaptorLake entry from PR3800
2022-11-09 15:28:12 +01:00
Bart Oldeman
6c1043eb41
Add [cz]scal microkernels for SKYLAKEX
...
These are as similar to dscal_microk_skylakex-2.c as possible
for consistency.
Note that before this change SKYLAKEX+ used generic C functions for
cscal/zscal via commit 2271c350 from #2610 (which is masked by
commit 086d87a30). However, #3799 now disables FMAs (in turn enabled
by `-march=skylake-avx512`) in the plain C code, which fixes the excessive
LAPACK test failures more cleanly.
2022-11-09 08:57:03 -05:00
Martin Kroeker
be546ec1ad
Add gcc options for Neoverse cpus
2022-11-09 11:00:41 +01:00
Martin Kroeker
c957ad684e
Bump gcc requirement for NeoverseN2 and V1 to 10.4
2022-11-09 10:46:43 +01:00
Martin Kroeker
1865b15240
Add fallbacks to RaptorLake entry
2022-11-09 10:31:30 +01:00
Martin Kroeker
e6204d254f
Update CMakeLists.txt
2022-11-08 16:21:11 +01:00
Martin Kroeker
2e64722681
Update Makefile.rule
2022-11-08 16:20:17 +01:00
Martin Kroeker
aa2a2d9c01
Conditionally compile files that may get replaced by ReLAPACK
2022-11-08 12:04:46 +01:00
Martin Kroeker
1b77764182
Conditionally leave out bits of LAPACK to be overridden by ReLAPACK
2022-11-08 12:02:59 +01:00
Martin Kroeker
fcda11c1ae
Revert special handling of GEMMT
2022-11-05 23:48:50 +01:00
Martin Kroeker
4743d80c22
Merge pull request #3800 from thrasibule/raptorlake
...
add raptor lake ids
2022-11-05 18:05:48 +01:00
Martin Kroeker
5d02f2e83e
Merge pull request #3806 from martin-frbg/dyn_coop
...
Fix OPENBLAS_CORETYPE=COOPERLAKE not working in DYNAMIC_ARCH builds
2022-11-03 21:37:39 +01:00
Martin Kroeker
da6e426b13
fix Cooperlake not selectable via environment variable
2022-11-03 18:13:35 +01:00
Martin Kroeker
c970717157
fix missing t in xgemmt rule
...
Co-authored-by: Alexis <35051714+amontoison@users.noreply.github.com>
2022-11-01 13:51:20 +01:00
Martin Kroeker
62a44c9c5d
Merge pull request #3804 from martin-frbg/issue3803
...
Remove excess initializer (leftover from rework of PR 3793)
2022-10-31 20:42:33 +01:00
Martin Kroeker
c9d78dc3b2
Remove excess initializer (leftover from rework of PR 3793)
2022-10-31 16:57:03 +01:00
Martin Kroeker
65338a9493
Merge pull request #3799 from bartoldeman/cscal-zscal-no-fma
...
x86_64: prevent GCC and Clang from generating FMAs in cscal/zscal.
2022-10-30 18:56:10 +01:00
Martin Kroeker
ea6c5f3cf5
Add option RELAPACK_REPLACE
2022-10-30 12:55:23 +01:00
Martin Kroeker
d39978cd7f
Fix includes
2022-10-30 12:53:19 +01:00
Martin Kroeker
ce7ea72de1
Fix include paths
2022-10-30 12:50:51 +01:00
Martin Kroeker
3ebf5d219d
handle INCLUDE_ALL and optional function prefixes
2022-10-30 12:49:07 +01:00
Martin Kroeker
a082d54035
Rename to avoid conflict with OpenBLAS' toplevel config.h
2022-10-30 12:47:01 +01:00
Martin Kroeker
eeebaf2294
move INCLUDE_ALL to (c)make options
2022-10-30 12:45:54 +01:00
Martin Kroeker
06b022b139
Fix ReLAPACK source selection
2022-10-30 12:42:36 +01:00
Martin Kroeker
03bd1157d8
Merge pull request #3793 from imzhuhl/new_sbgemm
...
New sbgemm implementation for Neoverse N2
2022-10-30 12:09:46 +01:00