Commit Graph

8084 Commits

Author SHA1 Message Date
yancheng
e3fb2b5afa loongarch64: Add optimizations for imin. 2023-12-07 14:36:07 +08:00
yancheng
e46b48e372 loongarch64: Add optimizations for imax. 2023-12-07 14:36:07 +08:00
yancheng
702fc1d56d loongarch64: Add optimization for min. 2023-12-07 14:36:07 +08:00
yancheng
346b384d1c loongarch64: Add optimization for max. 2023-12-07 14:36:07 +08:00
yancheng
ff2ecc6cda loongarch64: Add optimization for amin. 2023-12-07 14:36:07 +08:00
yancheng
265b5f2e80 loongarch64: Add optimizations for amax. 2023-12-07 14:36:07 +08:00
yancheng
993ede7c70 loongarch64: Add optimizations for scal. 2023-12-07 14:36:07 +08:00
Mark Seminatore
4ebf814b42 fix bug failing to mark task as finished. 2023-12-05 23:28:37 -08:00
Mark Seminatore
5f51811728 try at new threading model 2023-12-05 22:43:36 -08:00
Martin Kroeker
a8cb611157 Merge pull request #4358 from martin-frbg/lapack954
Fix keyword used to count successful tests (Reference-LAPACK PR 954)
2023-12-05 22:20:15 +01:00
Martin Kroeker
589f2b6466 Fix search phrase used to count successful tests (Reference-LAPACK PR 954) 2023-12-05 20:10:20 +01:00
Martin Kroeker
6aa5f53e26 Merge pull request #4357 from martin-frbg/lapack953
Fix memory leak in LAPACK testing framework (Reference-LAPACK PR 953)
2023-12-05 20:03:21 +01:00
Martin Kroeker
effb7af2a2 Fix memory leak (Reference-LAPACK PR 953) 2023-12-05 17:55:38 +01:00
Martin Kroeker
5915a69734 Merge pull request #4356 from martin-frbg/lapack736-2
Add LAPACK tests for the Dynamic Mode Decomposition functions from Reference-LAPACK PR 736
2023-12-05 17:48:42 +01:00
Martin Kroeker
226a14c549 Restore library path adjustments 2023-12-05 15:50:06 +01:00
Martin Kroeker
c5fa318add Add tests for DMD (Reference-LAPACK PR 736) 2023-12-05 15:45:59 +01:00
Martin Kroeker
fa03e5497a Add tests for the DMD functions (Reference-LAPACK PR 736) 2023-12-05 15:43:28 +01:00
Martin Kroeker
a53a79e059 Add tests for the DMD functions (Reference-LAPACK PR 736) 2023-12-05 15:41:39 +01:00
Martin Kroeker
e3039fa7f6 Merge pull request #4351 from catap/cmake-old-macos
Use 64bit build on `CMAKE_SYSTEM_PROCESSOR=i386` on Darwin
2023-12-05 14:40:18 +01:00
Octavian Maghiar
4a12cf53ec [RISC-V] Improve RVV kernel generator LMUL usage
The RVV kernel generation script uses the provided LMUL to increase the number of accumulator registers.
Since the effect of the LMUL is to group together the vector registers into larger ones, it actually should be used as a multiplier in the calculation of vlenmax.
At the moment, no matter what LMUL is provided, the generated kernels would only set the maximum number of vector elements equal to VLEN/SEW.
This commit changes the use of LMUL to properly adjust vlenmax. Note that an increase in LMUL results in a decrease in the number of effective vector registers.
2023-12-04 11:13:35 +00:00
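
The relationship described in the commit message above can be sketched in a few lines. This is a minimal illustration, not the actual code of the kernel generator: the function names `vlenmax` and `effective_registers` are hypothetical, only integer LMUL values are considered, and the count of 32 architectural vector registers is taken from the RISC-V Vector specification.

```python
def vlenmax(vlen_bits: int, sew_bits: int, lmul: int = 1) -> int:
    """Maximum number of elements in one vector register group.

    vlen_bits: hardware vector register width in bits (VLEN)
    sew_bits:  selected element width in bits (SEW)
    lmul:      register grouping factor (integer LMUL: 1, 2, 4 or 8)
    """
    if lmul not in (1, 2, 4, 8):
        raise ValueError("this sketch only handles integer LMUL values")
    if vlen_bits % sew_bits != 0:
        raise ValueError("VLEN must be a multiple of SEW")
    # LMUL groups registers together, so it multiplies the element count.
    return (vlen_bits // sew_bits) * lmul


def effective_registers(lmul: int, total_registers: int = 32) -> int:
    """Number of usable register groups once registers are grouped by LMUL."""
    return total_registers // lmul


if __name__ == "__main__":
    for lmul in (1, 2, 4, 8):
        print(lmul, vlenmax(128, 64, lmul), effective_registers(lmul))
```

With VLEN=128 and SEW=64, raising LMUL from 1 to 4 raises the element count per group from 2 to 8 while the number of available register groups drops from 32 to 8, which is the trade-off the commit message points out.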
Octavian Maghiar
e4586e81b8 [RISC-V] Add RISC-V Vector 128-bit target
Current RVV x280 target depends on vlen=512-bits for Level 3 operations.
This commit adds a generic target that supports vlen=128-bits.
New target uses the same scalable kernels as x280 for Level 1&2 operations, and autogenerated kernels for Level 3 operations.
Functional correctness of Level 3 operations tested on vlen=128-bits using QEMU v8.1.1 for ctests and BLAS-Tester.
2023-12-04 11:02:18 +00:00
Erik Bråthen Solem
2381132ada Darwin < 20: always write xerbla.c.o into archive
Write xerbla.c.o into archive regardless of timestamp by using ar -rs
instead of ar -ru.
2023-12-03 19:13:56 +01:00
Erik Bråthen Solem
89fa51d495 Revert 42b5e08 ("Allow weak linking on old macOS") 2023-12-03 19:06:49 +01:00
Kirill A. Korinsky
08fde5ebd2 Use 64bit build on CMAKE_SYSTEM_PROCESSOR=i386 on Darwin
This is a bit tricky.

The value of `CMAKE_SYSTEM_PROCESSOR` comes from the output of `uname -m`, which
might report a 32bit machine even when building a 64bit application.

So, in that case, use `CMAKE_SIZEOF_VOID_P` to detect the target.

See https://trac.macports.org/ticket/68488
2023-11-30 21:24:58 +00:00
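
The distinction drawn in the commit message above (a reported machine name versus the actual pointer width of the build) can be illustrated outside CMake. The following is only an analogy in Python, not the change made in the commit: `platform.machine()` returns a machine-type string comparable to `uname -m`, while `struct.calcsize("P")` gives the pointer size of the running interpreter and plays the role of `CMAKE_SIZEOF_VOID_P`. The helper names are hypothetical.

```python
import platform
import struct


def pointer_size_bytes() -> int:
    """Size of a C pointer (void *) for the running interpreter, in bytes."""
    return struct.calcsize("P")


def is_64bit_target(sizeof_void_p: int) -> bool:
    """Decide the target width from the pointer size, not the machine name."""
    return sizeof_void_p == 8


if __name__ == "__main__":
    # The machine string may not reflect the width of the binaries being built.
    print("machine name:", platform.machine())
    print("pointer size:", pointer_size_bytes())
    print("64bit target:", is_64bit_target(pointer_size_bytes()))
```

Deciding on the pointer size rather than the machine string avoids the mismatch the commit describes, where the name says i386 but the build is 64bit.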
Martin Kroeker
39bf8ece20 Merge pull request #4340 from yinshiyou/la-dev
Add some refines and optimizations for LoongArch.
2023-11-29 08:22:25 +01:00
Martin Kroeker
42b5e081d8 Merge pull request #4348 from catap/macos-undefinded-dynamic-lookup
Allow weak linking on old macOS
2023-11-28 22:14:53 +01:00
Kirill A. Korinsky
a1562e4bae Allow weak linking on old macOS 2023-11-28 14:04:01 +00:00
Martin Kroeker
c4a622db9e Merge pull request #4346 from martin-frbg/issue4343
Fix CMAKE installation location of lapacke_mangling header
2023-11-28 14:01:14 +01:00
Shiyou Yin
9fe07d82fd loongarch: Add LSX optimization for dot. 2023-11-28 20:24:18 +08:00
Shiyou Yin
13b8c44b44 loongarch: Add optimization for dsdot kernel. 2023-11-28 20:24:16 +08:00
Shiyou Yin
3def6a8143 loongarch: Add LASX optimization for dot. 2023-11-28 20:24:14 +08:00
Shiyou Yin
1310a0931b loongarch: Refine build control for loongarch64.
1. Use getauxval instead of cpucfg to test hardware capability.
2. Remove unnecessary code and option for compiler check in c_check.
2023-11-28 20:23:55 +08:00
Martin Kroeker
ff92e6e707 Fix installation location of lapacke_mangling header 2023-11-28 12:53:35 +01:00
Martin Kroeker
b7a28f5e42 Merge pull request #4344 from catap/macos-always-use-ar
Enable overstep of too long args without DYNAMIC_ARCH
2023-11-28 12:39:45 +01:00
Kirill A. Korinsky
9beee55167 Enable overstep of too long args without DYNAMIC_ARCH 2023-11-27 23:41:56 +00:00
Kirill A. Korinsky
01c7010543 cmake/openblas.pc.in: fixed version and URL 2023-11-27 14:51:58 +00:00
Martin Kroeker
fc66ecd25a Merge pull request #4339 from martin-frbg/lapack-3-12-0
Update version number and documentation of Reference-LAPACK to 3.12.0
2023-11-25 23:54:05 +01:00
Martin Kroeker
08be9004f8 Update version number and copyright date to Reference-LAPACK 3.12.0 2023-11-25 18:57:17 +01:00
Martin Kroeker
578f0f9590 Update version number to 3.12.0 2023-11-25 18:53:16 +01:00
Martin Kroeker
3d9e20f614 Update version to 3.12.0 2023-11-25 18:51:54 +01:00
Martin Kroeker
f7351e493c Update Reference-LAPACK docs to 3.12.0 2023-11-25 18:49:34 +01:00
Martin Kroeker
be8661ba40 Merge pull request #4338 from martin-frbg/lapack941
Docu fix for Truncated QR With Pivoting (Reference-LAPACK PR 941)
2023-11-25 18:41:25 +01:00
Martin Kroeker
ca5a87ff1d Small documentation fix for Truncated QR With Pivoting (Reference-LAPACK PR 941) 2023-11-25 15:31:18 +01:00
Shiyou Yin
f745f02f35 benchmark: Fix missing colons in outputs of ./strsv.goto 2023-11-24 14:55:18 +08:00
Martin Kroeker
97d3c9b827 Merge pull request #4336 from martin-frbg/fix4322
Revert unintentional change to gmake linking rule in LAPACK TESTING/LIN
2023-11-22 22:44:21 +01:00
Martin Kroeker
c883abf838 Revert unintentional change to linking rule from PR 4322 2023-11-22 22:41:53 +01:00
Martin Kroeker
8138999cd0 Merge pull request #4333 from codeworm96/update_dynamic_core_readme
Update the list of default dynamic targets for x86_64 in the README to be consistent with the Makefile
2023-11-21 13:50:37 +01:00
Martin Kroeker
a938e48fa2 Merge pull request #4334 from RajalakshmiSR/Makefile_power
POWER: Fixing Makefile error
2023-11-21 10:24:25 +01:00
Rajalakshmi Srinivasaraghavan
47da601a2d POWER: Fixing Makefile error
Recent commit d99aad8ee3 added an
extra `)`. This patch fixes the resulting warning from the Makefile.
2023-11-20 17:24:22 -06:00
Yuning Zhang
54be8f4d67 Update the list of default dynamic targets for x86_64 in the README to be consistent with the Makefile
Signed-off-by: Yuning Zhang <codeworm96@outlook.com>
2023-11-20 13:28:25 -08:00