374 Commits

Author SHA1 Message Date
Martin Kroeker 42d8865234
fix typo 2024-08-01 12:24:45 +02:00
Martin Kroeker fcb88b9d52
enable GEMM/GEMV forwarding for riscv and ppc 2024-07-31 23:21:35 +02:00
Chris Sidebottom b26424c6a2 Allow opt into GEMM -> GEMV forwarding 2024-07-31 13:09:14 +01:00
Martin Kroeker a4e56e0452
Merge pull request #4806 from Mousius/small-gemm
Small GEMM for AArch64 with SVE
2024-07-25 21:50:04 +02:00
yamazaki-mitsufumi 821ef34635 Add A64FX to the list of CPUs supported by DYNAMIC_ARCH 2024-07-23 20:44:39 +09:00
Mark Ryan 3b715e6162 Add autodetection for riscv64
Implement DYNAMIC_ARCH support for riscv64.  Three cpu types are
supported, riscv64_generic, riscv64_zvl256b, riscv64_zvl128b.
The two non-generic kernels require CPU support for RVV 1.0 to
function correctly.  Detecting that a riscv64 device supports
RVV 1.0 is a little complicated as there are some boards on the
market that advertise support for V via hwcap but only support
RVV 0.7.1, which is not binary compatible with RVV 1.0.  The
approach taken is to first try hwprobe.  If hwprobe is not
available, we fall back to hwcap + an additional check to distinguish
between RVV 1.0 and RVV 0.7.1.
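
The detection order described above can be sketched as a small decision model. This is an illustrative Python sketch, not the code from the commit: the function names, the `hwprobe_v` and `looks_like_rvv_0_7_1` parameters, and the idea of choosing between the two vector kernels by VLEN are assumptions made for the example. The only detail taken as given is that Linux reports single-letter ISA extensions as one hwcap bit per letter.

```python
from typing import Callable, Optional

# Linux riscv hwcap uses one bit per single-letter ISA extension ('A' is bit 0).
HWCAP_V = 1 << (ord("V") - ord("A"))


def has_rvv_1_0(
    hwprobe_v: Optional[bool],
    hwcap: int,
    looks_like_rvv_0_7_1: Callable[[], bool],
) -> bool:
    """Decide whether RVV 1.0 kernels may be used.

    hwprobe_v: result of the hwprobe query, or None if hwprobe is unavailable
               (hypothetical parameter standing in for the real syscall).
    hwcap: raw hwcap bitmask.
    looks_like_rvv_0_7_1: hypothetical callback for the additional check that
               separates RVV 0.7.1 hardware from RVV 1.0 hardware.
    """
    if hwprobe_v is not None:
        # Preferred path: trust hwprobe when it is available.
        return hwprobe_v
    if not hwcap & HWCAP_V:
        # Fallback path: no V bit at all means no vector unit.
        return False
    # V is advertised, but it may be the incompatible 0.7.1 draft.
    return not looks_like_rvv_0_7_1()


def select_core(rvv_1_0: bool, vlen_bits: int) -> str:
    """Map the detection result to one of the three cpu types in the commit.

    Selecting by VLEN is an assumption of this sketch.
    """
    if rvv_1_0 and vlen_bits >= 256:
        return "riscv64_zvl256b"
    if rvv_1_0 and vlen_bits >= 128:
        return "riscv64_zvl128b"
    return "riscv64_generic"
```

With hypothetical inputs, a board that sets the V bit in hwcap but fails the extra check ends up on `riscv64_generic`, which is the behavior the commit message aims for on RVV 0.7.1 hardware.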

Tested on a VM with VLEN=256, a CanMV K230 with VLEN=128 (with only
the big core enabled), a Lichee Pi with RVV 0.7.1 and a VF2 with no
vector.

A compiler with RVV 1.0 support must be used to build OpenBLAS for
riscv64 when DYNAMIC_ARCH=1.

Signed-off-by: Mark Ryan <markdryan@rivosinc.com>
2024-07-15 14:24:22 +00:00
gxw 8ab2e9ec65 LoongArch: DGEMM small matrix opt 2024-06-04 16:52:45 +08:00
Martin Kroeker 4376b6f7d2
Restore Loongson LA64ARCH handling 2024-05-07 14:42:01 +02:00
Martin Kroeker fc10673fd3
Merge branch 'develop' into hugetlb-doc 2024-05-07 13:31:39 +02:00
Martin Kroeker 9c4e10fbd1
sort hugetlb and shm alloc options 2024-05-04 14:48:02 +02:00
Martin Kroeker 7c915e64ca
Silence a GCC14 warning/error in the f2c-converted LAPACK 2024-04-30 17:48:14 +02:00
Martin Kroeker ae695d4ca0
Merge pull request #4642 from XiWeiGu/loongarch64_clang
CI: Add clang test for loongarch64
2024-04-23 18:25:49 +02:00
gxw 7cd438a5ac loongarch64: Fixed clang compilation issues 2024-04-23 19:19:11 +08:00
Martin Kroeker 0ec0746ae4
Update Makefile.system 2024-04-18 16:11:20 +02:00
Martin Kroeker d6b0badc05
Fix declarations for EMBEDDED 2024-04-18 16:06:21 +02:00
Martin Kroeker 00ee5d0367
On ARM, do not assume -marm by default if OS_EMBEDDED=1 2024-04-12 15:59:45 +02:00
Chip Kerchner 1c13cda3fc Remove -openmp flag from XLF (since it doesn't support it). 2024-04-10 15:16:47 -05:00
Martin Kroeker 52b71a1673
Filter out FFLAGS that flang-new from LLVM18 no longer supports (#4569)
2024-03-22 17:02:39 +01:00
Martin Kroeker a14176440a
Add version macro for GCC12 2024-03-10 23:22:05 +01:00
Martin Kroeker 56fad407d1
Merge pull request #4527 from ChipKerchner/fixAIXBuildIssues
Fix LAPACK unit testing build issues.
2024-03-05 17:55:08 +01:00
Chris Sidebottom 7a6fa699f2 Small GEMM for AArch64
This is a fairly conservative addition of small matrix kernels using
SVE.
2024-03-04 15:48:47 +00:00
Martin Kroeker d1409407a0
Omit redundant prefixes or suffixes in library naming 2024-02-27 21:05:59 +01:00
Chip-Kerchner 3e030cc5fe Fix LAPACK unit testing build issues. Limit AIX builds to 32 threads (to eliminate failures on some systems). 2024-02-26 12:46:05 -06:00
Martin Kroeker 2e86faa657
Merge branch 'develop' into issue4468 2024-02-23 11:39:49 +01:00
Ayappan Perumal 892f8ff3e5 Shared library support for AIX 2024-02-22 07:05:37 -06:00
Martin Kroeker ca6b4961e4
updates to fix option conflicts and config file generation 2024-02-15 14:31:11 +01:00
Martin Kroeker bb96e466ae
Introduce LIBNAMEPREFIX to avoid messing with the internal LIBPREFIX 2024-02-09 15:50:11 +01:00
Martin Kroeker 1ed69ea1c0
improve naming 2024-02-06 23:35:12 +01:00
Martin Kroeker 63fbffddf8
Add option FIXED_LIBNAME to suppress versioning and softlinking 2024-02-05 21:44:03 +01:00
Dirreke ec89466e14 Add CSKY support 2024-01-16 23:45:06 +08:00
Chris Sidebottom dc20a78188 Use functionally equivalent dynamic targets
Similar to `drivers/other/dynamic.c`, I've looked for functionally
equivalent targets and mapped them in the default DYNAMIC_ARCH build.
Users can still build specific cores using DYNAMIC_LIST.
2023-12-23 12:45:27 +00:00
Martin Kroeker 47b03fd4b4
Copy XCode15-specific workaround to Fortran flags to fix build of tests 2023-11-18 23:45:02 +01:00
Martin Kroeker 9c3c1cfbd6
Merge pull request #4304 from martin-frbg/issue4277
Move clang/gfortran OpenMP dependency rewriting out of f_check
2023-11-11 20:58:21 +01:00
Martin Kroeker 1a308a0066
Move OpenMP dependency handling for clang/gfortran combo 2023-11-10 15:27:46 +01:00
Chip Kerchner 206e76187e Fix FCOMMON_OPT for power. Error out for certain C and Fortran compiler combos in AIX. 2023-11-07 18:08:57 -06:00
Rajalakshmi Srinivasaraghavan 980f702f72 POWER: AIX: Make use of power10 optimization
POWER10 optimizations are disabled when using the default AIX assembler.
Since many issues have been fixed recently, enable the optimization path
for the default assembler.
2023-10-19 18:48:19 -05:00
Martin Kroeker b41cab0875
Need to use override to actually strip down the already defined FFLAGS for NAG and CCE Fortran 2023-10-16 22:20:59 +02:00
Martin Kroeker 103d6f4e42
Require "classic ld" with XCODE 15.x on Mac 2023-10-10 16:15:52 +02:00
Rajalakshmi Srinivasaraghavan a11e1e10f4 powerpc: Fix build errors with xlf
This patch fixes errors when using xlf as fortran compiler on Linux.
Tested with gcc/xlf and clang/xlf compiler combinations.
2023-09-29 10:32:34 -05:00
Martin Kroeker bb47183222
Force -qextname for trailing underscore generation when IBM xlf is used with gcc 2023-09-24 10:13:47 +02:00
Martin Kroeker 09911f077e
Disable SVE targets for DYNAMIC_ARCH when compiling with (homebrew)gcc on macOS/arm64 2023-09-05 16:33:40 +02:00
Ian McInerney 8a8a8479be Fix cooperlake and sapphire rapids march flags on clang
The march=cooperlake and march=sapphirerapids flags were never getting
added when building with Clang targeting those architectures. Instead
it was falling back to the skylake AVX512 implementation.

Clang added support for these two architectures in Clang 9 and Clang 12
respectively, so introduce new checks for those versions to enable the
appropriate march flag, and fall back to skylake otherwise.
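
The version gate described above can be illustrated with a short Python sketch. It is a model of the rule only, not the build-system change itself: the function name `march_flag`, the target spellings `COOPERLAKE` and `SAPPHIRERAPIDS`, and the exact fallback spelling `-march=skylake-avx512` are assumptions made for the example. The minimum Clang versions (9 and 12) come from the commit message.

```python
# Minimum Clang major version that accepts each -march value, per the
# commit message. The dictionary keys are assumed target names.
MIN_CLANG_FOR_MARCH = {
    "COOPERLAKE": ("cooperlake", 9),
    "SAPPHIRERAPIDS": ("sapphirerapids", 12),
}

# Assumed spelling of the skylake AVX512 fallback flag.
FALLBACK_FLAG = "-march=skylake-avx512"


def march_flag(target: str, clang_major: int) -> str:
    """Return the -march flag to pass to Clang for the given target."""
    entry = MIN_CLANG_FOR_MARCH.get(target)
    if entry is not None:
        march, min_major = entry
        if clang_major >= min_major:
            # The compiler is new enough to know this architecture.
            return f"-march={march}"
    # Older compiler or other target: use the skylake AVX512 flag.
    return FALLBACK_FLAG


if __name__ == "__main__":
    for target, version in [("COOPERLAKE", 8), ("COOPERLAKE", 9),
                            ("SAPPHIRERAPIDS", 11), ("SAPPHIRERAPIDS", 12)]:
        print(target, version, march_flag(target, version))
```

Keeping the version thresholds in one table makes it easy to add a later architecture without touching the selection logic.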
2023-08-14 16:12:35 +01:00
gxw d46772e037 LoongArch64: Add compiler feature checks 2023-08-05 10:21:43 +08:00
Martin Kroeker e8bc8a0ee7
Add support for the new generation flang that comes with LLVM17 2023-08-04 15:32:19 +02:00
Chris Sidebottom f971ef55f2 Add ARMV8SVE to AArch64 Dynamic Dispatch
In order to enable support for future cores which have similar tunings
(in this case I'm doing this for the Arm(R) Neoverse(TM) V2 core), this generically detects SVE support and enables it. This should better manage the size and complexity of dynamic dispatch rather than just copy-pasting the same parameters.

To make `ARMV8SVE` more representative of the common 128-bit SVE case,
I've split it and similar parameters from A64FX which has the wider
512-bit SVE.
2023-07-25 18:35:15 +01:00
Martin Kroeker c3a2d407a0
Merge pull request #4048 from imzhuhl/spr_sbgemm_fix
Sapphire Rapids sbgemm fix
2023-06-17 20:47:09 +02:00
gxw 67d1e72e8b LoongArch64: Add ABI detection for loongarch64
If lp64d ABI is supported, it is used; otherwise,
it falls back to the lp64 ABI.
2023-06-08 20:25:35 +08:00
Honglin Zhu 0b83088887 spr dynamic arch support 2023-05-19 10:48:18 +08:00
Martin Kroeker ebe50458f3
Do not add a -tp to the flags of the nvc compiler if there is one already in CFLAGS 2023-02-09 09:29:27 +01:00
Martin Kroeker 3e64fa72c4
Settings from Makefile(_kernel).conf should be available to DYNAMIC_ARCH kernel builds 2022-12-29 23:05:22 +01:00