Chip Kerchner
36bd3eeddf
Vectorize BF16 GEMV (VSX & MMA). Use GEMM_GEMV_FORWARD_BF16 (for Power).
2024-10-13 13:46:11 -05:00
Martin Kroeker
a492181665
filter out Loongarch -mabi options for flang-new
2024-10-03 15:58:47 +02:00
Martin Kroeker
a1073f5eed
Merge pull request #4900 from XiWeiGu/la64_core_rename
...
LoongArch64: Rename core
2024-10-01 15:29:16 +02:00
gxw
48698b2b1d
LoongArch64: Rename core
...
Use microarchitecture names instead of meaningless strings to name the cores;
the legacy core names are still retained.
1. Rename LOONGSONGENERIC to LA64_GENERIC
2. Rename LOONGSON3R5 to LA464
3. Rename LOONGSON2K1000 to LA264
2024-09-29 09:35:21 +08:00
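The three renames listed in the commit above can be pictured as a small lookup table. The following is only an illustrative sketch: the helper `la64_canonical_core` and the table `la64_renames` are hypothetical names invented here, and this is not how the build system itself implements the rename. Only the three name pairs come from the commit message.

```c
#include <stddef.h>
#include <string.h>

/* One legacy-name -> microarchitecture-name pair from the commit message. */
struct core_rename {
    const char *legacy;
    const char *current;
};

/* Hypothetical table; the three pairs are taken from the commit text. */
static const struct core_rename la64_renames[] = {
    { "LOONGSONGENERIC", "LA64_GENERIC" },
    { "LOONGSON3R5",     "LA464" },
    { "LOONGSON2K1000",  "LA264" },
};

/* Hypothetical helper: map a legacy core name to its new name.
 * Names that are not in the table are returned unchanged, so both
 * legacy and new names are accepted as input. */
const char *la64_canonical_core(const char *name)
{
    size_t n = sizeof(la64_renames) / sizeof(la64_renames[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, la64_renames[i].legacy) == 0)
            return la64_renames[i].current;
    }
    return name;
}
```

Returning unknown names unchanged mirrors the statement that the legacy names are retained: either spelling resolves to the same core.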
Martin Kroeker
969bb949b1
Strip any mtune option from FFLAGS if the compiler is flang-new
2024-09-19 11:10:28 +02:00
Martin Kroeker
383e0b133e
remove suppression of gcc14's incompatible pointer error
2024-09-11 22:21:09 +02:00
Martin Kroeker
42d8865234
fix typo
2024-08-01 12:24:45 +02:00
Martin Kroeker
fcb88b9d52
enable GEMM/GEMV forwarding for riscv and ppc
2024-07-31 23:21:35 +02:00
Chris Sidebottom
b26424c6a2
Allow opt into GEMM -> GEMV forwarding
2024-07-31 13:09:14 +01:00
Martin Kroeker
a4e56e0452
Merge pull request #4806 from Mousius/small-gemm
...
Small GEMM for AArch64 with SVE
2024-07-25 21:50:04 +02:00
yamazaki-mitsufumi
821ef34635
Add A64FX to the list of CPUs supported by DYNAMIC_ARCH
2024-07-23 20:44:39 +09:00
Mark Ryan
3b715e6162
Add autodetection for riscv64
...
Implement DYNAMIC_ARCH support for riscv64. Three cpu types are
supported, riscv64_generic, riscv64_zvl256b, riscv64_zvl128b.
The two non-generic kernels require CPU support for RVV 1.0 to
function correctly. Detecting that a riscv64 device supports
RVV 1.0 is a little complicated as there are some boards on the
market that advertise support for V via hwcap but only support
RVV 0.7.1, which is not binary compatible with RVV 1.0. The
approach taken is to first try hwprobe. If hwprobe is not
available, we fall back to hwcap + an additional check to distinguish
between RVV 1.0 and RVV 0.7.1.
Tested on a VM with VLEN=256, a CanMV K230 with VLEN=128 (with only
the big core enabled), a Lichee Pi with RVV 0.7.1 and a VF2 with no
vector.
A compiler with RVV 1.0 support must be used to build OpenBLAS for
riscv64 when DYNAMIC_ARCH=1.
Signed-off-by: Mark Ryan <markdryan@rivosinc.com>
2024-07-15 14:24:22 +00:00
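The detection order described in the riscv64 commit above (hwprobe first, then hwcap plus an extra check) can be sketched as a pure decision function. This is a minimal sketch under stated assumptions, not the actual OpenBLAS code: the struct `rv_probe`, its fields, and both functions are hypothetical, the real system queries are replaced by precomputed flags, and choosing between the zvl256b and zvl128b kernels by a VLEN threshold of 256 bits is an assumption based on the kernel names.

```c
/* The three cpu types named in the commit message. */
enum rv_kernel { RV_GENERIC, RV_ZVL128B, RV_ZVL256B };

/* Hypothetical summary of what the system queries returned. */
struct rv_probe {
    int hwprobe_available;     /* could hwprobe be used at all?            */
    int hwprobe_reports_rvv10; /* hwprobe's answer for RVV 1.0             */
    int hwcap_reports_v;       /* V bit advertised via hwcap               */
    int extra_check_is_rvv10;  /* extra check: RVV 1.0 rather than 0.7.1   */
    int vlen_bits;             /* vector register length in bits           */
};

/* Prefer hwprobe; fall back to hwcap plus the additional check, since
 * some boards advertise V via hwcap but only implement RVV 0.7.1. */
int rv_has_rvv10(const struct rv_probe *p)
{
    if (p->hwprobe_available)
        return p->hwprobe_reports_rvv10;
    return p->hwcap_reports_v && p->extra_check_is_rvv10;
}

/* Assumption: VLEN >= 256 selects zvl256b, any smaller VLEN zvl128b. */
enum rv_kernel rv_select_kernel(const struct rv_probe *p)
{
    if (!rv_has_rvv10(p))
        return RV_GENERIC;
    return p->vlen_bits >= 256 ? RV_ZVL256B : RV_ZVL128B;
}
```

With this shape, a device that advertises V through hwcap but fails the extra check ends up on the generic kernel, which is the outcome the commit message is designed to guarantee for RVV 0.7.1 boards.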
gxw
8ab2e9ec65
LoongArch: DGEMM small matrix opt
2024-06-04 16:52:45 +08:00
Martin Kroeker
4376b6f7d2
Restore Loongson LA64ARCH handling
2024-05-07 14:42:01 +02:00
Martin Kroeker
fc10673fd3
Merge branch 'develop' into hugetlb-doc
2024-05-07 13:31:39 +02:00
Martin Kroeker
9c4e10fbd1
sort hugetlb and shm alloc options
2024-05-04 14:48:02 +02:00
Martin Kroeker
7c915e64ca
Silence a GCC14 warning/error in the f2c-converted LAPACK
2024-04-30 17:48:14 +02:00
Martin Kroeker
ae695d4ca0
Merge pull request #4642 from XiWeiGu/loongarch64_clang
...
CI: Add clang test for loongarch64
2024-04-23 18:25:49 +02:00
gxw
7cd438a5ac
loongarch64: Fixed clang compilation issues
2024-04-23 19:19:11 +08:00
Martin Kroeker
0ec0746ae4
Update Makefile.system
2024-04-18 16:11:20 +02:00
Martin Kroeker
d6b0badc05
Fix declarations for EMBEDDED
2024-04-18 16:06:21 +02:00
Martin Kroeker
00ee5d0367
On ARM, do not assume -marm by default if OS_EMBEDDED=1
2024-04-12 15:59:45 +02:00
Chip Kerchner
1c13cda3fc
Remove -openmp flag from XLF (since it doesn't support it).
2024-04-10 15:16:47 -05:00
Martin Kroeker
52b71a1673
Filter out FFLAGS that flang-new from LLVM18 no longer supports (#4569)
...
* Filter out FFLAGS that flang-new from LLVM18 no longer supports
2024-03-22 17:02:39 +01:00
Martin Kroeker
a14176440a
Add version macro for GCC12
2024-03-10 23:22:05 +01:00
Martin Kroeker
56fad407d1
Merge pull request #4527 from ChipKerchner/fixAIXBuildIssues
...
Fix LAPACK unit testing build issues.
2024-03-05 17:55:08 +01:00
Chris Sidebottom
7a6fa699f2
Small GEMM for AArch64
...
This is a fairly conservative addition of small matrix kernels using
SVE.
2024-03-04 15:48:47 +00:00
Martin Kroeker
d1409407a0
Omit redundant prefixes or suffixes in library naming
2024-02-27 21:05:59 +01:00
Chip-Kerchner
3e030cc5fe
Fix LAPACK unit testing build issues. Limit AIX builds to 32 threads (to eliminate failures on some systems).
2024-02-26 12:46:05 -06:00
Martin Kroeker
2e86faa657
Merge branch 'develop' into issue4468
2024-02-23 11:39:49 +01:00
Ayappan Perumal
892f8ff3e5
Shared library support for AIX
2024-02-22 07:05:37 -06:00
Martin Kroeker
ca6b4961e4
updates to fix option conflicts and config file generation
2024-02-15 14:31:11 +01:00
Martin Kroeker
bb96e466ae
Introduce LIBNAMEPREFIX to avoid messing with the internal LIBPREFIX
2024-02-09 15:50:11 +01:00
Martin Kroeker
1ed69ea1c0
improve naming
2024-02-06 23:35:12 +01:00
Martin Kroeker
63fbffddf8
Add option FIXED_LIBNAME to suppress versioning and softlinking
2024-02-05 21:44:03 +01:00
Dirreke
ec89466e14
Add CSKY support
2024-01-16 23:45:06 +08:00
Chris Sidebottom
dc20a78188
Use functionally equivalent dynamic targets
...
Similar to `drivers/other/dynamic.c`, I've looked for functionally
equivalent targets and mapped them in the default DYNAMIC_ARCH build.
Users can still build specific cores using DYNAMIC_LIST.
2023-12-23 12:45:27 +00:00
Martin Kroeker
47b03fd4b4
Copy XCode15-specific workaround to Fortran flags to fix build of tests
2023-11-18 23:45:02 +01:00
Martin Kroeker
9c3c1cfbd6
Merge pull request #4304 from martin-frbg/issue4277
...
Move clang/gfortran OpenMP dependency rewriting out of f_check
2023-11-11 20:58:21 +01:00
Martin Kroeker
1a308a0066
Move OpenMP dependency handling for clang/gfortran combo
2023-11-10 15:27:46 +01:00
Chip Kerchner
206e76187e
Fix FCOMMON_OPT for power. Error out for certain C and Fortran compiler combos in AIX.
2023-11-07 18:08:57 -06:00
Rajalakshmi Srinivasaraghavan
980f702f72
POWER: AIX: Make use of power10 optimization
...
POWER10 optimizations were disabled when using the default AIX assembler.
Since many issues have been fixed recently, enable the optimization path
for the default assembler.
2023-10-19 18:48:19 -05:00
Martin Kroeker
b41cab0875
Need to use override to actually strip down the already defined FFLAGS for NAG and CCE Fortran
2023-10-16 22:20:59 +02:00
Martin Kroeker
103d6f4e42
Require "classic ld" with XCODE 15.x on Mac
2023-10-10 16:15:52 +02:00
Rajalakshmi Srinivasaraghavan
a11e1e10f4
powerpc: Fix build errors with xlf
...
This patch fixes errors when using xlf as the Fortran compiler on Linux.
Tested with gcc/xlf and clang/xlf compiler combinations.
2023-09-29 10:32:34 -05:00
Martin Kroeker
bb47183222
Force -qextname for trailing underscore generation when IBM xlf is used with gcc
2023-09-24 10:13:47 +02:00
Martin Kroeker
09911f077e
Disable SVE targets for DYNAMIC_ARCH when compiling with (homebrew) gcc on macOS/arm64
2023-09-05 16:33:40 +02:00
Ian McInerney
8a8a8479be
Fix cooperlake and sapphire rapids march flags on clang
...
The march=cooperlake and march=sapphirerapids flags were never getting
added when building with Clang targeting those architectures. Instead
it was falling back to the skylake AVX512 implementation.
Clang added support for these two architectures in Clang 9 and Clang 12,
respectively, so introduce new checks for those versions to enable the
appropriate march flag, and fall back to skylake otherwise.
2023-08-14 16:12:35 +01:00
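The version gating described in the commit above can be sketched as a small selection function. This is an illustration only: the real check lives in the makefiles, the function `clang_march_flag` is a hypothetical name, the target strings `COOPERLAKE` and `SAPPHIRERAPIDS` are assumed spellings, and `-march=skylake-avx512` is assumed to be the flag meant by "skylake" in the message. The version thresholds (9 and 12) are the ones the commit states.

```c
#include <string.h>

/* Hypothetical helper: choose the -march flag for an AVX512 target
 * given the major version of the Clang compiler in use. */
const char *clang_march_flag(const char *target, int clang_major)
{
    /* Clang 9 and later know -march=cooperlake. */
    if (strcmp(target, "COOPERLAKE") == 0 && clang_major >= 9)
        return "-march=cooperlake";

    /* Clang 12 and later know -march=sapphirerapids. */
    if (strcmp(target, "SAPPHIRERAPIDS") == 0 && clang_major >= 12)
        return "-march=sapphirerapids";

    /* Older compilers fall back to the skylake AVX512 flag (assumed). */
    return "-march=skylake-avx512";
}
```

Keeping the fallback as the final return means a compiler that is too old for the requested target still gets a flag it understands.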
gxw
d46772e037
LoongArch64: Add compiler feature checks
2023-08-05 10:21:43 +08:00
Martin Kroeker
e8bc8a0ee7
Add support for the new generation flang that comes with LLVM17
2023-08-04 15:32:19 +02:00