OpenBLAS/kernel
Mark Ryan 3b715e6162 Add autodetection for riscv64
Implement DYNAMIC_ARCH support for riscv64.  Three CPU types are
supported: riscv64_generic, riscv64_zvl256b and riscv64_zvl128b.
The two non-generic kernels require CPU support for RVV 1.0 to
function correctly.  Detecting that a riscv64 device supports
RVV 1.0 is a little complicated as there are some boards on the
market that advertise support for V via hwcap but only support
RVV 0.7.1, which is not binary compatible with RVV 1.0.  The
approach taken is to first try hwprobe.  If hwprobe is not
available, we fall back to hwcap plus an additional check to
distinguish between RVV 1.0 and RVV 0.7.1.
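
The decision order described above can be sketched as pure logic,
separated from the system calls that gather the inputs.  This is a
minimal illustration and not the code added by this commit: the struct,
its field names, and both function names are hypothetical, and the
mapping from VLEN to a kernel name is an assumption based on the three
CPU types listed above.  Only the two bit masks are taken from the
Linux kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Bit masks as defined by the Linux kernel headers. */
#define HWPROBE_IMA_V (1UL << 2)            /* RISCV_HWPROBE_IMA_V */
#define HWCAP_ISA_V   (1UL << ('V' - 'A'))  /* COMPAT_HWCAP_ISA_V  */

/* Hypothetical container for values gathered elsewhere at startup. */
struct probe_inputs {
    bool hwprobe_available;          /* did the hwprobe syscall succeed?       */
    unsigned long hwprobe_ima_ext0;  /* value for RISCV_HWPROBE_KEY_IMA_EXT_0  */
    unsigned long hwcap;             /* value of getauxval(AT_HWCAP)           */
    bool looks_like_rvv10;           /* result of the extra 1.0-vs-0.7.1 check */
    unsigned vlen_bits;              /* vector register length in bits         */
};

/* Prefer hwprobe; otherwise require hwcap V plus the additional check. */
static bool has_rvv10(const struct probe_inputs *in)
{
    if (in->hwprobe_available)
        return (in->hwprobe_ima_ext0 & HWPROBE_IMA_V) != 0;
    return (in->hwcap & HWCAP_ISA_V) != 0 && in->looks_like_rvv10;
}

/* Assumed mapping: widest supported kernel whose VLEN requirement is met. */
static const char *select_core(const struct probe_inputs *in)
{
    assert(in != NULL);
    if (!has_rvv10(in))
        return "riscv64_generic";
    if (in->vlen_bits >= 256)
        return "riscv64_zvl256b";
    if (in->vlen_bits >= 128)
        return "riscv64_zvl128b";
    return "riscv64_generic";
}
```

With these assumptions, a device that sets V in hwcap but fails the
additional check (the RVV 0.7.1 case) is given riscv64_generic, as is a
device with no vector support at all.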

Tested on a VM with VLEN=256, a CanMV K230 with VLEN=128 (with only
the big core enabled), a Lichee Pi with RVV 0.7.1 and a VF2 with no
vector.

A compiler with RVV 1.0 support must be used to build OpenBLAS for
riscv64 when DYNAMIC_ARCH=1.

Signed-off-by: Mark Ryan <markdryan@rivosinc.com>
2024-07-15 14:24:22 +00:00
alpha alpha: Remove include of version.h 2022-08-11 15:02:58 +01:00
arm Update scal.c 2024-07-04 22:28:01 +02:00
arm64 Merge pull request #4702 from bashimao/detect-nv-grace 2024-06-30 22:48:48 +02:00
csky Add CSKY support 2024-01-16 23:45:06 +08:00
e2k Add default KERNEL file for Elbrus E2K arch 2022-01-22 18:59:36 +01:00
generic loongarch64: Update dgemm_kernel_16x4 to dgemm_kernel_16x6 2024-05-08 10:10:26 +08:00
ia64 Add ia64 implementation of ?sum 2019-03-30 22:18:03 +01:00
loongarch64 LoongArch: Fixed issue 4728 2024-06-06 16:43:09 +08:00
mips Update sscal_msa.c 2024-06-23 12:55:19 +02:00
mips64 exclude the alpha=0 branch as it does not handle NaN or Inf in x 2024-06-23 00:54:39 +02:00
power Vectorize SBGEMM incopy - 4x faster. 2024-07-09 13:10:03 -05:00
riscv64 Add autodetection for riscv64 2024-07-15 14:24:22 +00:00
simd fix the CI failure of lack the head 2020-11-12 17:35:17 +08:00
sparc temporarily(?) disable the alpha=0 branch as it fails to handle INF,NAN 2024-06-27 22:18:27 +02:00
x86 temporarily(?) disable da=0 shortcut to handle x=Inf or NAN 2024-06-23 17:48:18 +02:00
x86_64 temporarily(?) disable the alpha=0 branch to handle Inf/NaN in x 2024-06-22 21:08:57 +02:00
zarch Fix erroneous mapping of SUM kernels to ASUM 2024-02-27 11:28:50 +01:00
CMakeLists.txt Adding USE_GEMM3M macro to kernel targets, so that the *gemm3m functions and parameters can be included into the gotoblas structure. Fixes #4500 2024-02-12 02:29:58 +01:00
Makefile powerpc: Fix build errors with Open XL C 2023-10-04 14:04:03 -05:00
Makefile.L1 Conditionally add -mfma to compiler options where needed 2020-12-17 11:34:05 +01:00
Makefile.L2 make SSYMV available to BUILD_DOUBLE-only builds 2023-02-22 00:30:20 +01:00
Makefile.L3 Disable GEMM3M for generic targets (not implemented) 2024-06-06 14:39:50 +02:00
Makefile.LA Support NO_LAPACK=1 to build the lib without LAPACK functions. 2011-03-04 11:51:32 +08:00
setparam-ref.c Add autodetection for riscv64 2024-07-15 14:24:22 +00:00