432 Commits

Author SHA1 Message Date
Chip Kerchner
36bd3eeddf Vectorize BF16 GEMV (VSX & MMA). Use GEMM_GEMV_FORWARD_BF16 (for Power). 2024-10-13 13:46:11 -05:00
Martin Kroeker
b0346e72f4 update names of loongarch64 targets for cross-compilation 2024-10-06 22:48:33 +02:00
Martin Kroeker
9c707dc6b9 Update dynamic arch list to new target scheme 2024-10-06 22:46:03 +02:00
Martin Kroeker
b4495a8fb8 Merge branch 'develop' into arm64_cmake_small_matrix_opt 2024-10-03 20:04:52 +02:00
Martin Kroeker
4f00f02567 Do not add -mabi flags for Loongson when the compiler is flang 2024-10-03 16:06:33 +02:00
Martin Kroeker
de421b7764 Merge pull request #4904 from XiWeiGu/la64_cross_cmake
LoongArch64: Enable cmake cross-compilation
2024-10-03 15:53:57 +02:00
Martin Kroeker
0228d36211 move -fopenmp to CFLAGS 2024-09-30 21:38:05 +02:00
gxw
7087b0a7d0 ARM64: Enable SMALL_MATRIX_OPT when compiling with CMake 2024-09-29 10:31:26 +08:00
gxw
30af9278dc LoongArch64: Enable cmake cross-compilation 2024-09-29 10:13:30 +08:00
psykose
1265eee85c fix cmake typo for power10 cc version check
fixes 668f48f4fc
2024-08-09 20:38:58 +02:00
Martin Kroeker
cc36db643e Support new LAPACK build option LAPACK_STRLEN 2024-08-06 17:31:03 +02:00
Martin Kroeker
e8bd97ab4b add RISCV64 entries for DYNAMIC_ARCH 2024-08-03 23:56:59 +02:00
Martin Kroeker
9eecd0d33b enable GEMM/GEMV forwarding for riscv and ppc 2024-07-31 23:29:12 +02:00
Chris Sidebottom
b26424c6a2 Allow opt into GEMM -> GEMV forwarding 2024-07-31 13:09:14 +01:00
yamazaki-mitsufumi
821ef34635 Add A64FX to the list of CPUs supported by DYNAMIC_ARCH 2024-07-23 20:44:39 +09:00
Jaap Aarts
cea4abcac0 Fix compiling on mingw 2024-07-04 14:56:16 +02:00
Jaap Aarts
9d0abe2d26 Add support for RISCV64_GENERIC in cmake 2024-07-03 01:49:37 +02:00
Martin Kroeker
d25ee4d0f5 Fix detection of Intel ifx and apply -fp-model option to it 2024-06-14 23:58:45 +02:00
Martin Kroeker
21c0f769ef ensure that cpu-specific -march options are always applied to icx 2024-06-14 23:54:27 +02:00
Alexander Neumann
dd4505c5dd Fix CMake warning 2024-05-30 09:04:23 +02:00
Martin Kroeker
8b4996a2d5 Override icx's default fast math mode to ensure correct NaN handling 2024-05-26 13:16:03 +02:00
Martin Kroeker
6494f432df Fix INTERFACE64 builds on Loongarch64 2024-05-18 16:49:03 +02:00
Martin Kroeker
a3f6b13bc9 remove spurious brace 2024-05-16 09:25:53 +02:00
Martin Kroeker
668f48f4fc Use CMAKE_C_COMPILER_VERSION instead of dumpversion calls (#4698)
* Use CMAKE_C_COMPILER_VERSION throughout
2024-05-15 23:58:14 +02:00
Martin Kroeker
f5c080f083 Fix CMAKE syntax in kernel file parsing of IFNEQ conditionals (#4695)
* Fix syntax in parsing of IFNEQ
2024-05-15 20:58:31 +02:00
Martin Kroeker
3d26837a35 Suppress GCC14 error exit in the f2c-converted LAPACK 2024-04-30 19:05:18 +02:00
Martin Kroeker
69aa93e34f Fix Loongson compiler flag check 2024-04-23 21:57:42 +02:00
Martin Kroeker
015042f7b5 Fix Loongson compiler flag test 2024-04-23 21:55:57 +02:00
مهدي شينون (Mehdi Chinoune)
cda55f2fd2 Don't pass -exhaustive-register-search directly to clang compiler
`-exhaustive-register-search` is an LLVM code generation flag that shouldn't be passed directly to clang compiler.
2024-04-06 05:54:48 +01:00
Martin Kroeker
3af736fb9d Add support for Cortex-A76 2024-04-02 19:42:23 +02:00
Martin Kroeker
52b71a1673 Filter out FFLAGS that flang-new from LLVM18 no longer supports (#4569)
* Filter out FFLAGS that flang-new from LLVM18 no longer supports
2024-03-22 17:02:39 +01:00
Martin Kroeker
2e86faa657 Merge branch 'develop' into issue4468 2024-02-23 11:39:49 +01:00
Martin Kroeker
8fc2c2db04 Fix missing support for INTERFACE64 on ARM64 and MIPS64 2024-02-22 22:14:13 +01:00
Martin Kroeker
82b81c0bbe Don't fail if there is no Fortran compiler 2024-02-22 22:11:50 +01:00
Martin Kroeker
a0e3f77e0b add FIXED_LIBNAME, PREFIX and SUFFIX 2024-02-15 12:17:38 +01:00
Martin Kroeker
ffbfc3c692 Add libname prefix and suffix 2024-02-15 12:16:34 +01:00
Martin Kroeker
0c43c6fa99 Merge pull request #4341 from catap/openblas.pc.in
cmake/openblas.pc.in: fixed version and URL
2023-12-31 13:25:06 +01:00
Martin Kroeker
e9c32ed165 Merge pull request #4384 from yetist/develop
Fix: build failed on LoongArch
2023-12-27 14:05:01 +01:00
Martin Kroeker
1106460bb3 remove redundant targets from the default ARM64 DYNAMIC_ARCH list 2023-12-25 12:29:56 +01:00
Wu Xiaotian
0baf462dbc Fix: build failed on LoongArch
According to the documentation at https://github.com/loongson/la-abi-specs/blob/release/lapcs.adoc#the-base-abi-variants, valid -mabi parameters are lp64s, lp64f, lp64d, ilp32s, ilp32f and ilp32d.
2023-12-25 16:04:43 +08:00
barracuda156
8c143331b0 PPC970: drop -mcpu=970 which seems to produce faulty code
Fixes: https://github.com/OpenMathLib/OpenBLAS/issues/4376
2023-12-15 22:56:06 +08:00
barracuda156
981e315b30 cc.cmake: use -force_cpusubtype_ALL for Darwin PPC 2023-12-14 12:01:31 +08:00
barracuda156
a8d3619f65 cc.cmake: add optflags for G5 and G4 kernels 2023-12-13 19:42:56 +08:00
barracuda156
c732f275a2 system_check.cmake: fix arch detection for Darwin PowerPC 2023-12-11 21:05:31 +08:00
Kirill A. Korinsky
08fde5ebd2 Use 64bit build on CMAKE_SYSTEM_PROCESSOR=i386 on Darwin
This case is a bit tricky.

The value of `CMAKE_SYSTEM_PROCESSOR` comes from the output of `uname -m`, which
might report a 32-bit processor even when building a 64-bit application.

So, in that case, use `CMAKE_SIZEOF_VOID_P` to detect the target.

See https://trac.macports.org/ticket/68488
2023-11-30 21:24:58 +00:00
Kirill A. Korinsky
01c7010543 cmake/openblas.pc.in: fixed version and URL 2023-11-27 14:51:58 +00:00
Martin Kroeker
5bf87c86f5 Implement truncated QR with pivoting (Reference-LAPACK PR 891) 2023-11-15 12:10:20 +01:00
Martin Kroeker
58427ff74d Deprecate ?GELQS and ?GEQRS from TESTING/LIN (Reference-LAPACK PR 900) (#4307)
* Move ?GELQS and ?GEQRS from TESTING/LIN to DEPRECATED (Reference-LAPACK PR 900)

* Add f2c-converted versions of ?GELQS and ?GEQRS
2023-11-12 10:54:39 +01:00
Martin Kroeker
49689fbef7 Add support for compiling SVE kernels with the NVIDIA HPC compiler 2023-08-25 17:11:04 +02:00
Martin Kroeker
562ef5fdca Merge pull request #4169 from felixonmars/patch-1
Use defined variable for riscv64 in arch.cmake
2023-08-12 17:20:56 +02:00
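
Note on commit 08fde5ebd2: its message describes falling back to `CMAKE_SIZEOF_VOID_P` when `CMAKE_SYSTEM_PROCESSOR` reports `i386` on Darwin. A minimal sketch of that idea follows. It is not the actual change from the commit; the project name `bitness_sketch` and the variable `SKETCH_TARGET_ARCH` are hypothetical names chosen for illustration.

```cmake
cmake_minimum_required(VERSION 3.16)
# Enabling a language is what makes CMake populate CMAKE_SIZEOF_VOID_P.
project(bitness_sketch C)

# On Darwin, CMAKE_SYSTEM_PROCESSOR can read "i386" even though the
# toolchain produces 64-bit binaries, so the pointer size is checked instead.
if(CMAKE_SYSTEM_NAME STREQUAL "Darwin" AND CMAKE_SYSTEM_PROCESSOR STREQUAL "i386")
  if(CMAKE_SIZEOF_VOID_P EQUAL 8)
    set(SKETCH_TARGET_ARCH "x86_64")  # hypothetical variable: 64-bit build
  else()
    set(SKETCH_TARGET_ARCH "x86")     # hypothetical variable: 32-bit build
  endif()
  message(STATUS "Sketch: treating target as ${SKETCH_TARGET_ARCH}")
endif()
```

`CMAKE_SIZEOF_VOID_P` is the pointer size in bytes for the compiler actually in use, so it reflects the build target rather than what the host reports about itself.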