Commit Graph

432 Commits

Author SHA1 Message Date
Chip Kerchner 36bd3eeddf Vectorize BF16 GEMV (VSX & MMA). Use GEMM_GEMV_FORWARD_BF16 (for Power). 2024-10-13 13:46:11 -05:00
Martin Kroeker b0346e72f4
update names of loongarch64 targets for cross-compilation 2024-10-06 22:48:33 +02:00
Martin Kroeker 9c707dc6b9
Update dynamic arch list to new target scheme 2024-10-06 22:46:03 +02:00
Martin Kroeker b4495a8fb8
Merge branch 'develop' into arm64_cmake_small_matrix_opt 2024-10-03 20:04:52 +02:00
Martin Kroeker 4f00f02567
Do not add -mabi flags for Loongson when the compiler is flang 2024-10-03 16:06:33 +02:00
Martin Kroeker de421b7764
Merge pull request #4904 from XiWeiGu/la64_cross_cmake
LoongArch64: Enable cmake cross-compilation
2024-10-03 15:53:57 +02:00
Martin Kroeker 0228d36211
move -fopenmp to CFLAGS 2024-09-30 21:38:05 +02:00
gxw 7087b0a7d0 ARM64: Enable SMALL_MATRIX_OPT when compiling with CMake 2024-09-29 10:31:26 +08:00
gxw 30af9278dc LoongArch64: Enable cmake cross-compilation 2024-09-29 10:13:30 +08:00
psykose 1265eee85c fix cmake typo for power10 cc version check
fixes 668f48f4fc
2024-08-09 20:38:58 +02:00
Martin Kroeker cc36db643e
Support new LAPACK build option LAPACK_STRLEN 2024-08-06 17:31:03 +02:00
Martin Kroeker e8bd97ab4b
add RISCV64 entries for DYNAMIC_ARCH 2024-08-03 23:56:59 +02:00
Martin Kroeker 9eecd0d33b
enable GEMM/GEMV forwarding for riscv and ppc 2024-07-31 23:29:12 +02:00
Chris Sidebottom b26424c6a2 Allow opt into GEMM -> GEMV forwarding 2024-07-31 13:09:14 +01:00
yamazaki-mitsufumi 821ef34635 Add A64FX to the list of CPUs supported by DYNAMIC_ARCH 2024-07-23 20:44:39 +09:00
Jaap Aarts cea4abcac0
Fix compiling on mingw 2024-07-04 14:56:16 +02:00
Jaap Aarts 9d0abe2d26 Add support for RISCV64_GENERIC in cmake 2024-07-03 01:49:37 +02:00
Martin Kroeker d25ee4d0f5
Fix detection of Intel ifx and apply -fp-model option to it 2024-06-14 23:58:45 +02:00
Martin Kroeker 21c0f769ef
ensure that cpu-specific -march options are always applied to icx 2024-06-14 23:54:27 +02:00
Alexander Neumann dd4505c5dd
Fix CMake warning 2024-05-30 09:04:23 +02:00
Martin Kroeker 8b4996a2d5
Override icx's default fast math mode to ensure correct NaN handling 2024-05-26 13:16:03 +02:00
Martin Kroeker 6494f432df
Fix INTERFACE64 builds on Loongarch64 2024-05-18 16:49:03 +02:00
Martin Kroeker a3f6b13bc9
remove spurious brace 2024-05-16 09:25:53 +02:00
Martin Kroeker 668f48f4fc
Use CMAKE_C_COMPILER_VERSION instead of dumpversion calls (#4698)
* Use CMAKE_C_COMPILER_VERSION throughout
2024-05-15 23:58:14 +02:00
Martin Kroeker f5c080f083
Fix CMAKE syntax in kernel file parsing of IFNEQ conditionals (#4695)
* Fix syntax in parsing of IFNEQ
2024-05-15 20:58:31 +02:00
Martin Kroeker 3d26837a35
Suppress GCC14 error exit in the f2c-converted LAPACK 2024-04-30 19:05:18 +02:00
Martin Kroeker 69aa93e34f
Fix Loongson compiler flag check 2024-04-23 21:57:42 +02:00
Martin Kroeker 015042f7b5
Fix Loongson compiler flag test 2024-04-23 21:55:57 +02:00
مهدي شينون (Mehdi Chinoune) cda55f2fd2
Don't pass `-exhaustive-register-search` directly to clang compiler
`-exhaustive-register-search` is an LLVM code generation flag that shouldn't be passed directly to clang compiler.
2024-04-06 05:54:48 +01:00
Martin Kroeker 3af736fb9d
Add support for Cortex-A76 2024-04-02 19:42:23 +02:00
Martin Kroeker 52b71a1673
Filter out FFLAGS that flang-new from LLVM18 no longer supports (#4569)
* Filter out FFLAGS that flang-new from LLVM18 no longer supports
2024-03-22 17:02:39 +01:00
Martin Kroeker 2e86faa657
Merge branch 'develop' into issue4468 2024-02-23 11:39:49 +01:00
Martin Kroeker 8fc2c2db04
Fix missing support for INTERFACE64 on ARM64 and MIPS64 2024-02-22 22:14:13 +01:00
Martin Kroeker 82b81c0bbe
Dont fail if there is no Fortran compiler 2024-02-22 22:11:50 +01:00
Martin Kroeker a0e3f77e0b
add FIXED_LIBNAME, PREFIX and SUFFIX 2024-02-15 12:17:38 +01:00
Martin Kroeker ffbfc3c692
Add libname prefix and suffix 2024-02-15 12:16:34 +01:00
Martin Kroeker 0c43c6fa99
Merge pull request #4341 from catap/openblas.pc.in
cmake/openblas.pc.in: fixed version and URL
2023-12-31 13:25:06 +01:00
Martin Kroeker e9c32ed165
Merge pull request #4384 from yetist/develop
Fix: build failed on LoongArch
2023-12-27 14:05:01 +01:00
Martin Kroeker 1106460bb3
remove redundant targets from the default ARM64 DYNAMIC_ARCH list 2023-12-25 12:29:56 +01:00
Wu Xiaotian 0baf462dbc Fix: build failed on LoongArch
According to the documentation at https://github.com/loongson/la-abi-specs/blob/release/lapcs.adoc#the-base-abi-variants, valid -mabi parameters are lp64s, lp64f, lp64d, ilp32s, ilp32f and ilp32d.
2023-12-25 16:04:43 +08:00
barracuda156 8c143331b0 PPC970: drop -mcpu=970 which seems to produce faulty code
Fixes: https://github.com/OpenMathLib/OpenBLAS/issues/4376
2023-12-15 22:56:06 +08:00
barracuda156 981e315b30 cc.cmake: use -force_cpusubtype_ALL for Darwin PPC 2023-12-14 12:01:31 +08:00
barracuda156 a8d3619f65 cc.cmake: add optflags for G5 and G4 kernels 2023-12-13 19:42:56 +08:00
barracuda156 c732f275a2 system_check.cmake: fix arch detection for Darwin PowerPC 2023-12-11 21:05:31 +08:00
Kirill A. Korinsky 08fde5ebd2
Use 64bit build on `CMAKE_SYSTEM_PROCESSOR=i386` on Darwin
Here a bit tricky things.

A value `CMAKE_SYSTEM_PROCESSOR` is came from output of `uname -m` which
migth be 32bit with 64bit building applicaiton.

So, for that case use `CMAKE_SIZEOF_VOID_P` to detect the target.

See https://trac.macports.org/ticket/68488
2023-11-30 21:24:58 +00:00
Kirill A. Korinsky 01c7010543
cmake/openblas.pc.in: fixed version and URL 2023-11-27 14:51:58 +00:00
Martin Kroeker 5bf87c86f5
Implement truncated QR with pivoting (Reference-LAPACK PR 891) 2023-11-15 12:10:20 +01:00
Martin Kroeker 58427ff74d
Deprecate ?GELQS and ?GEQRS from TESTING/LIN (Reference-LAPACK PR 900) (#4307)
* Move ?GELQS and ?GEQRS from TESTING/LIN to DEPRECATED (Reference-LAPACK PR 900)

* Add f2c-converted versions of ?GELQS and ?GEQRS
2023-11-12 10:54:39 +01:00
Martin Kroeker 49689fbef7
Add support for compiling SVE kernels with the NVIDIA HPC compiler 2023-08-25 17:11:04 +02:00
Martin Kroeker 562ef5fdca
Merge pull request #4169 from felixonmars/patch-1
Use defined variable for riscv64 in arch.cmake
2023-08-12 17:20:56 +02:00