Chip Kerchner
36bd3eeddf
Vectorize BF16 GEMV (VSX & MMA). Use GEMM_GEMV_FORWARD_BF16 (for Power).
2024-10-13 13:46:11 -05:00
Martin Kroeker
b0346e72f4
update names of loongarch64 targets for cross-compilation
2024-10-06 22:48:33 +02:00
Martin Kroeker
9c707dc6b9
Update dynamic arch list to new target scheme
2024-10-06 22:46:03 +02:00
Martin Kroeker
b4495a8fb8
Merge branch 'develop' into arm64_cmake_small_matrix_opt
2024-10-03 20:04:52 +02:00
Martin Kroeker
4f00f02567
Do not add -mabi flags for Loongson when the compiler is flang
2024-10-03 16:06:33 +02:00
Martin Kroeker
de421b7764
Merge pull request #4904 from XiWeiGu/la64_cross_cmake
...
LoongArch64: Enable cmake cross-compilation
2024-10-03 15:53:57 +02:00
Martin Kroeker
0228d36211
move -fopenmp to CFLAGS
2024-09-30 21:38:05 +02:00
gxw
7087b0a7d0
ARM64: Enable SMALL_MATRIX_OPT when compiling with CMake
2024-09-29 10:31:26 +08:00
gxw
30af9278dc
LoongArch64: Enable cmake cross-compilation
2024-09-29 10:13:30 +08:00
psykose
1265eee85c
fix cmake typo for power10 cc version check
...
fixes 668f48f4fc
2024-08-09 20:38:58 +02:00
Martin Kroeker
cc36db643e
Support new LAPACK build option LAPACK_STRLEN
2024-08-06 17:31:03 +02:00
Martin Kroeker
e8bd97ab4b
add RISCV64 entries for DYNAMIC_ARCH
2024-08-03 23:56:59 +02:00
Martin Kroeker
9eecd0d33b
enable GEMM/GEMV forwarding for riscv and ppc
2024-07-31 23:29:12 +02:00
Chris Sidebottom
b26424c6a2
Allow opt into GEMM -> GEMV forwarding
2024-07-31 13:09:14 +01:00
yamazaki-mitsufumi
821ef34635
Add A64FX to the list of CPUs supported by DYNAMIC_ARCH
2024-07-23 20:44:39 +09:00
Jaap Aarts
cea4abcac0
Fix compiling on mingw
2024-07-04 14:56:16 +02:00
Jaap Aarts
9d0abe2d26
Add support for RISCV64_GENERIC in cmake
2024-07-03 01:49:37 +02:00
Martin Kroeker
d25ee4d0f5
Fix detection of Intel ifx and apply -fp-model option to it
2024-06-14 23:58:45 +02:00
Martin Kroeker
21c0f769ef
ensure that cpu-specific -march options are always applied to icx
2024-06-14 23:54:27 +02:00
Alexander Neumann
dd4505c5dd
Fix CMake warning
2024-05-30 09:04:23 +02:00
Martin Kroeker
8b4996a2d5
Override icx's default fast math mode to ensure correct NaN handling
2024-05-26 13:16:03 +02:00
Martin Kroeker
6494f432df
Fix INTERFACE64 builds on Loongarch64
2024-05-18 16:49:03 +02:00
Martin Kroeker
a3f6b13bc9
remove spurious brace
2024-05-16 09:25:53 +02:00
Martin Kroeker
668f48f4fc
Use CMAKE_C_COMPILER_VERSION instead of dumpversion calls (#4698)
...
* Use CMAKE_C_COMPILER_VERSION throughout
2024-05-15 23:58:14 +02:00
Martin Kroeker
f5c080f083
Fix CMAKE syntax in kernel file parsing of IFNEQ conditionals (#4695)
...
* Fix syntax in parsing of IFNEQ
2024-05-15 20:58:31 +02:00
Martin Kroeker
3d26837a35
Suppress GCC14 error exit in the f2c-converted LAPACK
2024-04-30 19:05:18 +02:00
Martin Kroeker
69aa93e34f
Fix Loongson compiler flag check
2024-04-23 21:57:42 +02:00
Martin Kroeker
015042f7b5
Fix Loongson compiler flag test
2024-04-23 21:55:57 +02:00
مهدي شينون (Mehdi Chinoune)
cda55f2fd2
Don't pass `-exhaustive-register-search` directly to clang compiler
...
`-exhaustive-register-search` is an LLVM code generation flag that shouldn't be passed directly to the clang compiler.
2024-04-06 05:54:48 +01:00
Martin Kroeker
3af736fb9d
Add support for Cortex-A76
2024-04-02 19:42:23 +02:00
Martin Kroeker
52b71a1673
Filter out FFLAGS that flang-new from LLVM18 no longer supports (#4569)
...
* Filter out FFLAGS that flang-new from LLVM18 no longer supports
2024-03-22 17:02:39 +01:00
Martin Kroeker
2e86faa657
Merge branch 'develop' into issue4468
2024-02-23 11:39:49 +01:00
Martin Kroeker
8fc2c2db04
Fix missing support for INTERFACE64 on ARM64 and MIPS64
2024-02-22 22:14:13 +01:00
Martin Kroeker
82b81c0bbe
Don't fail if there is no Fortran compiler
2024-02-22 22:11:50 +01:00
Martin Kroeker
a0e3f77e0b
add FIXED_LIBNAME, PREFIX and SUFFIX
2024-02-15 12:17:38 +01:00
Martin Kroeker
ffbfc3c692
Add libname prefix and suffix
2024-02-15 12:16:34 +01:00
Martin Kroeker
0c43c6fa99
Merge pull request #4341 from catap/openblas.pc.in
...
cmake/openblas.pc.in: fixed version and URL
2023-12-31 13:25:06 +01:00
Martin Kroeker
e9c32ed165
Merge pull request #4384 from yetist/develop
...
Fix: build failed on LoongArch
2023-12-27 14:05:01 +01:00
Martin Kroeker
1106460bb3
remove redundant targets from the default ARM64 DYNAMIC_ARCH list
2023-12-25 12:29:56 +01:00
Wu Xiaotian
0baf462dbc
Fix: build failed on LoongArch
...
According to the documentation at https://github.com/loongson/la-abi-specs/blob/release/lapcs.adoc#the-base-abi-variants , valid -mabi parameters are lp64s, lp64f, lp64d, ilp32s, ilp32f and ilp32d.
2023-12-25 16:04:43 +08:00
barracuda156
8c143331b0
PPC970: drop -mcpu=970 which seems to produce faulty code
...
Fixes: https://github.com/OpenMathLib/OpenBLAS/issues/4376
2023-12-15 22:56:06 +08:00
barracuda156
981e315b30
cc.cmake: use -force_cpusubtype_ALL for Darwin PPC
2023-12-14 12:01:31 +08:00
barracuda156
a8d3619f65
cc.cmake: add optflags for G5 and G4 kernels
2023-12-13 19:42:56 +08:00
barracuda156
c732f275a2
system_check.cmake: fix arch detection for Darwin PowerPC
2023-12-11 21:05:31 +08:00
Kirill A. Korinsky
08fde5ebd2
Use 64bit build on `CMAKE_SYSTEM_PROCESSOR=i386` on Darwin
...
This case is a bit tricky.
The value of `CMAKE_SYSTEM_PROCESSOR` comes from the output of `uname -m`, which
might report a 32-bit architecture even when building a 64-bit application.
In that case, use `CMAKE_SIZEOF_VOID_P` to detect the target.
See https://trac.macports.org/ticket/68488
2023-11-30 21:24:58 +00:00
Kirill A. Korinsky
01c7010543
cmake/openblas.pc.in: fixed version and URL
2023-11-27 14:51:58 +00:00
Martin Kroeker
5bf87c86f5
Implement truncated QR with pivoting (Reference-LAPACK PR 891)
2023-11-15 12:10:20 +01:00
Martin Kroeker
58427ff74d
Deprecate ?GELQS and ?GEQRS from TESTING/LIN (Reference-LAPACK PR 900) (#4307)
...
* Move ?GELQS and ?GEQRS from TESTING/LIN to DEPRECATED (Reference-LAPACK PR 900)
* Add f2c-converted versions of ?GELQS and ?GEQRS
2023-11-12 10:54:39 +01:00
Martin Kroeker
49689fbef7
Add support for compiling SVE kernels with the NVIDIA HPC compiler
2023-08-25 17:11:04 +02:00
Martin Kroeker
562ef5fdca
Merge pull request #4169 from felixonmars/patch-1
...
Use defined variable for riscv64 in arch.cmake
2023-08-12 17:20:56 +02:00