Chip Kerchner
76227e2948
Initial commit for vectorized BF16 GEMV. Added GEMM_GEMV_FORWARD_BF16 to enable using BF16 GEMV for one-dimensional matrices. Updated unit test to support inc_x != 1 or inc_y for GEMV.
2024-09-06 14:03:31 -05:00
psykose
1265eee85c
fix cmake typo for power10 cc version check
...
fixes 668f48f4fc
2024-08-09 20:38:58 +02:00
Martin Kroeker
cc36db643e
Support new LAPACK build option LAPACK_STRLEN
2024-08-06 17:31:03 +02:00
Martin Kroeker
e8bd97ab4b
add RISCV64 entries for DYNAMIC_ARCH
2024-08-03 23:56:59 +02:00
Martin Kroeker
9eecd0d33b
enable GEMM/GEMV forwarding for riscv and ppc
2024-07-31 23:29:12 +02:00
Chris Sidebottom
b26424c6a2
Allow opt into GEMM -> GEMV forwarding
2024-07-31 13:09:14 +01:00
yamazaki-mitsufumi
821ef34635
Add A64FX to the list of CPUs supported by DYNAMIC_ARCH
2024-07-23 20:44:39 +09:00
Jaap Aarts
cea4abcac0
Fix compiling on mingw
2024-07-04 14:56:16 +02:00
Jaap Aarts
9d0abe2d26
Add support for RISCV64_GENERIC in cmake
2024-07-03 01:49:37 +02:00
Martin Kroeker
d25ee4d0f5
Fix detection of Intel ifx and apply -fp-model option to it
2024-06-14 23:58:45 +02:00
Martin Kroeker
21c0f769ef
ensure that cpu-specific -march options are always applied to icx
2024-06-14 23:54:27 +02:00
Alexander Neumann
dd4505c5dd
Fix CMake warning
2024-05-30 09:04:23 +02:00
Martin Kroeker
8b4996a2d5
Override icx's default fast math mode to ensure correct NaN handling
2024-05-26 13:16:03 +02:00
Martin Kroeker
6494f432df
Fix INTERFACE64 builds on Loongarch64
2024-05-18 16:49:03 +02:00
Martin Kroeker
a3f6b13bc9
remove spurious brace
2024-05-16 09:25:53 +02:00
Martin Kroeker
668f48f4fc
Use CMAKE_C_COMPILER_VERSION instead of dumpversion calls ( #4698 )
...
* Use CMAKE_C_COMPILER_VERSION throughout
2024-05-15 23:58:14 +02:00
Martin Kroeker
f5c080f083
Fix CMAKE syntax in kernel file parsing of IFNEQ conditionals ( #4695 )
...
* Fix syntax in parsing of IFNEQ
2024-05-15 20:58:31 +02:00
Martin Kroeker
3d26837a35
Suppress GCC14 error exit in the f2c-converted LAPACK
2024-04-30 19:05:18 +02:00
Martin Kroeker
69aa93e34f
Fix Loongson compiler flag check
2024-04-23 21:57:42 +02:00
Martin Kroeker
015042f7b5
Fix Loongson compiler flag test
2024-04-23 21:55:57 +02:00
مهدي شينون (Mehdi Chinoune)
cda55f2fd2
Don't pass `-exhaustive-register-search` directly to clang compiler
...
`-exhaustive-register-search` is an LLVM code generation flag that shouldn't be passed directly to clang compiler.
2024-04-06 05:54:48 +01:00
Martin Kroeker
3af736fb9d
Add support for Cortex-A76
2024-04-02 19:42:23 +02:00
Martin Kroeker
52b71a1673
Filter out FFLAGS that flang-new from LLVM18 no longer supports ( #4569 )
...
* Filter out FFLAGS that flang-new from LLVM18 no longer supports
2024-03-22 17:02:39 +01:00
Martin Kroeker
2e86faa657
Merge branch 'develop' into issue4468
2024-02-23 11:39:49 +01:00
Martin Kroeker
8fc2c2db04
Fix missing support for INTERFACE64 on ARM64 and MIPS64
2024-02-22 22:14:13 +01:00
Martin Kroeker
82b81c0bbe
Don't fail if there is no Fortran compiler
2024-02-22 22:11:50 +01:00
Martin Kroeker
a0e3f77e0b
add FIXED_LIBNAME, PREFIX and SUFFIX
2024-02-15 12:17:38 +01:00
Martin Kroeker
ffbfc3c692
Add libname prefix and suffix
2024-02-15 12:16:34 +01:00
Martin Kroeker
0c43c6fa99
Merge pull request #4341 from catap/openblas.pc.in
...
cmake/openblas.pc.in: fixed version and URL
2023-12-31 13:25:06 +01:00
Martin Kroeker
e9c32ed165
Merge pull request #4384 from yetist/develop
...
Fix: build failed on LoongArch
2023-12-27 14:05:01 +01:00
Martin Kroeker
1106460bb3
remove redundant targets from the default ARM64 DYNAMIC_ARCH list
2023-12-25 12:29:56 +01:00
Wu Xiaotian
0baf462dbc
Fix: build failed on LoongArch
...
According to the documentation at https://github.com/loongson/la-abi-specs/blob/release/lapcs.adoc#the-base-abi-variants , valid -mabi parameters are lp64s, lp64f, lp64d, ilp32s, ilp32f and ilp32d.
2023-12-25 16:04:43 +08:00
barracuda156
8c143331b0
PPC970: drop -mcpu=970 which seems to produce faulty code
...
Fixes: https://github.com/OpenMathLib/OpenBLAS/issues/4376
2023-12-15 22:56:06 +08:00
barracuda156
981e315b30
cc.cmake: use -force_cpusubtype_ALL for Darwin PPC
2023-12-14 12:01:31 +08:00
barracuda156
a8d3619f65
cc.cmake: add optflags for G5 and G4 kernels
2023-12-13 19:42:56 +08:00
barracuda156
c732f275a2
system_check.cmake: fix arch detection for Darwin PowerPC
2023-12-11 21:05:31 +08:00
Kirill A. Korinsky
08fde5ebd2
Use 64bit build on `CMAKE_SYSTEM_PROCESSOR=i386` on Darwin
...
This case is a bit tricky.
The value of `CMAKE_SYSTEM_PROCESSOR` comes from the output of `uname -m`, which
might report a 32-bit architecture even when building a 64-bit application.
So, for that case, use `CMAKE_SIZEOF_VOID_P` to detect the target.
See https://trac.macports.org/ticket/68488
2023-11-30 21:24:58 +00:00
Kirill A. Korinsky
01c7010543
cmake/openblas.pc.in: fixed version and URL
2023-11-27 14:51:58 +00:00
Martin Kroeker
5bf87c86f5
Implement truncated QR with pivoting (Reference-LAPACK PR 891)
2023-11-15 12:10:20 +01:00
Martin Kroeker
58427ff74d
Deprecate ?GELQS and ?GEQRS from TESTING/LIN (Reference-LAPACK PR 900) ( #4307 )
...
* Move ?GELQS and ?GEQRS from TESTING/LIN to DEPRECATED (Reference-LAPACK PR 900)
* Add f2c-converted versions of ?GELQS and ?GEQRS
2023-11-12 10:54:39 +01:00
Martin Kroeker
49689fbef7
Add support for compiling SVE kernels with the NVIDIA HPC compiler
2023-08-25 17:11:04 +02:00
Martin Kroeker
562ef5fdca
Merge pull request #4169 from felixonmars/patch-1
...
Use defined variable for riscv64 in arch.cmake
2023-08-12 17:20:56 +02:00
Martin Kroeker
0e5d56ae4a
Merge pull request #4170 from felixonmars/patch-2
...
Fix 64-bit fortran options for riscv64
2023-08-12 09:21:05 +02:00
Markus Mützel
57256623f4
fc.cmake: Add support for LLVM Flang.
2023-08-05 13:16:06 +02:00
Felix Yan
f5506b002c
Add 64-bit flag on INTERFACE64 only
2023-07-28 16:19:14 +03:00
Felix Yan
4ed6414c17
Fix 64-bit fortran options for riscv64
...
64-bit builds are currently broken without this flag.
Makefiles have done this already: 5720fa02c5/Makefile.system (L831)
2023-07-28 04:53:27 +03:00
Felix Yan
007cd834c1
Use defined variable for riscv64 in arch.cmake
...
It's defined in #4137
2023-07-28 04:50:16 +03:00
Chris Sidebottom
f971ef55f2
Add ARMV8SVE to AArch64 Dynamic Dispatch
...
In order to enable support for future cores which have similar tunings
(in this case I'm doing this for the Arm(R) Neoverse(TM) V2 core), this generically detects SVE support and enables it. This should better manage the size and complexity of dynamic dispatch rather than just copy pasting the same parameters.
To make `ARMV8SVE` more representative of the common 128-bit SVE case,
I've split it and similar parameters from A64FX which has the wider
512-bit SVE.
2023-07-25 18:35:15 +01:00
Martin Kroeker
b61e64da6f
Merge pull request #4142 from exyntech/armv8-as-arm64
...
Fix armv8 detection in system_check.cmake
2023-07-15 23:15:49 +02:00
Martin Kroeker
f82a197143
Merge pull request #4137 from felixonmars/patch-1
...
Fix riscv64 detection in system_check.cmake
2023-07-15 19:41:06 +02:00