106 Commits

Author SHA1 Message Date
Martin Kroeker a3f6b13bc9
remove spurious brace 2024-05-16 09:25:53 +02:00
Martin Kroeker 668f48f4fc
Use CMAKE_C_COMPILER_VERSION instead of dumpversion calls (#4698)
* Use CMAKE_C_COMPILER_VERSION throughout
2024-05-15 23:58:14 +02:00
Martin Kroeker 3d26837a35
Suppress GCC14 error exit in the f2c-converted LAPACK 2024-04-30 19:05:18 +02:00
مهدي شينون (Mehdi Chinoune) cda55f2fd2
Don't pass `-exhaustive-register-search` directly to clang compiler
`-exhaustive-register-search` is an LLVM code generation flag that shouldn't be passed directly to clang compiler.
2024-04-06 05:54:48 +01:00
Martin Kroeker 52b71a1673
Filter out FFLAGS that flang-new from LLVM18 no longer supports (#4569)
* Filter out FFLAGS that flang-new from LLVM18 no longer supports
2024-03-22 17:02:39 +01:00
Martin Kroeker a0e3f77e0b
add FIXED_LIBNAME, PREFIX and SUFFIX 2024-02-15 12:17:38 +01:00
Martin Kroeker 49689fbef7
Add support for compiling SVE kernels with the NVIDIA HPC compiler 2023-08-25 17:11:04 +02:00
Martin Kroeker ac698cedad
Add compiler options for ARM64 SVE targets in DYNAMIC_ARCH builds 2023-07-05 09:47:49 +02:00
Martin Kroeker d2144b2981
Add NVHPC 2023-06-09 19:01:15 +02:00
Martin Kroeker de937b3194
Add clang option to avoid running out of registers in AVX512 assembly 2023-03-17 21:22:37 +01:00
Martin Kroeker e964ebd0d0
Add compiler option for AVX512-capable Ryzen(4) 2023-02-02 19:04:05 +01:00
Martin Kroeker a0a4f7c447
Add -mfma to -mavx2 for clang, and add AVX2 declaration for Zen in DYNAMIC_ARCH builds 2022-09-13 22:47:00 +02:00
Martin Kroeker 85fd3c4279
Support compilation with the Cray C and Fortran compilers (#3712)
* Add support for the Cray Fortran compiler
2022-08-04 20:42:18 +02:00
Martin Kroeker 18b19d135b
C_LAPACK: Fixes to make it compile with MSVC (#3605)
* Fix f2c-like support functions to compile with MSVC, and
re-enable C_LAPACK for MSVC in CMAKE

* Add MSVC&flang build to Azure CI in order to check C_LAPACK correctness
2022-04-17 17:49:38 +02:00
Martin Kroeker b7873605d4
Use f2c translations of LAPACK when no Fortran compiler is available (#3539)
* Add C equivalents of the Fortran routines from Reference-LAPACK as fallbacks, and C_LAPACK variable to trigger their use
2022-04-09 22:38:58 +02:00
Rafael Cardoso Fernandes Sousa d38110a5ce Use CMake variables instead of as 2021-12-10 17:46:53 -06:00
Rafael Cardoso Fernandes Sousa 214fbcee15 Fix cmake for power 2021-12-09 08:28:17 -06:00
Markus Mützel de2ed66596 cmake: Set SUFFIX64 also for NOFORTRAN 2021-11-15 08:53:52 +01:00
Wangyang Guo 3dc6052c7e initial support for Sapphire Rapids platform 2021-10-12 01:30:40 -07:00
Martin Kroeker e02df9fc55
Propagate BUILD_BFLOAT16 to CFLAGS 2021-09-14 16:12:27 +02:00
Wangyang Guo 76ea8db4da Small Matrix: enable by default for x86_64 arch
If no customized GEMM_SMALL_M_PERMIT kernel is defined, it will just bypass to the normal path.
2021-08-05 02:59:36 +00:00
Wangyang Guo fee5abd84b Small Matrix: support cmake build 2021-08-04 08:50:15 +00:00
Martin Kroeker 30f23be0f9
Rework setting of -mfma to only apply it where necessary 2021-07-22 12:00:03 +02:00
User User-User 91e2b11d3c add to cmake listings too 2021-06-20 15:32:42 +02:00
刘雨培 725432efaa pass NO_AVX512 macro def 2021-04-07 00:10:41 +08:00
Martin Kroeker 33b5670122
Merge pull request #3096 from martin-frbg/fixclangcmake
Fix Cooperlake/DYNAMIC_ARCH builds with clang on Windows
2021-02-02 13:33:15 +01:00
Martin Kroeker 95e19e2e23
fix case in compiler name check
Co-authored-by: xoviat <49173759+xoviat@users.noreply.github.com>
2021-02-02 10:53:46 +01:00
Martin Kroeker 99ac042702
remove spurious lines (probably editor malfunction) 2021-02-01 21:02:53 +01:00
Martin Kroeker 774b9f8653
handle AppleClang in Cooperlake support condition 2021-02-01 20:18:53 +01:00
Martin Kroeker eb1d2344f7
Fix compiler version check for Intel Cooperlake support (clang-cl does not accept -dumpversion) 2021-02-01 19:45:25 +01:00
xoviat b60de4447a add cortex-m platform 2021-01-19 08:57:44 -06:00
Martin Kroeker 438a8e5624
Fix placement of getarch call and spurious cpu property accumulation in DYNAMIC_ARCH builds 2020-11-07 20:26:12 +01:00
Martin Kroeker 0155cd53a3
Add -msse3 where needed for DYNAMIC_ARCH builds 2020-11-03 23:45:49 +01:00
Martin Kroeker b9bc76aec4
Add files via upload 2020-11-02 22:43:50 +01:00
Martin Kroeker f64243ff57
Add compiler options for sse/sse2/ssse3/sse4.1 2020-10-16 10:47:06 +02:00
Martin Kroeker e3a29f6b58
Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:07:37 +02:00
Martin Kroeker 68e6823d36
Adapt for supporting only a subset of variable types 2020-10-11 15:01:32 +02:00
Martin Kroeker e1b7123bbe
Merge pull request #2867 from Qiyu8/usimd-floatdot
Optimize the performance of dot by using universal intrinsics in X86/ARM
2020-10-10 12:10:25 +02:00
Qiyu8 f32d34a015 add sse3 compiler flag 2020-10-10 10:36:15 +08:00
Martin Kroeker a5feea6611
make BLAS3_MEM_ALLOC_THRESHOLD configurable on non-Windows 2020-10-04 23:01:06 +02:00
Martin Kroeker c4aeeeb9f4
Activate all BUILD_ options if none was specified 2020-09-15 23:15:34 +02:00
Martin Kroeker 26792d2096
Copy BUILD_* directives to the compiler options to allow ifdef in tests 2020-09-13 21:47:55 +02:00
Martin Kroeker 68b1713c30
Merge pull request #2811 from martin-frbg/issue2806
Make NO_AVX512 option override the AVX512 compile test in CMAKE builds as well
2020-09-01 17:19:14 +02:00
Martin Kroeker bd3207b4b4
Update system.cmake 2020-08-19 22:51:10 +02:00
Martin Kroeker b8ebfc9335
Update system.cmake 2020-08-19 22:30:19 +02:00
Martin Kroeker 71d33c952d
Typo fix 2020-08-19 17:44:23 +02:00
Martin Kroeker 6a3c074786
-march=cooperlake requires gcc10 2020-08-19 17:22:12 +02:00
Chen, Guobing e740c4873d Enable COOPERLAKE build target
Enable new build target platform -- COOPERLAKE. This target platform
supports all the SKYLAKEX supported ISAs + avx512bf16. So all the
SKYLAKEX specific kernels/drivers and related code are now extended
to be also active on COOPERLAKE. Besides, new BF16 related kernels
are active under this target.
2020-08-13 06:18:00 +08:00
Martin Kroeker 6876221cf3
Remove optimization level limit for flang again and add -fno-unroll-loops for AOCC flang 2.x instead 2020-06-14 17:40:24 +02:00
Martin Kroeker 3ce469a34f
Limit optimization level to O1 for flang and add -frecursive 2020-06-09 16:11:13 +02:00