91 Commits

Author SHA1 Message Date
Rafael Cardoso Fernandes Sousa
d38110a5ce Use CMake variables instead of as 2021-12-10 17:46:53 -06:00
Rafael Cardoso Fernandes Sousa
214fbcee15 Fix cmake for power 2021-12-09 08:28:17 -06:00
Markus Mützel
de2ed66596 cmake: Set SUFFIX64 also for NOFORTRAN 2021-11-15 08:53:52 +01:00
Wangyang Guo
3dc6052c7e initial support for Sapphire Rapids platform 2021-10-12 01:30:40 -07:00
Martin Kroeker
e02df9fc55 Propagate BUILD_BFLOAT16 to CFLAGS 2021-09-14 16:12:27 +02:00
Wangyang Guo
76ea8db4da Small Matrix: enable by default for x86_64 arch
If no customized GEMM_SMALL_M_PERMIT kernel is defined, it will just bypass to the normal path.
2021-08-05 02:59:36 +00:00
Wangyang Guo
fee5abd84b Small Matrix: support cmake build 2021-08-04 08:50:15 +00:00
Martin Kroeker
30f23be0f9 Rework setting of -mfma to only apply it where necessary 2021-07-22 12:00:03 +02:00
User User-User
91e2b11d3c add to cmake listings too 2021-06-20 15:32:42 +02:00
刘雨培
725432efaa pass NO_AVX512 macro def 2021-04-07 00:10:41 +08:00
Martin Kroeker
33b5670122 Merge pull request #3096 from martin-frbg/fixclangcmake
Fix Cooperlake/DYNAMIC_ARCH builds with clang on Windows
2021-02-02 13:33:15 +01:00
Martin Kroeker
95e19e2e23 fix case in compiler name check
Co-authored-by: xoviat <49173759+xoviat@users.noreply.github.com>
2021-02-02 10:53:46 +01:00
Martin Kroeker
99ac042702 remove spurious lines (probably editor malfunction) 2021-02-01 21:02:53 +01:00
Martin Kroeker
774b9f8653 handle AppleClang in Cooperlake support condition 2021-02-01 20:18:53 +01:00
Martin Kroeker
eb1d2344f7 Fix compiler version check for Intel Cooperlake support (clang-cl does not accept -dumpversion) 2021-02-01 19:45:25 +01:00
xoviat
b60de4447a add cortex-m platform 2021-01-19 08:57:44 -06:00
Martin Kroeker
438a8e5624 Fix placement of getarch call and spurious cpu property accumulation in DYNAMIC_ARCH builds 2020-11-07 20:26:12 +01:00
Martin Kroeker
0155cd53a3 Add -msse3 where needed for DYNAMIC_ARCH builds 2020-11-03 23:45:49 +01:00
Martin Kroeker
b9bc76aec4 Add files via upload 2020-11-02 22:43:50 +01:00
Martin Kroeker
f64243ff57 Add compiler options for sse/sse2/ssse3/sse4.1 2020-10-16 10:47:06 +02:00
Martin Kroeker
e3a29f6b58 Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:07:37 +02:00
Martin Kroeker
68e6823d36 Adapt for supporting only a subset of variable types 2020-10-11 15:01:32 +02:00
Martin Kroeker
e1b7123bbe Merge pull request #2867 from Qiyu8/usimd-floatdot
Optimize the performance of dot by using universal intrinsics in X86/ARM
2020-10-10 12:10:25 +02:00
Qiyu8
f32d34a015 add sse3 compiler flag 2020-10-10 10:36:15 +08:00
Martin Kroeker
a5feea6611 make BLAS3_MEM_ALLOC_THRESHOLD configurable on non-Windows 2020-10-04 23:01:06 +02:00
Martin Kroeker
c4aeeeb9f4 Activate all BUILD_ options if none was specified 2020-09-15 23:15:34 +02:00
Martin Kroeker
26792d2096 Copy BUILD_* directives to the compiler options to allow ifdef in tests 2020-09-13 21:47:55 +02:00
Martin Kroeker
68b1713c30 Merge pull request #2811 from martin-frbg/issue2806
Make NO_AVX512 option override the AVX512 compile test in CMAKE builds as well
2020-09-01 17:19:14 +02:00
Martin Kroeker
bd3207b4b4 Update system.cmake 2020-08-19 22:51:10 +02:00
Martin Kroeker
b8ebfc9335 Update system.cmake 2020-08-19 22:30:19 +02:00
Martin Kroeker
71d33c952d Typo fix 2020-08-19 17:44:23 +02:00
Martin Kroeker
6a3c074786 -march=cooperlake requires gcc10 2020-08-19 17:22:12 +02:00
Chen, Guobing
e740c4873d Enable COOPERLAKE build target
Enable new build target platform -- COOPERLAKE. This target platform
supports all the SKYLAKEX supported ISAs + avx512bf16. So all the
SKYLAKEX specific kernels/drivers and related code are now extended
to be also active on COOPERLAKE. Besides, new BF16 related kernels
are active under this target.
2020-08-13 06:18:00 +08:00
Martin Kroeker
6876221cf3 Remove optimization level limit for flang again and add -fno-unroll-loops for AOCC flang 2.x instead 2020-06-14 17:40:24 +02:00
Martin Kroeker
3ce469a34f Limit optimization level to O1 for flang and add -frecursive 2020-06-09 16:11:13 +02:00
Martin Kroeker
bb12c2c854 Limit MAX_STACK_ALLOC availability to non-Windows 2020-06-04 19:07:27 +02:00
Martin Kroeker
6e97df7b47 Add CMAKE support for MAX_STACK_ALLOC setting 2020-06-04 14:45:31 +02:00
Rajalakshmi Srinivasaraghavan
7eb55504b1 RFC : Add half precision gemm for bfloat16 in OpenBLAS
This patch adds support for bfloat16 data type matrix multiplication kernel.
For architectures that don't support bfloat16, it is defined as unsigned short
(2 bytes).  Default unroll sizes can be changed as per architecture as done for
SGEMM and for now 8 and 4 are used for M and N.  Size of ncopy/tcopy can be
changed as per architecture requirement and for now, size 2 is used.

Added shgemm in kernel/power/KERNEL.POWER9 and tested in powerpc64le and
powerpc64.  For reference, added a small test compare_sgemm_shgemm.c to compare
sgemm and shgemm output.

This patch does not cover OpenBLAS test, benchmark and lapack tests for shgemm.
Complex type implementation can be discussed and added once this is approved.
2020-04-14 14:55:08 -05:00
Martin Kroeker
7f0d523b42 Make BUFFER_SIZE configurable 2020-02-09 23:32:57 +01:00
Martin Kroeker
e3d846ab57 Do not use -march=native with the PGI compiler 2019-08-16 08:58:10 +02:00
Martin Kroeker
f69a0be712 Add getarch flags to disable AVX on x86
(and other small fixes to match Makefile behaviour)
2019-07-06 15:02:39 +02:00
Michael Lass
7a9a4dbc4f Fix detection of AVX512 capable compilers in getarch
21eda8b5 introduced a check in getarch.c to test if the compiler is capable of
AVX512. This check currently fails, since the used __AVX2__ macro is only
defined if getarch itself was compiled with AVX2/AVX512 support. Make sure this
is the case by building getarch with -march=native on x86_64. It is only
supposed to run on the build host anyway.
2019-06-05 17:30:56 +02:00
Martin Kroeker
1e52572be3 Add option USE_LOCKING for single-threaded build with locking support 2019-05-15 23:19:30 +02:00
luz.paz
daf2fec12d Misc. typo fixes
Found via `codespell -q 3 -w -L ith,als,dum,nd,amin,nto,wis,ba -S ./relapack,./kernel,./lapack-netlib`
2019-04-29 17:03:56 -04:00
Martin Kroeker
5952e586ce Support DYNAMIC_LIST option in cmake
e.g. cmake -DDYNAMIC_ARCH=1 -DDYNAMIC_LIST="NEHALEM;HASWELL;ZEN" ..
original issue was #1639
2019-02-05 23:51:40 +01:00
Martin Kroeker
58dd7e4501 Change ARMV8 target to ARMV7 for BINARY=32 2019-01-26 17:52:33 +01:00
Martin Kroeker
76b4b8980f Use -dumpversion with gcc only 2018-12-23 19:08:19 +01:00
Martin Kroeker
49e0f485da Add -mavx2 for TARGET=HASWELL if compiler supports and requires it 2018-12-23 17:26:09 +01:00
Martin Kroeker
081ceb3e02 Propagate version number for openblas_get_config 2018-11-29 00:12:04 +01:00
Martin Kroeker
81c9985c3a Use KERNEL_DEFINITIONS rather than COMMON_OPTS to pass -march=skylake-avx512 2018-10-11 11:03:27 +02:00