Commit Graph

51 Commits

Author SHA1 Message Date
Martin Kroeker 2fc712469d
Avoid creating spurious non-suffixed c/zgemm_kernels
Plain cgemm_kernel and zgemm_kernel are not used anywhere, only cgemm_kernel_b etc.
Needlessly building them (without any define like NN, CN, etc.) just happened to work on most platforms, but not on arm64. See #1870
2018-12-06 13:56:06 +01:00
Arjan van de Ven 99c7bba8e4 Initial support for SkylakeX / AVX512
This patch adds the basic infrastructure for adding the SkylakeX (Intel Skylake server)
target. The SkylakeX target will use the AVX512 (AVX512VL level) instruction set,
which brings 2 basic things:
1) 512 bit wide SIMD (2x width of AVX2)
2) 32 SIMD registers (2x the number on AVX2)

This initial patch only contains a trivial transofrmation of the Haswell SGEMM kernel
to AVX512VL; more will follow later but this patch aims to get the infrastructure
in place for this "later".

Full performance tuning has not been done yet; with more registers and wider SIMD
it's in theory possible to retune the kernels but even without that there's an
interesting enough performance increase (30-40% range) with just this change.
2018-06-03 07:58:52 +00:00
Martin Kroeker 485df77612
Make USE_TRMM depend on TARGET_CORE not TARGET
Fixes #1432 (and possibly other DTRMM-related failures on Haswell and related architectures when built with cmake)
2018-01-26 23:20:00 +01:00
Martin Kroeker c7a8512d12 Cmake fixes for DYNAMIC_ARCH builds and whitespace in path names (#1323)
* prebuild.cmake: Put quotes around path names that may contain whitespace
(Copied from alexkaratakis' PR #1295)
* kernel/CMakeLists.txt: Fix common_lapack header inclusion and DYNAMIC_ARCH generation of ?neg_tcopy and ?laswp_ncopy files
* lapack/CMakeLists.txt: Use correct template for ?laswp_(plus,minus) functions
2017-10-09 23:34:18 +02:00
Sacha Refshauge 47ebce4d1a Clean up, fix old typos. Simplify arch usages. Move system arch check to earlier position. 2017-08-21 00:37:29 +10:00
Sacha Refshauge 69b560751c Improvements to previous commit (cross-compile).
Fix typos and bad if statements discovered in 0.2.20.
2017-08-20 22:50:31 +10:00
Sacha Refshauge 11911fd941 Add kernel/Makefile.LA to CMake 2017-08-20 00:59:14 +10:00
Isuru Fernando d3b677fe87 Add commonobjs 2017-08-07 23:12:40 +05:30
Isuru Fernando 505b218829 Merge remote-tracking branch 'upstream/develop' into dyn 2017-08-06 19:07:00 +05:30
Isuru Fernando d9346930dd Merge remote-tracking branch 'upstream/develop' into develop 2017-08-04 07:57:55 +05:30
Isuru Fernando 7892434572 Add hemm3m and symm3m objects 2017-08-02 18:24:54 +05:30
Isuru Fernando d798487213 Fixes for dynamic_arch. almost there 2017-08-02 17:25:49 +05:30
Isuru Fernando 251715d9ef configure kernel_core.h 2017-08-01 23:23:55 +05:30
Isuru Fernando 50deeb49b7 configure setparam 2017-08-01 22:32:47 +05:30
Isuru Fernando 4260215adf Support DYNAMIC_ARCH with cmake 2017-08-01 22:25:52 +05:30
Isuru Fernando d245caa49a Support out-of-source build 2017-08-01 15:16:14 +05:30
Isuru Fernando dc24914415 check compiler is msvc instead of msvc 2017-07-28 11:49:39 +05:30
Denis Steckelmacher c9ff735da6 Add ZEN support (tested for auto-detected static backend) 2017-03-19 15:32:50 +01:00
John Biddiscombe 053044ae4d Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
2016-05-25 09:13:28 +02:00
Zhang Xianyi d06b92906a Add gemm3m building for CMake. 2016-02-12 05:02:51 +08:00
Zhang Xianyi 53b6023a6c Fix cmake bug on MSVC 32-bit. 2015-10-26 14:52:13 -05:00
Zhang Xianyi 309875de3c Fix cmake bug on x86 32-bit.
e.g. Build 32-bit on 64-bit Linux.
cmake -DBINARY=32
2015-10-27 02:54:53 +08:00
Zhang Xianyi 8fade093aa Fixed cmake bug on Visual Studio. 2015-10-20 14:37:22 -05:00
Zhang Xianyi 96f0bbe067 Fixed cmake bug on haswell. 2015-10-21 02:24:54 +08:00
Zhang Xianyi d8392c1245 Fixe cmake config bugs. 2015-10-20 04:30:55 +08:00
Zhang Xianyi f874465bb8 Use cmake to build OpenBLAS GENERIC Target on MSVC x86 64-bit.
Disable CBLAS and LAPACK.
2015-08-10 14:10:44 -05:00
Zhang Xianyi 7ac7e147d4 Fixed cmake building bugs on Linux. Disable LAPACK by default. 2015-08-04 04:37:05 +08:00
Hank Anderson 518e2424a8 Fixed bad filename for cpuid.S compile. 2015-02-25 11:51:29 -06:00
Hank Anderson 0d8e227ea7 Changed strategy for setting preprocessor definitions.
Instead of generating separate object files for each permutation of
defines for a source file, GenerateNamedObjects now writes an entirely
new source file and inserts the defines as #define c statements.

This solves a problem I ran into with ar.exe where it was refusing to
link objects that had the same filename despite having different paths.
2015-02-24 12:26:33 -06:00
Hank Anderson 12d1fb2e40 Fixed incorrect object name in kernel CMakeLists.txt 2015-02-24 10:30:16 -06:00
Hank Anderson 1b7f427401 Added conj gemv objects for complex build. 2015-02-23 10:24:31 -06:00
Hank Anderson b2284647a3 More complex objects. 2015-02-23 07:51:05 -06:00
Hank Anderson a6116e5859 Added some more complex-only objects. 2015-02-22 17:49:28 -06:00
Hank Anderson 714638c187 Added some TRMM objects for complex types. 2015-02-19 16:11:51 -06:00
Hank Anderson e27c372e53 Fixed reuse of float_char from parent loop.
Fixed in/it/on/otcopy names.
2015-02-19 13:53:29 -06:00
Hank Anderson f3f2b3d768 Added complex and single netlib-lapack fortran sources to lapack.cmake. 2015-02-19 12:26:11 -06:00
Hank Anderson 9492298048 Added other float types to Makefile.L3. 2015-02-18 13:01:05 -06:00
Hank Anderson 14fd3d35de Added checks for missing defines in kernel. 2015-02-18 10:25:01 -06:00
Hank Anderson cebc07cebd ParseMakefileVars now recursively parses included makefiles. 2015-02-17 22:09:41 -06:00
Hank Anderson 33c5e8db7f Added a helper function for setting the L1 kernel defaults.
Added loop to build objects with different KERNEL defines.
2015-02-17 21:36:23 -06:00
Hank Anderson 4662a0b13a Changed generate functions to iterate through a list of float types.
This will generate obj files for SINGLE/DOUBLE/COMPLEX/DOUBLE COMPLEX.
2015-02-15 17:44:37 -06:00
Hank Anderson 162791e30e Added common objects from kernel Makefile. 2015-02-10 12:42:05 -06:00
Hank Anderson c0624a26be Fixed some dgemm_copy function names. 2015-02-09 14:34:29 -06:00
Hank Anderson 4bfaf1ce66 Removed some list appends I missed. 2015-02-09 12:56:55 -06:00
Hank Anderson e8c39138c6 Removed return value from GenerateNamedObjects.
It sets DBLAS_OBJS directly to save a bunch of list appending in the
CMakeLists.txt files.
2015-02-09 12:28:09 -06:00
Hank Anderson f992799226 Added the rest of Makefile.L3. 2015-02-09 10:47:35 -06:00
Hank Anderson 4c65afcce1 Changed kernel filenames to vars. These will need to be read from KERNEL.
Added some kernel/L3 objects.
2015-02-09 09:52:14 -06:00
Hank Anderson 7fa5c4e2fd Fixed some case issues with ARCH.
Added some kernel and driver/others objects.
2015-02-08 15:29:18 -06:00
Hank Anderson fa0e6a6c93 Added the rest of the L1 kernel makefile. 2015-02-07 21:37:46 -06:00
Hank Anderson 38681fb1c6 Added more kernel files. 2015-02-07 12:54:30 -06:00