Martin Kroeker
2fc712469d
Avoid creating spurious non-suffixed c/zgemm_kernels
...
Plain cgemm_kernel and zgemm_kernel are not used anywhere, only cgemm_kernel_b etc.
Needlessly building them (without any define like NN, CN, etc.) just happened to work on most platforms, but not on arm64. See #1870
2018-12-06 13:56:06 +01:00
Arjan van de Ven
99c7bba8e4
Initial support for SkylakeX / AVX512
...
This patch adds the basic infrastructure for adding the SkylakeX (Intel Skylake server)
target. The SkylakeX target will use the AVX512 (AVX512VL level) instruction set,
which brings 2 basic things:
1) 512 bit wide SIMD (2x width of AVX2)
2) 32 SIMD registers (2x the number on AVX2)
This initial patch only contains a trivial transformation of the Haswell SGEMM kernel
to AVX512VL; more will follow later but this patch aims to get the infrastructure
in place for this "later".
Full performance tuning has not been done yet; with more registers and wider SIMD
it's in theory possible to retune the kernels but even without that there's an
interesting enough performance increase (30-40% range) with just this change.
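
To make the "2x width, 2x registers" arithmetic above concrete, here is a small illustrative sketch. It is not part of the patch; the dictionary and function names are hypothetical, and the figures assume 64-bit mode and single-precision (32-bit) elements as used by SGEMM.

```python
# Illustrative only: names below are hypothetical, not OpenBLAS identifiers.
# Assumes 64-bit mode, where AVX2 exposes 16 YMM registers and AVX512 exposes 32 ZMM registers.
AVX2 = {"register_bits": 256, "registers": 16}
AVX512 = {"register_bits": 512, "registers": 32}


def lanes(isa, element_bits=32):
    """Number of elements of the given width that fit in one SIMD register."""
    return isa["register_bits"] // element_bits


def register_file_elements(isa, element_bits=32):
    """Total elements that can be held in all SIMD registers at once."""
    return lanes(isa, element_bits) * isa["registers"]


def capacity_ratio(new, old, element_bits=32):
    """How many times more elements `new` can keep in registers than `old`."""
    return register_file_elements(new, element_bits) // register_file_elements(old, element_bits)
```

Under these assumptions, doubling both the width and the register count gives four times the in-register capacity, which is why retuning the kernel block sizes is possible in theory.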
2018-06-03 07:58:52 +00:00
Martin Kroeker
485df77612
Make USE_TRMM depend on TARGET_CORE not TARGET
...
Fixes #1432 (and possibly other DTRMM-related failures on Haswell and related architectures when built with cmake)
2018-01-26 23:20:00 +01:00
Martin Kroeker
c7a8512d12
Cmake fixes for DYNAMIC_ARCH builds and whitespace in path names ( #1323 )
...
* prebuild.cmake: Put quotes around path names that may contain whitespace
(Copied from alexkaratakis' PR #1295 )
* kernel/CMakeLists.txt: Fix common_lapack header inclusion and DYNAMIC_ARCH generation of ?neg_tcopy and ?laswp_ncopy files
* lapack/CMakeLists.txt: Use correct template for ?laswp_(plus,minus) functions
2017-10-09 23:34:18 +02:00
Sacha Refshauge
47ebce4d1a
Clean up, fix old typos. Simplify arch usages. Move system arch check to earlier position.
2017-08-21 00:37:29 +10:00
Sacha Refshauge
69b560751c
Improvements to previous commit (cross-compile).
...
Fix typos and bad if statements discovered in 0.2.20.
2017-08-20 22:50:31 +10:00
Sacha Refshauge
11911fd941
Add kernel/Makefile.LA to CMake
2017-08-20 00:59:14 +10:00
Isuru Fernando
d3b677fe87
Add commonobjs
2017-08-07 23:12:40 +05:30
Isuru Fernando
505b218829
Merge remote-tracking branch 'upstream/develop' into dyn
2017-08-06 19:07:00 +05:30
Isuru Fernando
d9346930dd
Merge remote-tracking branch 'upstream/develop' into develop
2017-08-04 07:57:55 +05:30
Isuru Fernando
7892434572
Add hemm3m and symm3m objects
2017-08-02 18:24:54 +05:30
Isuru Fernando
d798487213
Fixes for dynamic_arch. almost there
2017-08-02 17:25:49 +05:30
Isuru Fernando
251715d9ef
configure kernel_core.h
2017-08-01 23:23:55 +05:30
Isuru Fernando
50deeb49b7
configure setparam
2017-08-01 22:32:47 +05:30
Isuru Fernando
4260215adf
Support DYNAMIC_ARCH with cmake
2017-08-01 22:25:52 +05:30
Isuru Fernando
d245caa49a
Support out-of-source build
2017-08-01 15:16:14 +05:30
Isuru Fernando
dc24914415
check compiler is msvc instead of msvc
2017-07-28 11:49:39 +05:30
Denis Steckelmacher
c9ff735da6
Add ZEN support (tested for auto-detected static backend)
2017-03-19 15:32:50 +01:00
John Biddiscombe
053044ae4d
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
...
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
2016-05-25 09:13:28 +02:00
Zhang Xianyi
d06b92906a
Add gemm3m building for CMake.
2016-02-12 05:02:51 +08:00
Zhang Xianyi
53b6023a6c
Fix cmake bug on MSVC 32-bit.
2015-10-26 14:52:13 -05:00
Zhang Xianyi
309875de3c
Fix cmake bug on x86 32-bit.
...
e.g. Build 32-bit on 64-bit Linux.
cmake -DBINARY=32
2015-10-27 02:54:53 +08:00
Zhang Xianyi
8fade093aa
Fixed cmake bug on Visual Studio.
2015-10-20 14:37:22 -05:00
Zhang Xianyi
96f0bbe067
Fixed cmake bug on haswell.
2015-10-21 02:24:54 +08:00
Zhang Xianyi
d8392c1245
Fix cmake config bugs.
2015-10-20 04:30:55 +08:00
Zhang Xianyi
f874465bb8
Use cmake to build OpenBLAS GENERIC Target on MSVC x86 64-bit.
...
Disable CBLAS and LAPACK.
2015-08-10 14:10:44 -05:00
Zhang Xianyi
7ac7e147d4
Fixed cmake building bugs on Linux. Disable LAPACK by default.
2015-08-04 04:37:05 +08:00
Hank Anderson
518e2424a8
Fixed bad filename for cpuid.S compile.
2015-02-25 11:51:29 -06:00
Hank Anderson
0d8e227ea7
Changed strategy for setting preprocessor definitions.
...
Instead of generating separate object files for each permutation of
defines for a source file, GenerateNamedObjects now writes an entirely
new source file and inserts the defines as C #define statements.
This solves a problem I ran into with ar.exe where it was refusing to
link objects that had the same filename despite having different paths.
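
The strategy described above can be sketched roughly as follows. This is a Python illustration of the idea, not the CMake code from the commit: the function name `generate_named_source`, its parameters, and the example object names are hypothetical, and the choice to copy the original source text after the defines is an assumption.

```python
import os
import tempfile


def generate_named_source(src_path, out_dir, obj_name, defines):
    """Write out_dir/<obj_name>.c: one '#define NAME' line per define, then the original source.

    Hypothetical helper; it mirrors the idea of baking defines into a uniquely
    named source file instead of passing them on the compiler command line.
    """
    with open(src_path) as fh:
        body = fh.read()
    header = "".join(f"#define {name}\n" for name in defines)
    out_path = os.path.join(out_dir, obj_name + ".c")
    with open(out_path, "w") as fh:
        fh.write(header + body)
    return out_path


def demo():
    """Generate two variants of one source and return their (distinct) basenames."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "kernel.c")
        with open(src, "w") as fh:
            fh.write("int kernel(void) { return 0; }\n")
        variants = {"kernel_nn": ["NN"], "kernel_cn": ["CN", "CONJ"]}
        paths = [generate_named_source(src, tmp, name, defs)
                 for name, defs in variants.items()]
        return sorted(os.path.basename(p) for p in paths)
```

Because each permutation of defines gets its own file name, the resulting object files also have unique names, which sidesteps the archiver conflict between same-named objects in different directories.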
2015-02-24 12:26:33 -06:00
Hank Anderson
12d1fb2e40
Fixed incorrect object name in kernel CMakeLists.txt
2015-02-24 10:30:16 -06:00
Hank Anderson
1b7f427401
Added conj gemv objects for complex build.
2015-02-23 10:24:31 -06:00
Hank Anderson
b2284647a3
More complex objects.
2015-02-23 07:51:05 -06:00
Hank Anderson
a6116e5859
Added some more complex-only objects.
2015-02-22 17:49:28 -06:00
Hank Anderson
714638c187
Added some TRMM objects for complex types.
2015-02-19 16:11:51 -06:00
Hank Anderson
e27c372e53
Fixed reuse of float_char from parent loop.
...
Fixed in/it/on/otcopy names.
2015-02-19 13:53:29 -06:00
Hank Anderson
f3f2b3d768
Added complex and single netlib-lapack fortran sources to lapack.cmake.
2015-02-19 12:26:11 -06:00
Hank Anderson
9492298048
Added other float types to Makefile.L3.
2015-02-18 13:01:05 -06:00
Hank Anderson
14fd3d35de
Added checks for missing defines in kernel.
2015-02-18 10:25:01 -06:00
Hank Anderson
cebc07cebd
ParseMakefileVars now recursively parses included makefiles.
2015-02-17 22:09:41 -06:00
Hank Anderson
33c5e8db7f
Added a helper function for setting the L1 kernel defaults.
...
Added loop to build objects with different KERNEL defines.
2015-02-17 21:36:23 -06:00
Hank Anderson
4662a0b13a
Changed generate functions to iterate through a list of float types.
...
This will generate obj files for SINGLE/DOUBLE/COMPLEX/DOUBLE COMPLEX.
2015-02-15 17:44:37 -06:00
Hank Anderson
162791e30e
Added common objects from kernel Makefile.
2015-02-10 12:42:05 -06:00
Hank Anderson
c0624a26be
Fixed some dgemm_copy function names.
2015-02-09 14:34:29 -06:00
Hank Anderson
4bfaf1ce66
Removed some list appends I missed.
2015-02-09 12:56:55 -06:00
Hank Anderson
e8c39138c6
Removed return value from GenerateNamedObjects.
...
It sets DBLAS_OBJS directly to save a bunch of list appending in the
CMakeLists.txt files.
2015-02-09 12:28:09 -06:00
Hank Anderson
f992799226
Added the rest of Makefile.L3.
2015-02-09 10:47:35 -06:00
Hank Anderson
4c65afcce1
Changed kernel filenames to vars. These will need to be read from KERNEL.
...
Added some kernel/L3 objects.
2015-02-09 09:52:14 -06:00
Hank Anderson
7fa5c4e2fd
Fixed some case issues with ARCH.
...
Added some kernel and driver/others objects.
2015-02-08 15:29:18 -06:00
Hank Anderson
fa0e6a6c93
Added the rest of the L1 kernel makefile.
2015-02-07 21:37:46 -06:00
Hank Anderson
38681fb1c6
Added more kernel files.
2015-02-07 12:54:30 -06:00