Jake Arkinstall
d7a77091a3
Addressed issue #3100, removing an unnecessary write to the include directory
2021-02-10 12:11:17 +00:00
Martin Kroeker
33b5670122
Merge pull request #3096 from martin-frbg/fixclangcmake
...
Fix Cooperlake/DYNAMIC_ARCH builds with clang on Windows
2021-02-02 13:33:15 +01:00
Martin Kroeker
95e19e2e23
fix case in compiler name check
...
Co-authored-by: xoviat <49173759+xoviat@users.noreply.github.com>
2021-02-02 10:53:46 +01:00
Martin Kroeker
99ac042702
remove spurious lines (probably editor malfunction)
2021-02-01 21:02:53 +01:00
Martin Kroeker
774b9f8653
handle AppleClang in Cooperlake support condition
2021-02-01 20:18:53 +01:00
Martin Kroeker
eb1d2344f7
Fix compiler version check for Intel Cooperlake support (clang-cl does not accept -dumpversion)
2021-02-01 19:45:25 +01:00
Martin Kroeker
0cc36770f1
Merge pull request #3073 from xoviat/embedded
...
add embedded option
2021-01-31 18:02:41 +01:00
Martin Kroeker
cb61d3b46b
Add DYNAMIC_LIST support for ARM64
2021-01-25 13:13:20 +01:00
xoviat
b60de4447a
add cortex-m platform
2021-01-19 08:57:44 -06:00
Martin Kroeker
89ae305e11
Workaround for cmake having its own C_COMPILER variable
2021-01-13 12:30:26 +01:00
Martin Kroeker
ec4d77c47c
Add -mfma for HAVE_FMA3 in the non-DYNAMIC_ARCH case as well
2020-11-13 09:16:34 +01:00
Martin Kroeker
a29338aaa6
Remove extraneous quotes that caused a cmake policy warning
2020-11-07 20:27:42 +01:00
Martin Kroeker
438a8e5624
Fix placement of getarch call and spurious cpu property accumulation in DYNAMIC_ARCH builds
2020-11-07 20:26:12 +01:00
Martin Kroeker
0155cd53a3
Add -msse3 where needed for DYNAMIC_ARCH builds
2020-11-03 23:45:49 +01:00
Martin Kroeker
a9f9354296
Fix target test
2020-11-02 23:17:46 +01:00
Martin Kroeker
b9bc76aec4
Add files via upload
2020-11-02 22:43:50 +01:00
Martin Kroeker
e5f8c2bf8a
typo fix
2020-11-01 22:25:43 +01:00
Martin Kroeker
6baf8af658
Disable EXPRECISION for the combination of DYNAMIC_CORE and GENERIC target
2020-11-01 22:11:48 +01:00
Chen, Guobing
a7b1f9b1bb
Implementation of BF16 based gemv
...
1. Add a new API -- sbgemv to support bfloat16 based gemv
2. Implement a generic kernel for sbgemv
3. Implement an avx512-bf16 based kernel for sbgemv
Signed-off-by: Chen, Guobing <guobing.chen@intel.com>
2020-10-29 02:08:23 +08:00
Martin Kroeker
eddc65c7b7
Add POWER10 support flag (unconditionally for now)
2020-10-20 01:09:49 +02:00
Martin Kroeker
f5902ab0a1
Support cross-compiling for Apple Vortex
2020-10-18 19:10:58 +02:00
Martin Kroeker
f64243ff57
Add compiler options for sse/sse2/ssse3/sse4.1
2020-10-16 10:47:06 +02:00
Martin Kroeker
786c0a3ce8
Add sse options for use of intrinsics with older compilers
2020-10-16 10:41:53 +02:00
Martin Kroeker
756802df61
Merge pull request #2890 from martin-frbg/s-d-sum
...
Revert special handling of Windows xNRM2 and enable C+intrinsics kern…
2020-10-14 09:02:03 +02:00
Martin Kroeker
75e3a92df6
Add express -mavx and -msse options (and fix a stray = for cooperlake)
2020-10-14 01:01:58 +02:00
Martin Kroeker
e3a29f6b58
Change "HALF" and "sh" to "BFLOAT16" and "sb"
2020-10-12 00:07:37 +02:00
Martin Kroeker
68e6823d36
Adapt for supporting only a subset of variable types
2020-10-11 15:01:32 +02:00
Martin Kroeker
88928650c4
Merge pull request #2883 from martin-frbg/issue2872
...
Minor CMAKE fixes
2020-10-11 10:30:33 +02:00
Martin Kroeker
82a497ec5d
restore PRESCOTT default for DYNAMIC_LIST
2020-10-11 00:43:09 +02:00
Martin Kroeker
de27e4f5fb
Stop DYNAMIC_ARCH build if the toplevel source contains a stray config_kernel.h from a gmake build
...
This is unlikely to happen in practice, but if it does, the rogue file would get included instead of the dynamically generated version for each target_core. That leads to very confusing errors like "invalid operands (undefined UND and ABS sections)" when compiling the assembly kernels, because macros like PREFETCH would remain undefined.
2020-10-11 00:40:22 +02:00
Martin Kroeker
e1b7123bbe
Merge pull request #2867 from Qiyu8/usimd-floatdot
...
Optimize the performance of dot by using universal intrinsics in X86/ARM
2020-10-10 12:10:25 +02:00
Qiyu8
f32d34a015
add sse3 compiler flag
2020-10-10 10:36:15 +08:00
Martin Kroeker
a5feea6611
make BLAS3_MEM_ALLOC_THRESHOLD configurable on non-Windows
2020-10-04 23:01:06 +02:00
Martin Kroeker
2367726578
Remove redundant status message
2020-09-30 23:28:49 +02:00
Martin Kroeker
c4aeeeb9f4
Activate all BUILD_ options if none was specified
2020-09-15 23:15:34 +02:00
Martin Kroeker
91c84e1c01
Merge pull request #2796 from Guobing-Chen/BF16_dot_coversion_apis
...
Add bfloat16 based dot and conversion with single/double
2020-09-14 15:00:19 +02:00
Martin Kroeker
26792d2096
Copy BUILD_* directives to the compiler options to allow ifdef in tests
2020-09-13 21:47:55 +02:00
Chen, Guobing
deaeb6c5b8
Add bfloat16 based dot and conversion with single/double
...
1. Added bfloat16 based dot as new API: shdot
2. Implemented generic kernel and cooperlake-specific (AVX512-BF16) kernel for shdot
3. Added 4 conversion APIs for bfloat16 data type <=> single/double: shstobf16 shdtobf16 sbf16tos dbf16tod
shstobf16 -- convert single float array to bfloat16 array
shdtobf16 -- convert double float array to bfloat16 array
sbf16tos -- convert bfloat16 array to single float array
dbf16tod -- convert bfloat16 array to double float array
4. Implemented generic kernels for all 4 conversion APIs, and cooperlake-specific kernels for shstobf16 and shdtobf16
5. Updated the level1 threading helper functions and macros to support multi-threading for these new APIs
6. Fixed the Cooperlake platform detection/specification issue in dynamic-arch builds
7. Changed the typedef of bfloat16 from unsigned short to the stricter uint16_t
Signed-off-by: Chen, Guobing <guobing.chen@intel.com>
2020-09-04 02:31:25 +08:00
Martin Kroeker
68b1713c30
Merge pull request #2811 from martin-frbg/issue2806
...
Make NO_AVX512 option override the AVX512 compile test in CMAKE builds as well
2020-09-01 17:19:14 +02:00
Martin Kroeker
0a4c5c4c44
Merge pull request #2807 from martin-frbg/issue2804
...
Work around ARMV8 build-time cpu detection problems on non-Linux systems
2020-08-31 23:44:56 +02:00
Martin Kroeker
5feb087c05
Handle Apple labeling armv8 as arm64 rather than aarch64
2020-08-31 20:02:08 +02:00
Martin Kroeker
7c0977c267
Add OpenMP dependency to pkgconfig file if needed
2020-08-22 13:53:44 +02:00
Martin Kroeker
bd3207b4b4
Update system.cmake
2020-08-19 22:51:10 +02:00
Martin Kroeker
b8ebfc9335
Update system.cmake
2020-08-19 22:30:19 +02:00
Martin Kroeker
7c1986640b
fall back from cooperlake to skylake if gcc<10
2020-08-19 20:48:39 +02:00
Martin Kroeker
71d33c952d
Typo fix
2020-08-19 17:44:23 +02:00
Martin Kroeker
6a3c074786
-march=cooperlake requires gcc10
2020-08-19 17:22:12 +02:00
Martin Kroeker
430f741b30
-march=cooperlake requires gcc10
2020-08-19 17:17:53 +02:00
Chen, Guobing
e740c4873d
Enable COOPERLAKE build target
...
Enable a new build target platform, COOPERLAKE. This target platform
supports all ISAs supported by SKYLAKEX plus avx512bf16, so all the
SKYLAKEX-specific kernels/drivers and related code are now extended
to be active on COOPERLAKE as well. In addition, the new BF16-related
kernels are active under this target.
2020-08-13 06:18:00 +08:00
Martin Kroeker
cb097beba2
Merge pull request #2741 from martin-frbg/issue2739
...
Adjust A53 SGEMM parameters to reflect recent switch to 8x8 kernel
2020-07-29 10:01:14 +02:00