Commit Graph

172 Commits

Author SHA1 Message Date
luz.paz
daf2fec12d Misc. typo fixes
Found via `codespell -q 3 -w -L ith,als,dum,nd,amin,nto,wis,ba -S ./relapack,./kernel,./lapack-netlib`
2019-04-29 17:03:56 -04:00
Martin Kroeker
ccfb7ead15 Merge pull request #2072 from martin-frbg/sum
Add (C)BLAS extension ?sum
2019-04-23 20:11:36 +02:00
Martin Kroeker
e06b8438b4 Merge pull request #2080 from martin-frbg/issue2075
Add -lm and disable EXPRECISION support on *BSD
2019-04-02 21:40:58 +02:00
Martin Kroeker
9229d6859b Add -lm and disable EXPRECISION support on *BSD
fixes #2075
2019-04-02 09:38:18 +02:00
Martin Kroeker
d17da6c6a4 Add cmake defaults for ?sum kernels 2019-03-31 11:57:01 +02:00
Martin Kroeker
1679de5e59 Detect 32bit environment on 64bit ARM hardware
for #2056, using same approach as #2058
2019-03-31 10:50:43 +02:00
Sacha
c3e30b2bc2 Change 64-bit detection as explained in #2056 2019-03-13 23:21:54 +10:00
Martin Kroeker
fd34820b99 Fix AVX512 test always returning false due to missing compiler option 2019-02-25 17:58:31 +01:00
Martin Kroeker
5952e586ce Support DYNAMIC_LIST option in cmake
e.g. cmake -DDYNAMIC_ARCH=1 -DDYNAMIC_LIST="NEHALEM;HASWELL;ZEN" ..
original issue was #1639
2019-02-05 23:51:40 +01:00
Martin Kroeker
58dd7e4501 Change ARMV8 target to ARMV7 for BINARY=32 2019-01-26 17:52:33 +01:00
Martin Kroeker
802f0dbde1 More fixes for cross-compiling ARM64 targets
Fixed core naming for DYNAMIC_ARCH. Corrected GEMM_DEFAULT entries and added SYMV_P. Replaced outdated VULCAN define for ThunderX2T99 with ARMV8 to get basic definitions back. For issue #1908
2019-01-03 22:17:31 +01:00
Martin Kroeker
20d1aad13f Fix missing quotes around thunderx targets 2019-01-02 20:15:35 +01:00
Martin Kroeker
e1eab96502 Merge pull request #1931 from martin-frbg/pr1921
Add -mavx2 to TARGET=HASWELL builds
2018-12-23 23:15:54 +01:00
Martin Kroeker
76b4b8980f Use -dumpversion with gcc only 2018-12-23 19:08:19 +01:00
Martin Kroeker
49e0f485da Add -mavx2 for TARGET=HASWELL if compiler supports and requires it 2018-12-23 17:26:09 +01:00
Martin Kroeker
26a3402773 Reflect ARMV8 target definition changes from PR1876
and create config target directory for cross-compiles.
2018-12-23 12:26:01 +01:00
Martin Kroeker
133c278ee5 Add DYNAMIC_CORE list for ARM64
cf #1908
2018-12-07 17:42:23 +01:00
Martin Kroeker
dceff5542c Handle Android environments that identify as Linux (#1898)
* Handle Android environments that identify as Linux

termux terminal emulator does this, causing build failures through missed defines in common.h
2018-12-01 20:56:11 +01:00
Martin Kroeker
081ceb3e02 Propagate version number for openblas_get_config 2018-11-29 00:12:04 +01:00
Andrew
40cce0e353 handle cmake too 2018-11-06 09:45:49 +00:00
Martin Kroeker
2263d3906c Merge pull request #1812 from martin-frbg/issue1806-2
Use KERNEL_DEFINITIONS rather than COMMON_OPTS to pass -march=skylake…
2018-10-11 21:51:31 +02:00
Martin Kroeker
81c9985c3a Use KERNEL_DEFINITIONS rather than COMMON_OPTS to pass -march=skylake-avx512 2018-10-11 11:03:27 +02:00
Martin Kroeker
56ebc7b53e Merge pull request #1808 from martin-frbg/issue1806
Add -march=skylake-avx512 to CFLAGS when the target is Skylake
2018-10-11 07:48:08 +02:00
Martin Kroeker
8a11ec19d1 Syntax fix 2018-10-10 23:47:35 +02:00
Martin Kroeker
fa53b903db Add -march=skylake-avx512 to CFLAGS when the target is Skylake
Should fix 1806 and #1801
2018-10-10 19:22:01 +02:00
Martin Kroeker
84bcdf9c66 Revert "Add -march=skylake-avx512 when required" 2018-10-10 19:15:32 +02:00
Martin Kroeker
a9b51b8448 Merge pull request #1798 from martin-frbg/cmake-avx512
Add -march=skylake-avx512 when required
2018-10-08 21:15:17 +02:00
Martin Kroeker
eba394c711 Add -march=skylake-avx512 when required
fixes #1797
2018-10-08 19:18:12 +02:00
Martin Kroeker
02ef20a1e4 Merge pull request #1786 from martin-frbg/immintrin
Check for Immintrin.h presence in the AVX512 compatibility test as well
2018-10-04 09:07:09 +02:00
Martin Kroeker
4c3643ed7f Check availability of immintrin.h in the AVX512 compatibility test 2018-10-04 07:36:49 +02:00
Yuri
2349e15149 Allow to install the 'interfare64' version concurrently with the regular version 2018-09-15 21:00:03 -07:00
Martin Kroeker
b1b743f434 Merge branch 'develop' into interim033 2018-08-25 19:45:19 +02:00
Martin Kroeker
2a589c4b28 Add USE_TLS option to switch between old and new memory.c 2018-08-25 19:36:12 +02:00
Martin Kroeker
25f2d25cfe Merge pull request #1697 from martin-frbg/issue1696
Do not treat WIndows UWB builds as cross-compiling
2018-07-25 19:55:29 +02:00
Martin Kroeker
73131fa30a Do not treat WIndows UWB builds as cross-compiling 2018-07-24 17:46:33 +02:00
Martin Kroeker
b74aef2816 Add -march=skylake-avx512 to AVX512 compile check and suppress its output 2018-07-03 14:41:44 +02:00
Martin Kroeker
26e1cfb653 Merge pull request #1607 from martin-frbg/dynarch
Move some x86_64 DYNAMIC_ARCH targets to new DYNAMIC_OLDER option
2018-06-14 16:52:55 +02:00
Martin Kroeker
02634b549b Add template for OpenBLASConfig.cmake 2018-06-10 09:25:46 +02:00
Martin Kroeker
1cbd8f3ae4 Move some DYNAMIC_ARCH targets to new DYNAMIC_OLDER option 2018-06-09 16:30:46 +02:00
Martin Kroeker
cf234a0561 Merge pull request #1589 from fenrus75/skylakex
Initial support for SkylakeX / AVX512
2018-06-06 22:07:09 +02:00
Martin Kroeker
e4718b1fee Better AVX512 test case 2018-06-06 16:51:30 +02:00
Martin Kroeker
7fb62aed7e Check build system support for AVX512 instructions 2018-06-05 23:29:33 +02:00
Arjan van de Ven
99c7bba8e4 Initial support for SkylakeX / AVX512
This patch adds the basic infrastructure for adding the SkylakeX (Intel Skylake server)
target. The SkylakeX target will use the AVX512 (AVX512VL level) instruction set,
which brings 2 basic things:
1) 512 bit wide SIMD (2x width of AVX2)
2) 32 SIMD registers (2x the number on AVX2)

This initial patch only contains a trivial transofrmation of the Haswell SGEMM kernel
to AVX512VL; more will follow later but this patch aims to get the infrastructure
in place for this "later".

Full performance tuning has not been done yet; with more registers and wider SIMD
it's in theory possible to retune the kernels but even without that there's an
interesting enough performance increase (30-40% range) with just this change.
2018-06-03 07:58:52 +00:00
Martin Kroeker
6791294312 Merge pull request #1559 from martin-frbg/buildconf
Add build-time configuration options to pkgconfig file
2018-05-14 18:49:53 +02:00
Martin Kroeker
7d7564568c Add build-time configuration options to pkgconfig file 2018-05-14 00:09:35 +02:00
Zhiyong Dang
1b83341d19 Fix race condition in blas_server_omp.c
Change-Id: Ic896276cd073d6b41930c7c5a29d66348cd1725d
2018-04-27 17:00:42 +08:00
Sacha
f81815e48a Fix CMake cross-compiling
Without specifying thread count, NUM_THREADS would not be defined and CMake would fail.
This is because core count cannot be determined when cross-compiling.
2018-02-28 10:25:25 +10:00
xoviat
038bfbb86c CMake: Remove unused wall option when FC=flang 2018-01-26 14:09:48 -06:00
Martin Kroeker
599de9e598 Restore LAPACKE files for Xgeqpf, Xggsvd and Xggsvp
These were inadvertently dropped from the list in my PR #1095
2017-12-21 19:43:09 +01:00
Martin Kroeker
0dc291d3fa Merge pull request #1377 from isuruf/threads
Allow overriding NUM_THREADS in cmake
2017-12-01 16:22:35 +01:00