Martin Kroeker
abbc65cff4
Cortex X1 is only Arm8.2
2022-03-28 17:40:27 +02:00
Martin Kroeker
57dd92a662
Add initial support for ARMV9 Cortex 510/710/X1/X2
2022-03-27 15:26:42 +02:00
Markus Mützel
aeb561d234
Add support for Intel Fortran compilers.
...
Port changes from upstream Reference-LAPACK.
2022-03-25 13:37:15 +01:00
Markus Mützel
00f44bfff7
cmake: Check if Fortran compiler is usable before enabling it.
2022-01-21 13:27:17 +01:00
Martin Kroeker
a9e297e476
Fix handling of ifdef/ifndef
2022-01-09 23:31:59 +01:00
Martin Kroeker
b6b024232d
Merge pull request #3508 from snadampal/v1_n2
...
OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics
2022-01-09 14:50:26 +01:00
Sunita Nadampalli
19c8f615dc
OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics
2022-01-07 00:28:17 +00:00
jgillis
ea3db69faa
Fix cmake crosscompilation for core2 target
...
Missing HAVE_SSE* cmake variables cause cc.cmake to forget about `-msse*` flags
2021-12-29 22:50:20 +01:00
Rafael Cardoso Fernandes Sousa
d38110a5ce
Use CMake variables instead of as
2021-12-10 17:46:53 -06:00
Rafael Cardoso Fernandes Sousa
214fbcee15
Fix cmake for power
2021-12-09 08:28:17 -06:00
Martin Kroeker
454edd741c
Merge pull request #3425 from binebrank/arm_sve_dgemm
...
Add dgemm kernel for arm64 SVE
2021-11-26 16:14:55 +01:00
Martin Kroeker
bcfbdc81b2
Merge pull request #3459 from rafaelcfsousa/fix_cmake
...
Fix issues when building OpenBLAS with cmake
2021-11-26 15:19:24 +01:00
Bine Brank
1af73ce38e
Adapt CMake for SVE
2021-11-26 10:35:01 +01:00
Rafael Cardoso Fernandes Sousa
d5c9353f1b
Modify the order that cmake set the KERNEL variables (generic now is fallback)
2021-11-24 20:08:35 -06:00
Rafael Cardoso Fernandes Sousa
fb891f33da
Fix the cmake parser to identify more patterns
2021-11-24 14:07:28 -06:00
Martin Kroeker
a3cd36acff
Add CMAKE support for cross-compiling to MIPS32
2021-11-20 17:34:28 +01:00
Markus Mützel
de2ed66596
cmake: Set SUFFIX64 also for NOFORTRAN
2021-11-15 08:53:52 +01:00
Martin Kroeker
02ea3db8e7
Merge pull request #3404 from guowangy/spr-build
...
Initial build support for Sapphire Rapids
2021-10-17 23:05:11 +02:00
مهدي شينون (Mehdi Chinoune)
efd7ac241d
Fix MinGW/Clang 64 bits detection.
...
CMAKE_COMPILER_IS_GNUCC is only valid for GCC.
2021-10-16 08:02:27 +01:00
Wangyang Guo
3dc6052c7e
initial support for Sapphire Rapids platform
2021-10-12 01:30:40 -07:00
Martin Kroeker
e02df9fc55
Propagate BUILD_BFLOAT16 to CFLAGS
2021-09-14 16:12:27 +02:00
Martin Kroeker
1c0a8a714a
Add defaults for SBGEMV kernels
2021-09-14 16:10:58 +02:00
Martin Kroeker
af19cda65a
Add "recursive" option for IBM xlf compiler ( #3359 )
...
* Add correct "recursive" option for xlf (from reference-lapack issue 606)
2021-09-04 18:26:59 +02:00
Martin Kroeker
bec9d9f63d
Merge pull request #3335 from guowangy/small-matrix-latest
...
Add GEMM optimization for small matrix and single/double kernel for skylakex
2021-08-29 22:33:33 +02:00
cianciosa
4c766cd11f
Fix a small syntax error. A ( was accidently deleted.
2021-08-11 12:08:34 -04:00
cianciosa
c28560129f
Check the total number of arguments passed insead of if the ARGV# is defined. This fixes a problem when compling openblas as a subproject of another code.
2021-08-11 12:00:07 -04:00
Wangyang Guo
76ea8db4da
Small Matrix: enable by default for x86_64 arch
...
If no customized GEMM_SMALL_M_PERMIT kernel defined, it will just by pass to normal path.
2021-08-05 02:59:36 +00:00
Wangyang Guo
fee5abd84b
Small Matrix: support cmake build
2021-08-04 08:50:15 +00:00
gxw
0b8f7c8c10
Add cmake support for LOONGARCH64
2021-08-02 10:00:41 +08:00
Martin Kroeker
47ba85f314
Fix regex to match kernels suffixed with cpuname too
2021-07-22 17:24:15 +02:00
Martin Kroeker
30f23be0f9
Rework setting of -mfma to only apply it where necessary
2021-07-22 12:00:03 +02:00
User User-User
91e2b11d3c
add to cmake listings too
2021-06-20 15:32:42 +02:00
Martin Kroeker
13fa9f737d
Modify defines for CR and RC to work around name collision on Windows
2021-06-16 12:17:25 +02:00
Martin Kroeker
db50b24a4a
Add entries for the new Householder Reconstruction functions from 3.9.1
2021-05-02 19:55:15 +02:00
Martin Kroeker
40000d1f64
Add entries for Householder reconstruction functions from 3.9.1
2021-05-02 19:21:59 +02:00
刘雨培
725432efaa
pass NO_AVX512 macro def
2021-04-07 00:10:41 +08:00
Jake Arkinstall
d7a77091a3
Addressed issue #3100 , removing an unnecessary write to the include directory
2021-02-10 12:11:17 +00:00
Martin Kroeker
33b5670122
Merge pull request #3096 from martin-frbg/fixclangcmake
...
Fix Cooperlake/DYNAMIC_ARCH builds with clang on Windows
2021-02-02 13:33:15 +01:00
Martin Kroeker
95e19e2e23
fix case in compiler name check
...
Co-authored-by: xoviat <49173759+xoviat@users.noreply.github.com>
2021-02-02 10:53:46 +01:00
Martin Kroeker
99ac042702
remove spurious lines (probably editor malfunction)
2021-02-01 21:02:53 +01:00
Martin Kroeker
774b9f8653
handle AppleClang in Cooperlake support condition
2021-02-01 20:18:53 +01:00
Martin Kroeker
eb1d2344f7
Fix compiler version check for Intel Cooperlake support (clang-cl does not accept -dumpversion)
2021-02-01 19:45:25 +01:00
Martin Kroeker
0cc36770f1
Merge pull request #3073 from xoviat/embedded
...
add embedded option
2021-01-31 18:02:41 +01:00
Martin Kroeker
cb61d3b46b
Add DYNAMIC_LIST support for ARM64
2021-01-25 13:13:20 +01:00
xoviat
b60de4447a
add cortex-m platform
2021-01-19 08:57:44 -06:00
Martin Kroeker
89ae305e11
Workaround for cmake having its own C_COMPILER variable
2021-01-13 12:30:26 +01:00
Martin Kroeker
ec4d77c47c
Add -mfma for HAVE_FMA3 in the non-DYNAMIC_ARCH case as well
2020-11-13 09:16:34 +01:00
Martin Kroeker
a29338aaa6
Remove extraneous quotes that caused a cmake policy warning
2020-11-07 20:27:42 +01:00
Martin Kroeker
438a8e5624
Fix placement of getarch call and spurious cpu property accumulation in DYNAMIC_ARCH builds
2020-11-07 20:26:12 +01:00
Martin Kroeker
0155cd53a3
Add -msse3 where needed for DYNAMIC_ARCH builds
2020-11-03 23:45:49 +01:00
Martin Kroeker
a9f9354296
Fix target test
2020-11-02 23:17:46 +01:00
Martin Kroeker
b9bc76aec4
Add files via upload
2020-11-02 22:43:50 +01:00
Martin Kroeker
e5f8c2bf8a
typo fix
2020-11-01 22:25:43 +01:00
Martin Kroeker
6baf8af658
Disable EXPRECISION for the combination of DYNAMIC_CORE and GENERIC target
2020-11-01 22:11:48 +01:00
Chen, Guobing
a7b1f9b1bb
Implementation of BF16 based gemv
...
1. Add a new API -- sbgemv to support bfloat16 based gemv
2. Implement a generic kernel for sbgemv
3. Implement an avx512-bf16 based kernel for sbgemv
Signed-off-by: Chen, Guobing <guobing.chen@intel.com>
2020-10-29 02:08:23 +08:00
Martin Kroeker
eddc65c7b7
Add POWER10 support flag (unconditionally for now)
2020-10-20 01:09:49 +02:00
Martin Kroeker
f5902ab0a1
Support cross-compiling for Apple Vortex
2020-10-18 19:10:58 +02:00
Martin Kroeker
f64243ff57
Add compiler options for sse/sse2/ssse3/sse4.1
2020-10-16 10:47:06 +02:00
Martin Kroeker
786c0a3ce8
Add sse options for use of intrinics with older compilers
2020-10-16 10:41:53 +02:00
Martin Kroeker
756802df61
Merge pull request #2890 from martin-frbg/s-d-sum
...
Revert special handling of Windows xNRM2 and enable C+intrinsics kern…
2020-10-14 09:02:03 +02:00
Martin Kroeker
75e3a92df6
Add express -mavx and -msse options (and fix a stray = for cooperlake)
2020-10-14 01:01:58 +02:00
Martin Kroeker
e3a29f6b58
Change "HALF" and "sh" to "BFLOAT16" and "sb"
2020-10-12 00:07:37 +02:00
Martin Kroeker
68e6823d36
Adapt for supporting only a subset of variable types
2020-10-11 15:01:32 +02:00
Martin Kroeker
88928650c4
Merge pull request #2883 from martin-frbg/issue2872
...
Minor CMAKE fixes
2020-10-11 10:30:33 +02:00
Martin Kroeker
82a497ec5d
restore PRESCOTT default for DYNAMIC_LIST
2020-10-11 00:43:09 +02:00
Martin Kroeker
de27e4f5fb
Stop DYNAMIC_ARCH build if the toplevel source contains a stray config_kernel.h from a gmake build
...
This is unlikely to happen in practice, but if it does, the rogue file would get included instead of the dynamically generated version for each target_core, leading to very confusing errors like "invalid operands (undefined UND and ABS sections)" in compilation of the assembly kernels as macros like PREFETCH would remain undefined
2020-10-11 00:40:22 +02:00
Martin Kroeker
e1b7123bbe
Merge pull request #2867 from Qiyu8/usimd-floatdot
...
Optimize the performance of dot by using universal intrinsics in X86/ARM
2020-10-10 12:10:25 +02:00
Qiyu8
f32d34a015
add sse3 compiler flag
2020-10-10 10:36:15 +08:00
Martin Kroeker
a5feea6611
make BLAS3_MEM_ALLOC_THRESHOLD configurable on non-Windows
2020-10-04 23:01:06 +02:00
Martin Kroeker
2367726578
Remove redundant status message
2020-09-30 23:28:49 +02:00
Martin Kroeker
c4aeeeb9f4
Activate all BUILD_ options if none was specified
2020-09-15 23:15:34 +02:00
Martin Kroeker
91c84e1c01
Merge pull request #2796 from Guobing-Chen/BF16_dot_coversion_apis
...
Add bfloat16 based dot and conversion with single/double
2020-09-14 15:00:19 +02:00
Martin Kroeker
26792d2096
Copy BUILD_* directives to the compiler options to allow ifdef in tests
2020-09-13 21:47:55 +02:00
Chen, Guobing
deaeb6c5b8
Add bfloat16 based dot and conversion with single/double
...
1. Added bfloat16 based dot as new API: shdot
2. Implemented generic kernel and cooperlake-specific (AVX512-BF16) kernel for shdot
3. Added 4 conversion APIs for bfloat16 data type <=> single/double: shstobf16 shdtobf16 sbf16tos dbf16tod
shstobf16 -- convert single float array to bfloat16 array
shdtobf16 -- convert double float array to bfloat16 array
sbf16tos -- convert bfloat16 array to single float array
dbf16tod -- convert bfloat16 array to double float array
4. Implemented generic kernels for all 4 conversion APIs, and cooperlake-specific kernel for shstobf16 and shdtobf16
5. Update level1 thread facilitate functions and macros to support multi-threading for these new APIs
6. Fix Cooperlake platform detection/specify issue when under dynamic-arch building
7. Change the typedef of bfloat16 from unsigned short to more strict uint16_t
Signed-off-by: Chen, Guobing <guobing.chen@intel.com>
2020-09-04 02:31:25 +08:00
Martin Kroeker
68b1713c30
Merge pull request #2811 from martin-frbg/issue2806
...
Make NO_AVX512 option override the AVX512 compile test in CMAKE builds as well
2020-09-01 17:19:14 +02:00
Martin Kroeker
0a4c5c4c44
Merge pull request #2807 from martin-frbg/issue2804
...
Work around ARMV8 build-time cpu detection problems on non-Linux systems
2020-08-31 23:44:56 +02:00
Martin Kroeker
5feb087c05
Handle Apple labeling armv8 as arm64 rather than aarch64
2020-08-31 20:02:08 +02:00
Martin Kroeker
7c0977c267
Add OpenMP dependency to pkgconfig file if needed
2020-08-22 13:53:44 +02:00
Martin Kroeker
bd3207b4b4
Update system.cmake
2020-08-19 22:51:10 +02:00
Martin Kroeker
b8ebfc9335
Update system.cmake
2020-08-19 22:30:19 +02:00
Martin Kroeker
7c1986640b
fallback from cooperlake to skylake if gcc<10
2020-08-19 20:48:39 +02:00
Martin Kroeker
71d33c952d
Typo fix
2020-08-19 17:44:23 +02:00
Martin Kroeker
6a3c074786
-march=cooperlake requires gcc10
2020-08-19 17:22:12 +02:00
Martin Kroeker
430f741b30
-march=cooperlake requires gcc10
2020-08-19 17:17:53 +02:00
Chen, Guobing
e740c4873d
Enable COOPERLAKE build target
...
Enable new build target platform -- COOPERLAKE. This target platform
supports all the SKYLAKEX supported ISAs + avx512bf16. So all the
SKYLAKEX specific kernels/drivers and related code are now extended
to be also active on COOPERLAKE. Besides, new BF16 related kernels
are active under this target.
2020-08-13 06:18:00 +08:00
Martin Kroeker
cb097beba2
Merge pull request #2741 from martin-frbg/issue2739
...
Adjust A53 SGEMM parameters to reflect recent switch to 8x8 kernel
2020-07-29 10:01:14 +02:00
Martin Kroeker
64e2e4aaf3
missing braces
2020-07-27 20:19:22 +00:00
Martin Kroeker
921ec4e9e2
Adjust A53 SGEMM parameters to reflect move to 8x8 kernel
2020-07-27 19:54:46 +00:00
Ashwin Sekhar T K
4e1be0e481
ARM64: Add THUNDERX3T110 Target
2020-07-26 23:32:24 -07:00
Martin Kroeker
9e21a100e3
Add trivial check for stdatomic.h
2020-07-20 22:52:09 +00:00
Martin Kroeker
9d000ecaa2
include CheckLanguage module
2020-07-16 22:36:35 +00:00
Martin Kroeker
a847d00366
handle missing lack of fortran compiler more gracefully
2020-07-16 22:17:39 +00:00
Martin Kroeker
6eaeb01263
Merge pull request #2658 from RajalakshmiSR/p10
...
powerpc: Add support for future processor
2020-06-23 00:02:37 +02:00
Martin Kroeker
6876221cf3
Remove optimization level limit for flang again and add -fno-unroll-loops for AOCC flang 2.x instead
2020-06-14 17:40:24 +02:00
Martin Kroeker
1dd712131e
Fix spelling of flang option -Mrecursive and add -Kieee
2020-06-14 00:09:31 +02:00
Rajalakshmi Srinivasaraghavan
9fe930f205
powerpc: Add support for future processor
...
This is the initial patch to support build infrastructure
for POWER10 architecture.
2020-06-11 15:47:20 -05:00
Martin Kroeker
3ce469a34f
Limit optimization level to O1 for flang and add -frecursive
2020-06-09 16:11:13 +02:00
Martin Kroeker
79cd69fea4
Merge pull request #2644 from martin-frbg/cmake-maxstack
...
Add CMAKE support for MAX_STACK_ALLOC setting
2020-06-05 08:33:48 +02:00
Martin Kroeker
bb12c2c854
Limit MAX_STACK_ALLOC availability to non-Wndows
2020-06-04 19:07:27 +02:00
Martin Kroeker
6e97df7b47
Add CMAKE support for MAX_STACK_ALLOC setting
2020-06-04 14:45:31 +02:00
Martin Kroeker
4db00121dc
Disable EXPRECISION and add -lm on OSX (same as the BSDs and Linux)
2020-05-31 12:39:36 +02:00
Martin Kroeker
cd10b35fe9
Handle trailing spaces and empty condition variables
2020-05-09 13:42:33 +02:00
Martin Kroeker
5dd14e3d48
Make building the bfloat16 functions conditional on option BUILD_HALF ( #2590 )
...
* make building the bfloat16 BLAS functions conditional on BUILD_HALF
* pass the BUILD_HALF option to gensymbol
* Pass BUILD_HALF as a compiler define for dynamic_arch builds
2020-05-01 09:58:30 +02:00
Martin Kroeker
3bd56846bb
Silence a debug message
2020-04-27 16:27:09 +02:00
Martin Kroeker
e7bbdfdf84
Have CMAKE parse conditional lines in KERNEL files
...
Supports ifeq and ifneq, but requires both to have an else branch
2020-04-27 15:20:03 +02:00
Martin Kroeker
70869d571f
Quote include paths for getarch to protect any embedded spaces
2020-04-24 10:30:44 +02:00
Martin Kroeker
4f70512b97
Update kernel.cmake
2020-04-19 08:10:26 +02:00
Martin Kroeker
d0737b0142
Update kernel.cmake
2020-04-18 21:36:28 +02:00
Martin Kroeker
a83a59b038
Use generic kernels for ishama,shasum,shdot,shrot
2020-04-18 15:53:51 +02:00
Martin Kroeker
0a19bd813c
Use generic codes for shamax and shcopy
2020-04-18 12:52:51 +02:00
Martin Kroeker
f361de30a3
Use generic axpy.c for SHAXPY as x86 lacks saxpy.c
2020-04-18 11:07:16 +02:00
Martin Kroeker
9f6d6f6cb6
use saxpy.c instead of axpy.S for SHAXPY
2020-04-17 22:27:58 +02:00
Rajalakshmi Srinivasaraghavan
22bb50fb81
cmake fixes
2020-04-17 13:35:17 -05:00
Rajalakshmi Srinivasaraghavan
7eb55504b1
RFC : Add half precision gemm for bfloat16 in OpenBLAS
...
This patch adds support for bfloat16 data type matrix multiplication kernel.
For architectures that don't support bfloat16, it is defined as unsigned short
(2 bytes). Default unroll sizes can be changed as per architecture as done for
SGEMM and for now 8 and 4 are used for M and N. Size of ncopy/tcopy can be
changed as per architecture requirement and for now, size 2 is used.
Added shgemm in kernel/power/KERNEL.POWER9 and tested in powerpc64le and
powerpc64. For reference, added a small test compare_sgemm_shgemm.c to compare
sgemm and shgemm output.
This patch does not cover OpenBLAS test, benchmark and lapack tests for shgemm.
Complex type implementation can be discussed and added once this is approved.
2020-04-14 14:55:08 -05:00
Martin Kroeker
a05243d0f2
ifort and pgfort need "recursive" for compiling LAPACK as well
...
as shown in Reference-LAPACK issue 401 (their PR 403)
2020-04-01 15:38:07 +02:00
Martin Kroeker
8c7c1395da
Merge pull request #2521 from martin-frbg/cm-avx512
...
Use proper extension on the avx512 testcase filename
2020-03-22 01:03:42 +01:00
Martin Kroeker
1d9773b800
Use proper extension on the avx512 testcase filename
...
The need to call it .tmp existed only when it was generated by a tmpfile call, and the "-x c" option to tell the compiler it is actually a C source is not universally supported (this broke the test with clang-cl at least)
2020-03-20 23:05:53 +01:00
Martin Kroeker
6d54c94760
Make ifort on Windows create lowercase symbols with appended underscore
...
tentative fix for #2472
2020-03-20 01:08:10 +01:00
مهدي شينون (Mehdi Chinoune)
21f6c4b5a9
fixes #2480
2020-03-02 17:22:28 +01:00
Ali Saidi
c623a965f9
Add Neoverse-N1 core
...
The implementation is a hybird of the ARMV8 one with some of the
improved TX2 rountines along with specifying -march=v8.2-a
2020-02-29 03:22:04 +00:00
Martin Kroeker
ca4f7dceff
Add parameters for EMAG8180 DYNAMIC_ARCH support with cmake
2020-02-24 20:23:18 +01:00
Martin Kroeker
1ddf9f1067
Add EMAG8180 to arm64 DYNAMIC_ARCH list for cmake
2020-02-24 20:16:18 +01:00
Martin Kroeker
7f0d523b42
Make BUFFER_SIZE configurable
2020-02-09 23:32:57 +01:00
Martin Kroeker
8dc9fd4dfe
Add -march option for AVX512
2020-01-30 12:41:18 +01:00
Martin Kroeker
375b1875c8
[WIP] Update LAPACK to 3.9.0 ( #2353 )
...
* Update make.inc entries for LAPACK 3.9.0
Reference-LAPACK PR 347 changed some variable names and relative paths
* Update LAPACK to 3.9.0
* Add new functions from LAPACK 3.9.0
* Add new functions from LAPACK 3.9.0
* Restore LOADER command
as it makes it easier to specify pthread as needed
* Restore LOADER
* Restore EIG/LIN prefixes in cmdbase
* add binary path to lapack_testing.py call
* Restore OpenMP version check
* Restore OpenMP version check
* Restore fix for out-of-bounds array accesses
from #2096
2020-01-01 13:18:53 +01:00
Martin Kroeker
a4896b5538
Update DYNAMIC_ARCH support for ARM64 and PPC ( #2332 )
...
* Update DYNAMIC_ARCH list of ARM64 targets for gmake
* Update arm64 cpu list for runtime detection
* Update DYNAMIC_ARCH list of ARM64 targets for cmake and add POWERPC targets
2019-12-04 11:06:03 +01:00
k.dunikowski
8691825944
Fixed a minor cmake problem, occuring when DYNAMIC_CORE=ON and CMAKE_C_FLAGS was empty
2019-10-28 08:51:05 +01:00
Martin Kroeker
eb45eb6942
Fix C compiler handling and BINARY=32 mode in CMAKE builds ( #2248 )
...
* Fix compiler identification and option setting
* Handle BINARY=32 option on X86_64
* Add xGEMM3M unroll parameters for crossbuild-target CORE2
* Replace bogus mingw64/32bit CI job with actual 32bit build
mingw64 is not multilib-capable, so using an x86_64-mingw with BINARY=32 in the CI was not going to work anyway (but build passed while BINARY=32 was ignored).
2019-09-10 08:27:06 +02:00
Martin Kroeker
fde8a8e6a0
Improve cmake build behaviour with non-host cpu targets ( #2246 )
...
1. Supply appropriate values for C/Z GEMM unroll when cross-compiling for CORE2 or ARMV7
2. Add the required xLOCAL_BUFFER_SIZE parameters for cross-compiling CORE2
3. Add -DFORCE_<target> option to getarch when building with -DTARGET=target
for #2245
2019-09-03 22:41:17 +02:00
Martin Kroeker
1fec0570f6
Add cgemm and zgemm unroll factors for core2
2019-09-02 15:03:45 +02:00
Martin Kroeker
bf0d92a310
Add arch data for cross-compiling to CORE2
...
for #2235
2019-08-28 17:35:56 +02:00
Martin Kroeker
e3d846ab57
Do not use -march=native with the PGI compiler
2019-08-16 08:58:10 +02:00
Tyler Reddy
3f6ab1582a
MAINT: remove legacy CMake endif()
...
* clean up a case where CMake endif()
contained the conditional used in the
if(), which is no longer needed /
discouraged since our minimum required
CMake version supports the modern syntax
2019-07-22 21:24:57 -06:00
Martin Kroeker
8fb76134bc
Mingw32 needs leading underscore on object names
...
(also copy BUNDERSCORE settings for FORTRAN from the corresponding Makefile)
2019-07-06 15:07:15 +02:00
Martin Kroeker
04d671aae2
Make disabling DYNAMIC_ARCH on unsupported systems work
...
needs to be unset in the cache for the change to have any effect
2019-07-06 15:05:04 +02:00
Martin Kroeker
f69a0be712
Add getarch flags to disable AVX on x86
...
(and other small fixes to match Makefile behaviour)
2019-07-06 15:02:39 +02:00
Martin Kroeker
ece0bfb881
Merge pull request #2158 from martin-frbg/issue2143
...
Remove any inadvertent use of -march=native from DYNAMIC_ARCH builds
2019-06-10 14:08:11 +02:00
Martin Kroeker
1f4b6a5d5d
Remove any inadvertent use of -march=native from DYNAMIC_ARCH builds
...
from #2143 , -march=native precludes use of more specific options like -march=skylake-avx512 in individual kernels, and defeats the purpose of dynamic arch anyway.
2019-06-10 09:50:13 +02:00
Martin Kroeker
be8f70d269
Merge pull request #2157 from martin-frbg/2154-2
...
Add gfortran workaround for potential ABI violation
2019-06-09 12:19:08 +02:00
Martin Kroeker
e674e1c735
Update fc.cmake
2019-06-09 09:31:13 +02:00
Martin Kroeker
6ca898b63b
Add gfortran workaround for potential ABI violation
...
for #2154
2019-06-08 23:17:03 +02:00
Michael Lass
7a9a4dbc4f
Fix detection of AVX512 capable compilers in getarch
...
21eda8b5 introduced a check in getarch.c to test if the compiler is capable of
AVX512. This check currently fails, since the used __AVX2__ macro is only
defined if getarch itself was compiled with AVX2/AVX512 support. Make sure this
is the case by building getarch with -march=native on x86_64. It is only
supposed to run on the build host anyway.
2019-06-05 17:30:56 +02:00
Martin Kroeker
1e52572be3
Add option USE_LOCKING for single-threaded build with locking support
2019-05-15 23:19:30 +02:00
luz.paz
daf2fec12d
Misc. typo fixes
...
Found via `codespell -q 3 -w -L ith,als,dum,nd,amin,nto,wis,ba -S ./relapack,./kernel,./lapack-netlib`
2019-04-29 17:03:56 -04:00
Martin Kroeker
ccfb7ead15
Merge pull request #2072 from martin-frbg/sum
...
Add (C)BLAS extension ?sum
2019-04-23 20:11:36 +02:00
Martin Kroeker
e06b8438b4
Merge pull request #2080 from martin-frbg/issue2075
...
Add -lm and disable EXPRECISION support on *BSD
2019-04-02 21:40:58 +02:00
Martin Kroeker
9229d6859b
Add -lm and disable EXPRECISION support on *BSD
...
fixes #2075
2019-04-02 09:38:18 +02:00
Martin Kroeker
d17da6c6a4
Add cmake defaults for ?sum kernels
2019-03-31 11:57:01 +02:00
Martin Kroeker
1679de5e59
Detect 32bit environment on 64bit ARM hardware
...
for #2056 , using same approach as #2058
2019-03-31 10:50:43 +02:00
Sacha
c3e30b2bc2
Change 64-bit detection as explained in #2056
2019-03-13 23:21:54 +10:00