Commit Graph

3830 Commits

Author SHA1 Message Date
Martin Kroeker 04d671aae2
Make disabling DYNAMIC_ARCH on unsupported systems work
needs to be unset in the cache for the change to have any effect
2019-07-06 15:05:04 +02:00
Martin Kroeker f69a0be712
Add getarch flags to disable AVX on x86
(and other small fixes to match Makefile behaviour)
2019-07-06 15:02:39 +02:00
Martin Kroeker ae9e8b131e
Add mingw builds to Appveyor config 2019-07-06 14:30:33 +02:00
Martin Kroeker 9086543f50
Utest needs CBLAS but not necessarily FORTRAN 2019-07-06 14:29:47 +02:00
Martin Kroeker abea977ded
Merge pull request #2162 from martin-frbg/pgi
Fixes for PGI compiler
2019-07-03 19:16:30 +02:00
Martin Kroeker 6b6c9b1441
Merge pull request #2172 from quickwritereader/develop
power9 cgemm/ctrmm. new sgemm 8x16
2019-07-01 21:06:02 +02:00
AbdelRauf a97b301aaa cgemm/ctrmm power9 2019-07-01 14:07:54 +00:00
Martin Kroeker 2f13f04224
Merge pull request #2170 from pkubaj/patch-1
Fix build on PPC970 for FreeBSD
2019-06-30 23:29:02 +02:00
pkubaj 7c7505a778
Fix build for PPC970 on FreeBSD pt.2
FreeBSD needs those macros too.
2019-06-28 10:31:45 +00:00
pkubaj 5a4f1a2118
Fix build for PPC970 on FreeBSD pt. 1
FreeBSD needs DCBT_ARG=0 as well.
2019-06-28 10:29:44 +00:00
Martin Kroeker 3b761892df
Merge pull request #2169 from pkubaj/develop
Fix build on FreeBSD/powerpc64.
2019-06-25 12:56:33 +02:00
Piotr Kubaj eebfeba768 Fix build on FreeBSD/powerpc64.
Signed-off-by: Piotr Kubaj <pkubaj@anongoth.pl>
2019-06-25 10:58:56 +02:00
Martin Kroeker 7684c4f8f8
PGI compiler does not like -march=native 2019-06-20 19:56:01 +02:00
Martin Kroeker 7faf42b7bb
Merge pull request #2167 from kavanabhat/dtrmm_power8_segfault
Fix DTRMMKERNEL register save for power8 64-bit mode (Fix for #2166)
2019-06-19 14:38:01 +02:00
kavanabhat a575f1e4c7
Update dtrmm_kernel_16x4_power8.S 2019-06-19 15:27:14 +05:30
AbdelRauf cdbfb891da new sgemm 8x16 2019-06-17 15:33:38 +00:00
Martin Kroeker 280552b988
Fix mov syntax 2019-06-16 18:35:43 +02:00
Martin Kroeker bbd4bb0154
Zero ecx with a mov instruction
PGI assembler does not like the initialization in the constraints.
2019-06-16 15:04:10 +02:00
Martin Kroeker 6d3efb2b58
Update Makefile.x86_64 2019-06-14 08:08:11 +02:00
Martin Kroeker d9ff2cd90d
Do not force gcc options on non-gcc compilers
fixes compile failure with pgi 18.10 as reported on OpenBLAS-users
2019-06-13 23:01:35 +02:00
Martin Kroeker 2a43062de7
Merge pull request #2159 from martin-frbg/issue2149
Avoid unintentional activation of TLS codepath via USE_TLS=0
2019-06-10 19:12:45 +02:00
Martin Kroeker 4ea794a522
Avoid unintentional activation of TLS code via USE_TLS=0
fixes #2149
2019-06-10 17:24:15 +02:00
Martin Kroeker ece0bfb881
Merge pull request #2158 from martin-frbg/issue2143
Remove any inadvertent use of -march=native from DYNAMIC_ARCH builds
2019-06-10 14:08:11 +02:00
Martin Kroeker 1f4b6a5d5d
Remove any inadvertent use of -march=native from DYNAMIC_ARCH builds
from #2143, -march=native precludes use of more specific options like -march=skylake-avx512 in individual kernels, and defeats the purpose of dynamic arch anyway.
2019-06-10 09:50:13 +02:00
Martin Kroeker be8f70d269
Merge pull request #2157 from martin-frbg/2154-2
Add gfortran workaround for potential ABI violation
2019-06-09 12:19:08 +02:00
Martin Kroeker e674e1c735
Update fc.cmake 2019-06-09 09:31:13 +02:00
Martin Kroeker 6ca898b63b
Add gfortran workaround for potential ABI violation
for #2154
2019-06-08 23:17:03 +02:00
Martin Kroeker 26411acd56
Merge pull request #2148 from TiborGY/cpp_thread_test_2
Thread safety tester using C++11 threading (cleaned history)
2019-06-07 13:23:07 +02:00
Martin Kroeker 0ab4076dd8
Merge pull request #2156 from martin-frbg/issue2154
Add gfortran workaround for C->FORTRAN ABI violation
2019-06-06 13:43:12 +02:00
Martin Kroeker a0caa762b3
Add gfortran workaround for ABI violations
for #2154 (see gcc bug 90329)
2019-06-06 10:24:16 +02:00
Martin Kroeker 900d5a3205
Add gfortran workaround for ABI violations in LAPACKE
for #2154 (see gcc bug 90329)
2019-06-06 10:18:40 +02:00
Martin Kroeker a17cf36225
Merge pull request #2153 from quickwritereader/develop
improved power9 zgemm,sgemm
2019-06-06 07:42:56 +02:00
AbdelRauf 148c4cc5fd conflict resolve 2019-06-05 20:50:50 +00:00
AbdelRauf d0c3543c3f power9 zgemm ztrmm optimized 2019-06-05 20:07:16 +00:00
Martin Kroeker 909ad04aef
Merge pull request #2145 from martin-frbg/1912-3
Separate implementations of AMAX and IAMAX on arm
2019-06-05 20:27:45 +02:00
Martin Kroeker 417efd41c6
Merge pull request #2110 from pc2/cpu-detection
Fix detection of Skylake processors when using GCC
2019-06-05 20:27:05 +02:00
Michael Lass 9cdc828afa c_check: Unlink correct file 2019-06-05 17:31:01 +02:00
Michael Lass 7a9a4dbc4f Fix detection of AVX512 capable compilers in getarch
21eda8b5 introduced a check in getarch.c to test if the compiler is capable of
AVX512. This check currently fails, since the used __AVX2__ macro is only
defined if getarch itself was compiled with AVX2/AVX512 support. Make sure this
is the case by building getarch with -march=native on x86_64. It is only
supposed to run on the build host anyway.
2019-06-05 17:30:56 +02:00
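The macro behaviour described in the commit message above can be illustrated with a small C sketch. It is not the code from getarch.c; the function names are hypothetical and chosen only for illustration. It relies on GCC and Clang predefining `__AVX2__` and `__AVX512F__` only when the corresponding instruction sets are enabled for the translation unit, for example through `-march=native` on a capable build host.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: 1 if the compiler predefined __AVX2__ for this
 * translation unit (AVX2 enabled at compile time), otherwise 0. */
static int compiled_with_avx2(void) {
#ifdef __AVX2__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if the compiler predefined __AVX512F__
 * (AVX-512 Foundation enabled at compile time), otherwise 0. */
static int compiled_with_avx512f(void) {
#ifdef __AVX512F__
    return 1;
#else
    return 0;
#endif
}

/* Print both results to `out` and return how many of the two macros
 * were defined when this file was compiled. */
static int report_simd_macros(FILE *out) {
    assert(out != NULL);
    fprintf(out, "__AVX2__: %d\n", compiled_with_avx2());
    fprintf(out, "__AVX512F__: %d\n", compiled_with_avx512f());
    return compiled_with_avx2() + compiled_with_avx512f();
}
```

Compiled without any `-m` or `-march` flags on a typical x86_64 toolchain, both helpers would be expected to return 0 even on an AVX2-capable machine, which is the situation the commit describes: the result reflects how the file was compiled, not what the CPU supports.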
AbdelRauf a469b32cf4 sgemm pipeline improved, zgemm rewritten without inner packs, ABI lxvx v20 fixed with vs52 2019-06-04 07:11:30 +00:00
Martin Kroeker 27649b9543
Document NO_AVX512
for #2151
2019-06-03 11:01:33 +02:00
TiborGY 16f3df5d35
add c++ thread test option to Makefile.rule 2019-06-01 21:36:41 +02:00
TiborGY 1aded69821
hook up c++ thread safety test (main Makefile) 2019-06-01 21:32:52 +02:00
TiborGY c00289ba54
upload thread safety test folder 2019-06-01 21:30:06 +02:00
AbdelRauf 8fe794f059 improved zgemm power9 based on power8 2019-05-30 15:31:25 +00:00
Martin Kroeker 74c10b57c6
Use generic kernels for complex (I)AMAX to support softfp 2019-05-30 11:38:11 +02:00
Martin Kroeker c5495d2056
Ensure correct output for DAMAX with softfp 2019-05-30 11:25:43 +02:00
Martin Kroeker c70496b108
Separate implementations of AMAX and IAMAX on arm
As noted in #1912 and comment on #1942, the combined implementation happens to "do the right thing" on hardfp, but cannot return both value and index on softfp where they would have to share the return register
2019-05-29 15:02:51 +02:00
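The distinction drawn in the commit message above can be sketched in plain C. This is a reference-style illustration, not the ARM kernels touched by the commit; `ref_damax` and `ref_idamax` are hypothetical names. AMAX returns a floating-point value and IAMAX returns an integer index, so keeping them as separate functions gives each its own return type instead of relying on the two results arriving in different registers.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute value without pulling in the math library. */
static double abs_d(double v) { return v < 0.0 ? -v : v; }

/* AMAX sketch: largest absolute value among n elements of x taken with
 * stride incx. Returns 0.0 for an empty vector. */
static double ref_damax(size_t n, const double *x, size_t incx) {
    assert(n == 0 || x != NULL);
    double best = 0.0;
    for (size_t i = 0; i < n; i++) {
        double a = abs_d(x[i * incx]);
        if (a > best) best = a;
    }
    return best;
}

/* IAMAX sketch: 1-based position of the first element with the largest
 * absolute value, following the BLAS convention. Returns 0 for an
 * empty vector. */
static size_t ref_idamax(size_t n, const double *x, size_t incx) {
    assert(n == 0 || x != NULL);
    if (n == 0) return 0;
    size_t best_i = 0;
    double best = abs_d(x[0]);
    for (size_t i = 1; i < n; i++) {
        double a = abs_d(x[i * incx]);
        if (a > best) {      /* strict '>' keeps the first occurrence */
            best = a;
            best_i = i;
        }
    }
    return best_i + 1;
}
```

For the vector `{1.0, -7.5, 3.0, 7.5}` with stride 1, `ref_damax` yields 7.5 and `ref_idamax` yields 2, since the first element of maximal magnitude is the second one.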
Martin Kroeker ca8d8835f5
Merge pull request #2144 from xianyi/revert-2142-issue1912-2
Revert "Add softfp support in min/max kernels"
2019-05-29 14:09:10 +02:00
Martin Kroeker d76b20b4d2
Revert "Add softfp support in min/max kernels" 2019-05-29 14:07:17 +02:00
Martin Kroeker 85af04da3c
Merge pull request #2142 from martin-frbg/issue1912-2
Add softfp support in min/max kernels
2019-05-28 22:56:08 +02:00