wjc404
8a074b3965
Update dgemm_kernel_4x8_haswell.S
2019-07-17 23:47:30 +08:00
wjc404
211ab03b14
Update dgemm_kernel_4x8_haswell.S
2019-07-17 22:39:15 +08:00
wjc404
1733f927e6
Update dgemm_kernel_4x8_haswell.S
2019-07-17 21:27:41 +08:00
wjc404
182b06d6ad
Update dgemm_kernel_4x8_haswell.S
2019-07-17 17:02:35 +08:00
wjc404
7a9050d681
Update dgemm_kernel_4x8_haswell.S
2019-07-17 00:55:06 +08:00
wjc404
0ba29fd262
Update dgemm_kernel_4x8_haswell.S for zen2
replaced a bunch of vpermpd instructions with vpermilpd and vperm2f128
2019-07-17 00:46:51 +08:00
Martin Kroeker
bafa021ed6
Merge pull request #2181 from isuruf/install_name
Change install_name on osx to match linux
2019-07-09 20:08:52 +02:00
Isuru Fernando
b89d9762a2
Change install_name on osx to match linux
2019-07-08 17:14:35 -05:00
Martin Kroeker
08dedf4c5e
Merge pull request #2177 from martin-frbg/noaff
Fix surprising behaviour of NO_AFFINITY=0
2019-07-07 18:28:21 +02:00
Martin Kroeker
b89c781637
Fix surprising behaviour of NO_AFFINITY=0
2019-07-07 16:04:45 +02:00
Martin Kroeker
dd7ff77f4b
Merge pull request #2175 from martin-frbg/cmake-mingw-fixes
Fix CMAKE compilation with MinGW32 and add it to Appveyor
2019-07-06 18:07:19 +02:00
Martin Kroeker
8fb76134bc
Mingw32 needs leading underscore on object names
(also copy BUNDERSCORE settings for FORTRAN from the corresponding Makefile)
2019-07-06 15:07:15 +02:00
Martin Kroeker
04d671aae2
Make disabling DYNAMIC_ARCH on unsupported systems work
needs to be unset in the cache for the change to have any effect
2019-07-06 15:05:04 +02:00
Martin Kroeker
f69a0be712
Add getarch flags to disable AVX on x86
(and other small fixes to match Makefile behaviour)
2019-07-06 15:02:39 +02:00
Martin Kroeker
ae9e8b131e
Add mingw builds to Appveyor config
2019-07-06 14:30:33 +02:00
Martin Kroeker
9086543f50
Utest needs CBLAS but not necessarily FORTRAN
2019-07-06 14:29:47 +02:00
Martin Kroeker
abea977ded
Merge pull request #2162 from martin-frbg/pgi
Fixes for PGI compiler
2019-07-03 19:16:30 +02:00
Martin Kroeker
6b6c9b1441
Merge pull request #2172 from quickwritereader/develop
power9 cgemm/ctrmm. new sgemm 8x16
2019-07-01 21:06:02 +02:00
AbdelRauf
a97b301aaa
cgemm/ctrmm power9
2019-07-01 14:07:54 +00:00
Martin Kroeker
2f13f04224
Merge pull request #2170 from pkubaj/patch-1
Fix build on PPC970 for FreeBSD
2019-06-30 23:29:02 +02:00
pkubaj
7c7505a778
Fix build for PPC970 on FreeBSD pt. 2
FreeBSD needs those macros too.
2019-06-28 10:31:45 +00:00
pkubaj
5a4f1a2118
Fix build for PPC970 on FreeBSD pt. 1
FreeBSD needs DCBT_ARG=0 as well.
2019-06-28 10:29:44 +00:00
Martin Kroeker
3b761892df
Merge pull request #2169 from pkubaj/develop
Fix build on FreeBSD/powerpc64.
2019-06-25 12:56:33 +02:00
Piotr Kubaj
eebfeba768
Fix build on FreeBSD/powerpc64.
Signed-off-by: Piotr Kubaj <pkubaj@anongoth.pl>
2019-06-25 10:58:56 +02:00
Martin Kroeker
7684c4f8f8
PGI compiler does not like -march=native
2019-06-20 19:56:01 +02:00
Martin Kroeker
7faf42b7bb
Merge pull request #2167 from kavanabhat/dtrmm_power8_segfault
Fix DTRMMKERNEL register save for power8 64-bit mode (Fix for #2166)
2019-06-19 14:38:01 +02:00
kavanabhat
a575f1e4c7
Update dtrmm_kernel_16x4_power8.S
2019-06-19 15:27:14 +05:30
AbdelRauf
cdbfb891da
new sgemm 8x16
2019-06-17 15:33:38 +00:00
Martin Kroeker
280552b988
Fix mov syntax
2019-06-16 18:35:43 +02:00
Martin Kroeker
bbd4bb0154
Zero ecx with a mov instruction
PGI assembler does not like the initialization in the constraints.
2019-06-16 15:04:10 +02:00
Martin Kroeker
6d3efb2b58
Update Makefile.x86_64
2019-06-14 08:08:11 +02:00
Martin Kroeker
d9ff2cd90d
Do not force gcc options on non-gcc compilers
fixes compile failure with pgi 18.10 as reported on OpenBLAS-users
2019-06-13 23:01:35 +02:00
Martin Kroeker
2a43062de7
Merge pull request #2159 from martin-frbg/issue2149
Avoid unintentional activation of TLS codepath via USE_TLS=0
2019-06-10 19:12:45 +02:00
Martin Kroeker
4ea794a522
Avoid unintentional activation of TLS code via USE_TLS=0
fixes #2149
2019-06-10 17:24:15 +02:00
Martin Kroeker
ece0bfb881
Merge pull request #2158 from martin-frbg/issue2143
Remove any inadvertent use of -march=native from DYNAMIC_ARCH builds
2019-06-10 14:08:11 +02:00
Martin Kroeker
1f4b6a5d5d
Remove any inadvertent use of -march=native from DYNAMIC_ARCH builds
From #2143: -march=native precludes the use of more specific options like -march=skylake-avx512 in individual kernels, and defeats the purpose of DYNAMIC_ARCH anyway.
2019-06-10 09:50:13 +02:00
Martin Kroeker
be8f70d269
Merge pull request #2157 from martin-frbg/2154-2
Add gfortran workaround for potential ABI violation
2019-06-09 12:19:08 +02:00
Martin Kroeker
e674e1c735
Update fc.cmake
2019-06-09 09:31:13 +02:00
Martin Kroeker
6ca898b63b
Add gfortran workaround for potential ABI violation
for #2154
2019-06-08 23:17:03 +02:00
Martin Kroeker
26411acd56
Merge pull request #2148 from TiborGY/cpp_thread_test_2
Thread safety tester using C++11 threading (cleaned history)
2019-06-07 13:23:07 +02:00
Martin Kroeker
0ab4076dd8
Merge pull request #2156 from martin-frbg/issue2154
Add gfortran workaround for C->FORTRAN ABI violation
2019-06-06 13:43:12 +02:00
Martin Kroeker
a0caa762b3
Add gfortran workaround for ABI violations
for #2154 (see gcc bug 90329)
2019-06-06 10:24:16 +02:00
Martin Kroeker
900d5a3205
Add gfortran workaround for ABI violations in LAPACKE
for #2154 (see gcc bug 90329)
2019-06-06 10:18:40 +02:00
Martin Kroeker
a17cf36225
Merge pull request #2153 from quickwritereader/develop
improved power9 zgemm,sgemm
2019-06-06 07:42:56 +02:00
AbdelRauf
148c4cc5fd
conflict resolve
2019-06-05 20:50:50 +00:00
AbdelRauf
d0c3543c3f
power9 zgemm ztrmm optimized
2019-06-05 20:07:16 +00:00
Martin Kroeker
909ad04aef
Merge pull request #2145 from martin-frbg/1912-3
Separate implementations of AMAX and IAMAX on arm
2019-06-05 20:27:45 +02:00
Martin Kroeker
417efd41c6
Merge pull request #2110 from pc2/cpu-detection
Fix detection of Skylake processors when using GCC
2019-06-05 20:27:05 +02:00
Michael Lass
9cdc828afa
c_check: Unlink correct file
2019-06-05 17:31:01 +02:00
Michael Lass
7a9a4dbc4f
Fix detection of AVX512 capable compilers in getarch
Commit 21eda8b5 introduced a check in getarch.c to test if the compiler is
capable of AVX512. This check currently fails, since the __AVX2__ macro it uses
is only defined if getarch itself was compiled with AVX2/AVX512 support. Make
sure this is the case by building getarch with -march=native on x86_64. It is
only supposed to run on the build host anyway.
2019-06-05 17:30:56 +02:00