ken-cunningham-webuse
f7a06463d9
common_power.h: force DCBT_ARG 0 on PPC970 Darwin
...
without this, we see
../kernel/power/gemv_n.S:427:Parameter syntax error
and many more similar entries
which relate to this assembly instruction:
dcbt 8, r24, r18
this change sets DCBT_ARG to 0
and openblas builds through to completion on PowerMac 970
Tests pass
2019-03-07 12:03:45 -08:00
ken-cunningham-webuse
b0c714ef60
param.h : enable defines for PPC970 on DarwinOS
...
fixes:
gemm.c: In function 'sgemm_':
../common_param.h:981:18: error: 'SGEMM_DEFAULT_P' undeclared (first use in this function)
#define SGEMM_P SGEMM_DEFAULT_P
^
2019-03-07 12:03:25 -08:00
Martin Kroeker
8d3d29e4d7
Merge pull request #2049 from Celelibi/fix_crash_sgemm_sse_x64
...
Fix crash in sgemm SSE/nano kernel on x86_64
2019-03-07 19:28:06 +01:00
Celelibi
b7f59da42d
Fix crash in sgemm SSE/nano kernel on x86_64
...
Fix bug #2047.
Signed-off-by: Celelibi <celelibi@gmail.com>
2019-03-07 16:55:13 +01:00
Martin Kroeker
db3dc9e282
Merge pull request #2046 from kencu/powermac
...
ctest.c : add __POWERPC__ for PowerMac
2019-03-07 14:51:41 +01:00
ken-cunningham-webuse
4290afdae2
ctest.c : add __POWERPC__ for PowerMac
2019-03-06 20:55:06 -08:00
Martin Kroeker
4741ce803b
Merge pull request #2045 from martin-frbg/2033-3
...
Do not compile in AVX512 check if AVX support is disabled
2019-03-06 22:40:26 +01:00
Martin Kroeker
11cfd0bd75
Do not compile in AVX512 check if AVX support is disabled
...
The xgetbv function depends on NO_AVX being undefined. We could change that too, but that combination is unlikely to work anyway.
2019-03-05 16:04:25 +01:00
Martin Kroeker
651ab01d2b
Merge pull request #2044 from martin-frbg/issue2043
...
Fix module definition conflicts between LAPACK and ReLAPACK
2019-03-05 12:11:32 +01:00
Martin Kroeker
d7b2c53c0b
Merge pull request #2039 from brada4/meminit
...
Address warning in memory.c
2019-03-05 12:11:15 +01:00
Martin Kroeker
e4864a8933
Fix module definition conflicts between LAPACK and ReLAPACK
...
for #2043
2019-03-04 21:17:08 +01:00
Martin Kroeker
10d841d8b9
Merge pull request #2026 from martin-frbg/trmv_threads
...
Correct range limiting in trmv_thread and re-enable TRMV multithreading
2019-03-04 15:08:31 +01:00
Martin Kroeker
12f2b76748
Merge pull request #2038 from martin-frbg/issue2035
...
Improve handling of NO_STATIC and NO_SHARED
2019-03-04 15:07:48 +01:00
Martin Kroeker
6c83b878f6
Merge pull request #2040 from martin-frbg/locks2002
...
Restore locking optimizations for OpenMP case
2019-03-04 15:07:14 +01:00
maomao194313
fb4dae7124
add TARGET support for HiSilicon tsv110 CPUs
2019-03-04 16:48:49 +08:00
maomao194313
760842dda1
add TARGET support for HiSilicon tsv110 CPUs
2019-03-04 16:45:22 +08:00
maomao194313
53f482ee72
add TARGET support for HiSilicon tsv110 CPUs
2019-03-04 16:41:21 +08:00
maomao194313
783ba8058f
HiSilicon tsv110 CPUs optimization branch
...
add HiSilicon tsv110 CPUs optimization branch
2019-03-04 16:30:50 +08:00
Martin Kroeker
af480b02a4
Restore locking optimizations for OpenMP case
...
restore another accidentally dropped part of #1468 that was missed in #2004 to address performance regression reported in #1461
2019-03-03 14:17:07 +01:00
Andrew
e4a79be6bb
Address warning introduced with #1814 et al
2019-03-03 09:05:11 +02:00
Andrew
e5c316c6b9
init
2019-03-03 08:59:27 +02:00
Martin Kroeker
25427926bc
Improve handling of NO_STATIC and NO_SHARED
...
to avoid surprises from defining either as zero. Fixes #2035 by addressing some concerns from #1422
2019-03-02 23:36:36 +01:00
Martin Kroeker
edb8143141
Merge pull request #2037 from martin-frbg/issue2033-2
...
Make sure that AVX512 is disabled in 32bit builds
2019-03-01 11:45:02 +01:00
Martin Kroeker
c4868d11c0
Make sure that AVX512 is disabled in 32bit builds
...
for #2033
2019-03-01 09:23:03 +01:00
Martin Kroeker
4c321ae571
Merge pull request #2034 from martin-frbg/issue2033
...
Make x86_32 imply NO_AVX2, NO_AVX512 in addition to NO_AVX
2019-02-28 22:10:12 +01:00
Martin Kroeker
2ffb727187
Keep xcode8.3 for osx BINARY=32 build
...
as xcode10 deprecated i386
2019-02-28 10:51:54 +01:00
Martin Kroeker
d66214c946
Make x86_32 imply NO_AVX2, NO_AVX512 in addition to NO_AVX
...
fixes #2033
2019-02-28 09:58:25 +01:00
Martin Kroeker
fd34820b99
Fix AVX512 test always returning false due to missing compiler option
2019-02-25 17:58:31 +01:00
Martin Kroeker
918a0cc4d1
Fix missing -c option in AVX512 test
2019-02-25 17:55:36 +01:00
Martin Kroeker
0db9c03e7e
Merge pull request #2028 from brada4/mv
...
Move one of clobber fixes to right place
2019-02-24 19:50:23 +01:00
Andrew
6eee1beac5
move fix to right place
2019-02-24 20:41:02 +02:00
Andrew
e5df5958cc
init
2019-02-24 20:39:25 +02:00
Martin Kroeker
343b301d14
Reduce list of kernels in the dynamic arch build
...
to make compilation complete reliably within the 1h limit again
2019-02-20 10:27:48 +01:00
Martin Kroeker
45333d5793
Fix error introduced during cleanup
2019-02-19 22:16:33 +01:00
Martin Kroeker
e29b0cfcc4
Allow multithreading TRMV again
...
Revert the workaround introduced for issue #1332, as the actual cause appears to be my incorrect fix from #1262 (see #1388)
2019-02-19 21:03:30 +01:00
Martin Kroeker
78d9910236
Correct range_n limiting
...
Same bug as seen in #1388, somehow missed in the corresponding PR #1389
2019-02-19 20:59:48 +01:00
Martin Kroeker
e12cdf58ef
Merge pull request #2024 from martin-frbg/gcc9fixes4
...
Fix inline assembly constraints in Bulldozer TRSM kernels
2019-02-17 11:49:15 +01:00
Martin Kroeker
1860c9456d
Merge pull request #2023 from martin-frbg/gcc9fixes3
...
Fix inline assembly constraints in various x86_64 GEMVN kernels
2019-02-17 11:48:57 +01:00
Martin Kroeker
aec905498f
Merge pull request #1988 from TiborGY/patch-1
...
Reword/expand comments in Makefile.rule
2019-02-17 11:36:04 +01:00
TiborGY
56089991e2
fix the the
2019-02-16 23:26:13 +01:00
Martin Kroeker
f9bb76d29a
Fix inline assembly constraints in Bulldozer TRSM kernels
...
Rework indices to allow marking i, as and bs as both input and output (marked operand n1 as well for simplicity). For #2009
2019-02-16 20:06:48 +01:00
Martin Kroeker
8242b1fe3f
Fix inline assembly constraints
2019-02-16 18:51:09 +01:00
Martin Kroeker
efb9038f72
Fix inline assembly constraints
2019-02-16 18:46:17 +01:00
Martin Kroeker
e976557d29
Fix inline assembly constraints
...
rework indices to allow marking argument lda as input and output.
2019-02-16 18:36:39 +01:00
Martin Kroeker
9d8be15789
Fix inline assembly constraints
...
rework indices to allow marking argument lda4 as input and output. For #2009
2019-02-16 18:24:11 +01:00
Martin Kroeker
d752799a0f
Merge pull request #2021 from martin-frbg/gcc9fixes2
...
Fix wrong constraints in inline assembly of Haswell DTRSM kernel
2019-02-16 18:05:40 +01:00
TiborGY
f209fc7fa9
Update Makefile.rule
...
add note about NUM_THREADS for package maintainers, add examples of programs that cause affinity troubles
2019-02-16 12:12:39 +01:00
Martin Kroeker
c26c0b77a7
Fix wrong constraints in inline assembly
...
for #2009
2019-02-15 15:08:16 +01:00
Martin Kroeker
1c6da2d03c
Merge pull request #2019 from martin-frbg/gcc9fixes
...
Fix unannounced modification of input operand 8 (lda4) in Haswell GEMVN microkernel
2019-02-15 15:02:54 +01:00
Martin Kroeker
4255a58cd2
Rename operands to put lda on the input/output constraint list
2019-02-15 10:10:04 +01:00
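
Several of the entries above (the ones for #2009) move an operand such as lda or lda4 from the input list of a GCC inline assembly statement to the input/output list. As a rough illustration of why that matters, here is a minimal sketch. It is not taken from the OpenBLAS kernels: the function name `sum_strided` and its parameters are hypothetical, and the assembly path assumes GCC or Clang on x86_64 with 64-bit `long` (other targets fall back to plain C). The pointer is advanced inside the asm, so it is declared with the read-write constraint `"+r"`. Listing it as an input only would let the compiler assume its register still holds the old value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration only (not an OpenBLAS routine):
   returns a[0] + a[lda] + ... for n elements read with stride lda. */
static long sum_strided(const long *a, long lda, long n)
{
    long sum = 0;
    assert(a != NULL && n >= 0);
#if defined(__GNUC__) && defined(__x86_64__) && defined(__LP64__)
    long step = lda * (long)sizeof(long);   /* stride in bytes */
    while (n-- > 0) {
        __asm__ volatile(
            "addq (%[ptr]), %[sum]\n\t"     /* sum += *ptr            */
            "addq %[step], %[ptr]\n\t"      /* ptr += step (modified) */
            : [sum] "+r"(sum),              /* read and written       */
              [ptr] "+r"(a)                 /* read and written       */
            : [step] "r"(step)              /* read only              */
            : "cc", "memory");
    }
#else
    /* Portable fallback with the same result on other targets. */
    while (n-- > 0) {
        sum += *a;
        a += lda;
    }
#endif
    return sum;
}
```

Called on the array {1, 2, 3, 4, 5, 6} with lda = 2 and n = 3, the function adds the elements 1, 3 and 5. If `ptr` were placed on the input list instead, the compiler would be free to reuse the register as if it were unchanged, which is the kind of silent miscompilation newer compilers can expose.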