Commit Graph

3589 Commits

Author SHA1 Message Date
Erik M. Bray 8ba9e2a61a Also call CloseHandle on each thread, as well as on the event so as to not leak thread handles. 2019-03-19 11:21:44 +01:00
Erik M. Bray 4ad694eda1 Fix for #2063: The DllMain used in Cygwin did not run the thread memory
pool cleanup upon THREAD_DETACH which is needed when compiled with
USE_TLS=1.
2019-03-19 09:26:50 +01:00
Martin Kroeker 4fc17d0d75
Trivial typo fix
as suggested in #2022
2019-03-13 19:20:23 +01:00
Martin Kroeker 03d7110900
Merge pull request #2042 from maomao194313/develop
add TARGET support for HiSilicon tsv110 CPUs
2019-03-12 22:57:39 +01:00
Martin Kroeker 3ce28fb81a
Merge pull request #2055 from martin-frbg/atomid
Add CPUID data for Intel Denverton (as Nehalem)
2019-03-12 22:57:07 +01:00
Martin Kroeker 04f2226ea6
Add Intel Denverton 2019-03-12 16:09:55 +01:00
Martin Kroeker b1393c7a97
Add Intel Denverton
for #2048
2019-03-12 16:03:56 +01:00
maomao194313 7e3eb9b25d
make DYNAMIC_ARCH=1 package work on TSV110 2019-03-12 16:11:01 +08:00
maomao194313 f074d7d146
make DYNAMIC_ARCH=1 package work on TSV110. 2019-03-12 16:05:19 +08:00
Martin Kroeker f18ab6c17b
Merge pull request #2051 from martin-frbg/issue2048
Make TARGET=GENERIC compatible with DYNAMIC_ARCH=1
2019-03-09 16:39:35 +01:00
Martin Kroeker 946ec6c3b8
Merge pull request #2050 from kencu/PowerMacFix
PowerMac 970 fixes
2019-03-09 16:39:08 +01:00
Martin Kroeker 5b95534afc
Make TARGET=GENERIC compatible with DYNAMIC_ARCH=1
for issue #2048
2019-03-09 11:21:16 +01:00
ken-cunningham-webuse f7a06463d9 common_power.h: force DCBT_ARG 0 on PPC970 Darwin
without this, we see
../kernel/power/gemv_n.S:427:Parameter syntax error
and many more similar entries

that relates to this assembly command
dcbt 8, r24, r18

this change makes the DCBT_ARG = 0
and openblas builds through to completion on PowerMac 970
Tests pass
2019-03-07 12:03:45 -08:00
ken-cunningham-webuse b0c714ef60 param.h : enable defines for PPC970 on DarwinOS
fixes:
gemm.c: In function 'sgemm_':
../common_param.h:981:18: error: 'SGEMM_DEFAULT_P' undeclared (first use in this function)
 #define SGEMM_P  SGEMM_DEFAULT_P
                  ^
2019-03-07 12:03:25 -08:00
Martin Kroeker 8d3d29e4d7
Merge pull request #2049 from Celelibi/fix_crash_sgemm_sse_x64
Fix crash in sgemm SSE/nano kernel on x86_64
2019-03-07 19:28:06 +01:00
Celelibi b7f59da42d Fix crash in sgemm SSE/nano kernel on x86_64
Fix bug #2047.

Signed-off-by: Celelibi <celelibi@gmail.com>
2019-03-07 16:55:13 +01:00
Martin Kroeker db3dc9e282
Merge pull request #2046 from kencu/powermac
ctest.c : add __POWERPC__ for PowerMac
2019-03-07 14:51:41 +01:00
ken-cunningham-webuse 4290afdae2 ctest.c : add __POWERPC__ for PowerMac 2019-03-06 20:55:06 -08:00
Martin Kroeker 4741ce803b
Merge pull request #2045 from martin-frbg/2033-3
Do not compile in AVX512 check if AVX support is disabled
2019-03-06 22:40:26 +01:00
Martin Kroeker 11cfd0bd75
Do not compile in AVX512 check if AVX support is disabled
xgetbv is function depends on NO_AVX being undefined - we could change that too, but that combo is unlikely to work anyway
2019-03-05 16:04:25 +01:00
Martin Kroeker 651ab01d2b
Merge pull request #2044 from martin-frbg/issue2043
Fix module definition conflicts between LAPACK and ReLAPACK
2019-03-05 12:11:32 +01:00
Martin Kroeker d7b2c53c0b
Merge pull request #2039 from brada4/meminit
Address warning in memory.c
2019-03-05 12:11:15 +01:00
Martin Kroeker e4864a8933
Fix module definition conflicts between LAPACK and ReLAPACK
for #2043
2019-03-04 21:17:08 +01:00
Martin Kroeker 10d841d8b9
Merge pull request #2026 from martin-frbg/trmv_threads
Correct range limiting in trmv_thread and re-enable TRMV multithreading
2019-03-04 15:08:31 +01:00
Martin Kroeker 12f2b76748
Merge pull request #2038 from martin-frbg/issue2035
Improve handling of NO_STATIC and NO_SHARED
2019-03-04 15:07:48 +01:00
Martin Kroeker 6c83b878f6
Merge pull request #2040 from martin-frbg/locks2002
Restore locking optimizations for OpenMP case
2019-03-04 15:07:14 +01:00
maomao194313 fb4dae7124
add TARGET support for HiSilicon tsv110 CPUs 2019-03-04 16:48:49 +08:00
maomao194313 760842dda1
add TARGET support for HiSilicon tsv110 CPUs 2019-03-04 16:45:22 +08:00
maomao194313 53f482ee72
add TARGET support for HiSilicon tsv110 CPUs 2019-03-04 16:41:21 +08:00
maomao194313 783ba8058f
HiSilicon tsv110 CPUs optimization branch
add HiSilicon tsv110 CPUs  optimization branch
2019-03-04 16:30:50 +08:00
Martin Kroeker af480b02a4
Restore locking optimizations for OpenMP case
restore another accidentally dropped part of #1468 that was missed in #2004 to address performance regression reported in #1461
2019-03-03 14:17:07 +01:00
Andrew e4a79be6bb address warning introed with #1814 et al 2019-03-03 09:05:11 +02:00
Andrew e5c316c6b9 init 2019-03-03 08:59:27 +02:00
Martin Kroeker 25427926bc
Improve handling of NO_STATIC and NO_SHARED
to avoid surprises from defining either as zero. Fixes #2035 by addressing some concerns from #1422
2019-03-02 23:36:36 +01:00
Martin Kroeker edb8143141
Merge pull request #2037 from martin-frbg/issue2033-2
Make sure that AVX512 is disabled in 32bit builds
2019-03-01 11:45:02 +01:00
Martin Kroeker c4868d11c0
Make sure that AVX512 is disabled in 32bit builds
for #2033
2019-03-01 09:23:03 +01:00
Martin Kroeker 4c321ae571
Merge pull request #2034 from martin-frbg/issue2033
Make x86_32 imply NO_AVX2, NO_AVX512 in addition to NO_AVX
2019-02-28 22:10:12 +01:00
Martin Kroeker 2ffb727187
Keep xcode8.3 for osx BINARY=32 build
as xcode10 deprecated i386
2019-02-28 10:51:54 +01:00
Martin Kroeker d66214c946
Make x86_32 imply NO_AVX2, NO_AVX512 in addition to NO_AVX
fixes #2033
2019-02-28 09:58:25 +01:00
Martin Kroeker fd34820b99
Fix AVX512 test always returning false due to missing compiler option 2019-02-25 17:58:31 +01:00
Martin Kroeker 918a0cc4d1
Fix missing -c option in AVX512 test 2019-02-25 17:55:36 +01:00
Martin Kroeker 0db9c03e7e
Merge pull request #2028 from brada4/mv
Move one of clobber fixes to right place
2019-02-24 19:50:23 +01:00
Andrew 6eee1beac5 move fix to right place 2019-02-24 20:41:02 +02:00
Andrew e5df5958cc init 2019-02-24 20:39:25 +02:00
Martin Kroeker 343b301d14
Reduce list of kernels in the dynamic arch build
to make compilation complete reliably within the 1h limit again
2019-02-20 10:27:48 +01:00
Martin Kroeker 45333d5793
Fix error introduced during cleanup 2019-02-19 22:16:33 +01:00
Martin Kroeker e29b0cfcc4
Allow multithreading TRMV again
revert workaround introduced for issue #1332 as the actual cause appears to be my incorrect fix from #1262 (see #1388)
2019-02-19 21:03:30 +01:00
Martin Kroeker 78d9910236
Correct range_n limiting
same bug as seen in #1388, somehow missed in corresponding PR #1389
2019-02-19 20:59:48 +01:00
Martin Kroeker e12cdf58ef
Merge pull request #2024 from martin-frbg/gcc9fixes4
Fix inline assembly constraints in Bulldozer TRSM kernels
2019-02-17 11:49:15 +01:00
Martin Kroeker 1860c9456d
Merge pull request #2023 from martin-frbg/gcc9fixes3
Fix inline assembly constraints in various x86_64 GEMVN kernels
2019-02-17 11:48:57 +01:00