Commit Graph

4056 Commits

Author SHA1 Message Date
Martin Kroeker c19a449096
Merge pull request #2071 from martin-frbg/issue2068
Provide CBLAS interfaces to I?MIN and I?MAX
2019-03-30 14:54:28 +01:00
Martin Kroeker 3d1e36d4cb
Build CBLAS interfaces for I?MIN and I?MAX 2019-03-30 12:38:41 +01:00
Martin Kroeker 4f9d3e4b28
Expose CBLAS interfaces for I?MIN and I?MAX 2019-03-30 12:37:13 +01:00
Martin Kroeker 4dec151d0b
Merge pull request #2070 from quickwritereader/develop
power9 makefile. dgemm based on power8 kernel with following changes …
2019-03-29 21:46:21 +01:00
Martin Kroeker 7c51cc8527
Merge branch 'develop' into develop 2019-03-29 19:36:29 +01:00
AbdelRauf 853a18bc17 power9 makefile. dgemm based on power8 kernel with following changes: 32x unrolled 16x4 kernel and 8x4 kernel using (lxv stxv butterfly rank1 update). Improvement from 17 to 22-23 GFLOPS. dtrmm cases were added into dgemm itself. 2019-03-29 15:49:40 +00:00
Martin Kroeker 3ae122e2c7
Merge pull request #2069 from aixoss/aix-asm-change
AIX asm syntax changes needed for shared object creation
2019-03-25 21:34:30 +01:00
Ayappan P b043a5962e AIX asm syntax changes needed for shared object creation 2019-03-25 18:53:25 +05:30
Martin Kroeker 8502030e5e
Merge pull request #2064 from embray/cygwin/use-tls-thread-memory-cleanup
Fix for #2063
2019-03-19 22:12:51 +01:00
Erik M. Bray 8ba9e2a61a Also call CloseHandle on each thread, as well as on the event so as to not leak thread handles. 2019-03-19 11:21:44 +01:00
Erik M. Bray 4ad694eda1 Fix for #2063: The DllMain used in Cygwin did not run the thread memory
pool cleanup upon THREAD_DETACH which is needed when compiled with
USE_TLS=1.
2019-03-19 09:26:50 +01:00
Martin Kroeker dff4a197a5
Merge pull request #2058 from xsacha/patch-3
Change 64-bit detection as explained in #2056
2019-03-16 11:57:23 +01:00
Martin Kroeker a5425575b1
Merge pull request #2060 from embray/cygwin/readenv
Use POSIX getenv on Cygwin
2019-03-16 11:56:51 +01:00
Erik M. Bray 1006ff8a7b Use POSIX getenv on Cygwin
The Windows-native GetEnvironmentVariable cannot be relied on, as
Cygwin does not always copy environment variables set through Cygwin
to the Windows environment block, particularly after fork().
2019-03-15 15:06:30 +01:00
Martin Kroeker e608d4f7fe
Disable the AVX512 DGEMM kernel (again)
Due to as yet unresolved errors seen in #1955 and #2029
2019-03-13 22:10:28 +01:00
Martin Kroeker 4fc17d0d75
Trivial typo fix
as suggested in #2022
2019-03-13 19:20:23 +01:00
Sacha c3e30b2bc2
Change 64-bit detection as explained in #2056 2019-03-13 23:21:54 +10:00
Martin Kroeker 03d7110900
Merge pull request #2042 from maomao194313/develop
add TARGET support for HiSilicon tsv110 CPUs
2019-03-12 22:57:39 +01:00
Martin Kroeker 3ce28fb81a
Merge pull request #2055 from martin-frbg/atomid
Add CPUID data for Intel Denverton (as Nehalem)
2019-03-12 22:57:07 +01:00
Martin Kroeker 04f2226ea6
Add Intel Denverton 2019-03-12 16:09:55 +01:00
Martin Kroeker b1393c7a97
Add Intel Denverton
for #2048
2019-03-12 16:03:56 +01:00
maomao194313 7e3eb9b25d
make DYNAMIC_ARCH=1 package work on TSV110 2019-03-12 16:11:01 +08:00
maomao194313 f074d7d146
make DYNAMIC_ARCH=1 package work on TSV110. 2019-03-12 16:05:19 +08:00
Martin Kroeker f18ab6c17b
Merge pull request #2051 from martin-frbg/issue2048
Make TARGET=GENERIC compatible with DYNAMIC_ARCH=1
2019-03-09 16:39:35 +01:00
Martin Kroeker 946ec6c3b8
Merge pull request #2050 from kencu/PowerMacFix
PowerMac 970 fixes
2019-03-09 16:39:08 +01:00
Martin Kroeker 5b95534afc
Make TARGET=GENERIC compatible with DYNAMIC_ARCH=1
for issue #2048
2019-03-09 11:21:16 +01:00
ken-cunningham-webuse f7a06463d9 common_power.h: force DCBT_ARG 0 on PPC970 Darwin
Without this, we see
../kernel/power/gemv_n.S:427:Parameter syntax error
and many more similar entries.

These relate to this assembly instruction:
dcbt 8, r24, r18

This change sets DCBT_ARG to 0, and OpenBLAS then builds through to completion on PowerMac 970.
Tests pass.
2019-03-07 12:03:45 -08:00
ken-cunningham-webuse b0c714ef60 param.h : enable defines for PPC970 on DarwinOS
fixes:
gemm.c: In function 'sgemm_':
../common_param.h:981:18: error: 'SGEMM_DEFAULT_P' undeclared (first use in this function)
 #define SGEMM_P  SGEMM_DEFAULT_P
                  ^
2019-03-07 12:03:25 -08:00
Martin Kroeker 8d3d29e4d7
Merge pull request #2049 from Celelibi/fix_crash_sgemm_sse_x64
Fix crash in sgemm SSE/nano kernel on x86_64
2019-03-07 19:28:06 +01:00
Celelibi b7f59da42d Fix crash in sgemm SSE/nano kernel on x86_64
Fix bug #2047.

Signed-off-by: Celelibi <celelibi@gmail.com>
2019-03-07 16:55:13 +01:00
Martin Kroeker db3dc9e282
Merge pull request #2046 from kencu/powermac
ctest.c : add __POWERPC__ for PowerMac
2019-03-07 14:51:41 +01:00
ken-cunningham-webuse 4290afdae2 ctest.c : add __POWERPC__ for PowerMac 2019-03-06 20:55:06 -08:00
Martin Kroeker 4741ce803b
Merge pull request #2045 from martin-frbg/2033-3
Do not compile in AVX512 check if AVX support is disabled
2019-03-06 22:40:26 +01:00
Martin Kroeker 11cfd0bd75
Do not compile in AVX512 check if AVX support is disabled
The xgetbv function depends on NO_AVX being undefined. We could change that too, but that combination is unlikely to work anyway.
2019-03-05 16:04:25 +01:00
Martin Kroeker 651ab01d2b
Merge pull request #2044 from martin-frbg/issue2043
Fix module definition conflicts between LAPACK and ReLAPACK
2019-03-05 12:11:32 +01:00
Martin Kroeker d7b2c53c0b
Merge pull request #2039 from brada4/meminit
Address warning in memory.c
2019-03-05 12:11:15 +01:00
Martin Kroeker e4864a8933
Fix module definition conflicts between LAPACK and ReLAPACK
for #2043
2019-03-04 21:17:08 +01:00
Martin Kroeker 10d841d8b9
Merge pull request #2026 from martin-frbg/trmv_threads
Correct range limiting in trmv_thread and re-enable TRMV multithreading
2019-03-04 15:08:31 +01:00
Martin Kroeker 12f2b76748
Merge pull request #2038 from martin-frbg/issue2035
Improve handling of NO_STATIC and NO_SHARED
2019-03-04 15:07:48 +01:00
Martin Kroeker 6c83b878f6
Merge pull request #2040 from martin-frbg/locks2002
Restore locking optimizations for OpenMP case
2019-03-04 15:07:14 +01:00
maomao194313 fb4dae7124
add TARGET support for HiSilicon tsv110 CPUs 2019-03-04 16:48:49 +08:00
maomao194313 760842dda1
add TARGET support for HiSilicon tsv110 CPUs 2019-03-04 16:45:22 +08:00
maomao194313 53f482ee72
add TARGET support for HiSilicon tsv110 CPUs 2019-03-04 16:41:21 +08:00
maomao194313 783ba8058f
HiSilicon tsv110 CPUs optimization branch
add HiSilicon tsv110 CPUs optimization branch
2019-03-04 16:30:50 +08:00
Martin Kroeker af480b02a4
Restore locking optimizations for OpenMP case
restore another accidentally dropped part of #1468 that was missed in #2004 to address performance regression reported in #1461
2019-03-03 14:17:07 +01:00
Andrew e4a79be6bb Address warning introduced with #1814 et al. 2019-03-03 09:05:11 +02:00
Andrew e5c316c6b9 init 2019-03-03 08:59:27 +02:00
Martin Kroeker 25427926bc
Improve handling of NO_STATIC and NO_SHARED
to avoid surprises from defining either as zero. Fixes #2035 by addressing some concerns from #1422
2019-03-02 23:36:36 +01:00
Martin Kroeker edb8143141
Merge pull request #2037 from martin-frbg/issue2033-2
Make sure that AVX512 is disabled in 32bit builds
2019-03-01 11:45:02 +01:00
Martin Kroeker c4868d11c0
Make sure that AVX512 is disabled in 32bit builds
for #2033
2019-03-01 09:23:03 +01:00