Martin Kroeker
91943b7325
Merge pull request #2096 from martin-frbg/eig-testing
...
Avoid out-of-bounds accesses in LAPACK EIG tests
2019-04-28 09:55:42 +02:00
Martin Kroeker
268c28db7d
Merge pull request #2095 from martin-frbg/trsm
...
Correct length of name string in xerbla call
2019-04-28 09:55:25 +02:00
Martin Kroeker
2aad88d5b9
Avoid out-of-bounds accesses in LAPACK EIG tests
...
see https://github.com/Reference-LAPACK/lapack/issues/333
2019-04-27 23:01:49 +02:00
Martin Kroeker
0bd956fd21
Correct length of name string in xerbla call
2019-04-27 22:49:04 +02:00
Martin Kroeker
bbd9d98664
Merge pull request #2094 from martin-frbg/issue2066
...
Fix ReLAPACK integration problems
2019-04-27 22:45:47 +02:00
Martin Kroeker
798c448b0c
Add support for INTERFACE64 and fix XERBLA calls
...
1. Replaced all instances of "int" with "blasint"
2. Added string length as "hidden" third parameter in calls to Fortran XERBLA
2019-04-27 19:06:00 +02:00
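The two changes listed in this commit can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `mock_xerbla_` is a hypothetical stand-in for a Fortran-compiled XERBLA, `report_bad_argument` is a hypothetical caller, and the `blasint` typedef only mimics an integer type that widens under INTERFACE64. The type of the hidden length argument depends on the Fortran compiler; `size_t` is assumed here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: the integer type widens to 64 bits when INTERFACE64 is set. */
#ifdef INTERFACE64
typedef int64_t blasint;
#else
typedef int blasint;
#endif

/* Captured by the mock so a caller can inspect what was passed. */
static char last_name[32];
static blasint last_info;

/* Hypothetical stand-in for a Fortran XERBLA(SRNAME, INFO).
 * Fortran passes arguments by reference and appends the length of each
 * character argument as a hidden by-value parameter at the end of the
 * argument list, so the C-side prototype needs a third parameter. */
void mock_xerbla_(const char *name, const blasint *info, size_t name_len)
{
    assert(name != NULL && info != NULL);

    /* Fortran strings are not NUL-terminated: copy exactly name_len bytes. */
    size_t n = name_len < sizeof(last_name) - 1 ? name_len : sizeof(last_name) - 1;
    memcpy(last_name, name, n);
    last_name[n] = '\0';
    last_info = *info;

    fprintf(stderr, " ** On entry to %s parameter number %ld had an illegal value\n",
            last_name, (long)last_info);
}

/* Hypothetical caller: uses blasint rather than int and passes the length. */
void report_bad_argument(const char *routine, blasint position)
{
    mock_xerbla_(routine, &position, strlen(routine));
}
```

Calling `report_bad_argument("DTRSM ", 3)` would hand the mock both the name and its length of 6. Without the third argument, the callee would read an undefined length.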
Martin Kroeker
9a19616a28
Support INTERFACE64=1
2019-04-27 18:55:47 +02:00
Martin Kroeker
6b41eb9c0c
Merge pull request #2092 from jeffbaylor/snprintf_with_MSC_VER
...
snprintf define consolidated to common.h
2019-04-23 20:12:06 +02:00
Martin Kroeker
ccfb7ead15
Merge pull request #2072 from martin-frbg/sum
...
Add (C)BLAS extension ?sum
2019-04-23 20:11:36 +02:00
Jeff Baylor
40e53e52d6
snprintf define consolidated to common.h
2019-04-22 17:01:34 -07:00
Martin Kroeker
744779d335
Merge pull request #2084 from RashmicaG/develop
...
Add in runtime CPU detection for POWER.
2019-04-14 21:40:07 +02:00
Rashmica Gupta
bcdf1d4917
Add in runtime CPU detection for POWER.
2019-04-09 14:20:16 +10:00
Martin Kroeker
e06b8438b4
Merge pull request #2080 from martin-frbg/issue2075
...
Add -lm and disable EXPRECISION support on *BSD
2019-04-02 21:40:58 +02:00
Martin Kroeker
9229d6859b
Add -lm and disable EXPRECISION support on *BSD
...
fixes #2075
2019-04-02 09:38:18 +02:00
Martin Kroeker
21d146a8de
Add declarations for ?sum
2019-03-31 22:12:23 +02:00
Martin Kroeker
7f4e36d219
Merge pull request #2073 from martin-frbg/issue2056-2
...
Detect 32bit environment on 64bit ARM hardware
2019-03-31 13:56:08 +02:00
Martin Kroeker
c04a729081
Add ?sum definitions for generic kernel
2019-03-31 13:55:49 +02:00
Martin Kroeker
100d94f94e
Add ?sum
2019-03-31 13:55:05 +02:00
Martin Kroeker
d17da6c6a4
Add cmake defaults for ?sum kernels
2019-03-31 11:57:01 +02:00
Martin Kroeker
1679de5e59
Detect 32bit environment on 64bit ARM hardware
...
for #2056, using the same approach as #2058
2019-03-31 10:50:43 +02:00
Martin Kroeker
246ca29679
Add ZARCH implementation of ?sum
...
as trivial copies of the respective ?asum kernels with the ABS and vflpsb calls removed
2019-03-30 22:49:05 +01:00
Martin Kroeker
9d717cb5ee
Add x86_64 implementation of ?sum
...
as trivial copy of ?asum with the fabs calls removed
2019-03-30 22:27:04 +01:00
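The relationship between ?asum and ?sum described in these commits can be sketched in portable C. This is not the kernel code from the repository; `ref_dasum` and `ref_dsum` are hypothetical reference functions, and returning zero for a non-positive stride is an assumption of this sketch.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical reference for dasum: sum of absolute values. */
double ref_dasum(size_t n, const double *x, ptrdiff_t inc_x)
{
    assert(x != NULL || n == 0);
    double acc = 0.0;
    if (inc_x <= 0) return acc;      /* assumption: non-positive stride yields 0 */
    for (size_t i = 0; i < n; i++)
        acc += fabs(x[i * (size_t)inc_x]);
    return acc;
}

/* Hypothetical reference for dsum: the same loop with the fabs call removed,
 * so the result is the plain (signed) sum of the elements. */
double ref_dsum(size_t n, const double *x, ptrdiff_t inc_x)
{
    assert(x != NULL || n == 0);
    double acc = 0.0;
    if (inc_x <= 0) return acc;      /* assumption: non-positive stride yields 0 */
    for (size_t i = 0; i < n; i++)
        acc += x[i * (size_t)inc_x];
    return acc;
}
```

For the vector {1, -2, 3, -4} with stride 1, the first function accumulates 1 + 2 + 3 + 4 while the second accumulates 1 - 2 + 3 - 4. Keeping the loop structure identical is what makes the copy "trivial".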
Martin Kroeker
e3bc83f2a8
Add x86 implementation of ?sum
...
as trivial copy of ?asum with the fabs calls removed
2019-03-30 22:26:10 +01:00
Martin Kroeker
70f2a4e0d7
Add SPARC implementation of ?sum
...
as trivial copy of ?asum with the fabs replaced by fmov to preserve code structure
2019-03-30 22:25:06 +01:00
Martin Kroeker
706dfe263b
Add POWER implementation of ?sum
...
as trivial copy of ?asum with the fabs replaced by fmr to preserve code structure
2019-03-30 22:23:42 +01:00
Martin Kroeker
688fa9201c
Add MIPS64 implementation of ?sum
...
as trivial copy of ?asum with the fabs replaced by mov to preserve code structure
2019-03-30 22:22:15 +01:00
Martin Kroeker
cdbe0f0235
Add MIPS implementation of ?sum
...
as trivial copy of ?asum with the fabs calls removed
2019-03-30 22:20:14 +01:00
Martin Kroeker
f8b82bc6dc
Add ia64 implementation of ?sum
...
as trivial copy of ?asum with the fabs calls removed
2019-03-30 22:18:03 +01:00
Martin Kroeker
3e3ccb9011
Add ARM64 implementations of ?sum
...
as trivial copies of the respective ?asum kernels with the fabs calls removed
2019-03-30 22:13:36 +01:00
Martin Kroeker
94ab4e6fb2
Add ARM implementations of ?sum
...
(trivial copies of the respective ?asum with the fabs calls removed)
2019-03-30 22:11:38 +01:00
Martin Kroeker
c3cfc6986b
Add implementations of ssum/dsum and csum/zsum
...
as trivial copies of asum/zsasum with the fabs calls replaced by fmov to preserve code structure
2019-03-30 22:05:11 +01:00
Martin Kroeker
b9f4943a14
Add ?sum
2019-03-30 22:01:13 +01:00
Martin Kroeker
79cfc24a62
Add interface for ?sum (derived from ?asum)
2019-03-30 21:59:18 +01:00
Martin Kroeker
5c42287c4f
Add declarations for ?sum and cblas_?sum
2019-03-30 21:58:03 +01:00
Martin Kroeker
32c7063cb0
Merge pull request #2061 from martin-frbg/martin-frbg-patch-1
...
Disable the AVX512 DGEMM kernel (again)
2019-03-30 21:21:38 +01:00
Martin Kroeker
c19a449096
Merge pull request #2071 from martin-frbg/issue2068
...
Provide CBLAS interfaces to I?MIN and I?MAX
2019-03-30 14:54:28 +01:00
Martin Kroeker
3d1e36d4cb
Build CBLAS interfaces for I?MIN and I?MAX
2019-03-30 12:38:41 +01:00
Martin Kroeker
4f9d3e4b28
Expose CBLAS interfaces for I?MIN and I?MAX
2019-03-30 12:37:13 +01:00
Martin Kroeker
4dec151d0b
Merge pull request #2070 from quickwritereader/develop
...
power9 makefile. dgemm based on power8 kernel with following changes …
2019-03-29 21:46:21 +01:00
Martin Kroeker
7c51cc8527
Merge branch 'develop' into develop
2019-03-29 19:36:29 +01:00
AbdelRauf
853a18bc17
POWER9 makefile. DGEMM based on the POWER8 kernel with the following changes: 32x unrolled 16x4 kernel and 8x4 kernel using (lxv stxv butterfly rank1 update). Improvement from 17 to 22-23 GFLOPS. DTRMM cases were added into DGEMM itself.
2019-03-29 15:49:40 +00:00
Martin Kroeker
3ae122e2c7
Merge pull request #2069 from aixoss/aix-asm-change
...
AIX asm syntax changes needed for shared object creation
2019-03-25 21:34:30 +01:00
Ayappan P
b043a5962e
AIX asm syntax changes needed for shared object creation
2019-03-25 18:53:25 +05:30
Martin Kroeker
8502030e5e
Merge pull request #2064 from embray/cygwin/use-tls-thread-memory-cleanup
...
Fix for #2063
2019-03-19 22:12:51 +01:00
Erik M. Bray
8ba9e2a61a
Also call CloseHandle on each thread, as well as on the event so as to not leak thread handles.
2019-03-19 11:21:44 +01:00
Erik M. Bray
4ad694eda1
Fix for #2063: The DllMain used in Cygwin did not run the thread memory pool cleanup upon THREAD_DETACH, which is needed when compiled with USE_TLS=1.
2019-03-19 09:26:50 +01:00
Martin Kroeker
dff4a197a5
Merge pull request #2058 from xsacha/patch-3
...
Change 64-bit detection as explained in #2056
2019-03-16 11:57:23 +01:00
Martin Kroeker
a5425575b1
Merge pull request #2060 from embray/cygwin/readenv
...
Use POSIX getenv on Cygwin
2019-03-16 11:56:51 +01:00
Erik M. Bray
1006ff8a7b
Use POSIX getenv on Cygwin
...
The Windows-native GetEnvironmentVariable cannot be relied on, as
Cygwin does not always copy environment variables set through Cygwin
to the Windows environment block, particularly after fork().
2019-03-15 15:06:30 +01:00
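A minimal sketch of reading an integer setting through POSIX `getenv`, the approach the commit above names. The helper `read_env_int` and the variable name used in the usage note are hypothetical and are not taken from the repository.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: read an integer setting from the process environment
 * via getenv, returning `fallback` when the variable is unset, empty, or
 * does not start with a number. */
int read_env_int(const char *name, int fallback)
{
    assert(name != NULL);

    const char *value = getenv(name);
    if (value == NULL || *value == '\0')
        return fallback;

    char *end = NULL;
    long parsed = strtol(value, &end, 10);
    if (end == value)                /* no digits were consumed */
        return fallback;

    return (int)parsed;
}
```

A call such as `read_env_int("EXAMPLE_NUM_THREADS", 1)` sees variables exported in the Cygwin shell, because `getenv` consults the POSIX environment of the process. According to the commit message, the Windows environment block is not always kept in sync with it, particularly after fork().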
Martin Kroeker
e608d4f7fe
Disable the AVX512 DGEMM kernel (again)
...
Due to as yet unresolved errors seen in #1955 and #2029
2019-03-13 22:10:28 +01:00