Martin Kroeker
|
5b0398186e
|
Merge pull request #2098 from martin-frbg/rela-malloc
Disable reallocation of work array in ReLAPACK xSYTRF
|
2019-04-28 19:31:01 +02:00 |
Martin Kroeker
|
452859f4e1
|
Merge pull request #2097 from martin-frbg/rela-getrf
Correct INFO=4 condition in ReLAPACK xGETRF
|
2019-04-28 19:28:57 +02:00 |
Martin Kroeker
|
2cd463eabd
|
Disable reallocation of work array in xSYTRF
as it appears to cause memory management problems (seen in the LAPACK tests)
|
2019-04-28 10:02:28 +02:00 |
Martin Kroeker
|
11530b76f7
|
Correct INFO=4 condition
|
2019-04-28 09:58:56 +02:00 |
Martin Kroeker
|
91943b7325
|
Merge pull request #2096 from martin-frbg/eig-testing
Avoid out-of-bounds accesses in LAPACK EIG tests
|
2019-04-28 09:55:42 +02:00 |
Martin Kroeker
|
268c28db7d
|
Merge pull request #2095 from martin-frbg/trsm
Correct length of name string in xerbla call
|
2019-04-28 09:55:25 +02:00 |
Martin Kroeker
|
2aad88d5b9
|
Avoid out-of-bounds accesses in LAPACK EIG tests
see https://github.com/Reference-LAPACK/lapack/issues/333
|
2019-04-27 23:01:49 +02:00 |
Martin Kroeker
|
0bd956fd21
|
Correct length of name string in xerbla call
|
2019-04-27 22:49:04 +02:00 |
Martin Kroeker
|
bbd9d98664
|
Merge pull request #2094 from martin-frbg/issue2066
Fix ReLAPACK integration problems
|
2019-04-27 22:45:47 +02:00 |
Martin Kroeker
|
798c448b0c
|
Add support for INTERFACE64 and fix XERBLA calls
1. Replaced all instances of "int" with "blasint"
2. Added string length as "hidden" third parameter in calls to Fortran XERBLA
|
2019-04-27 19:06:00 +02:00 |
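The two changes listed in commit 798c448b0c can be illustrated with a small C sketch. It is not the ReLAPACK code: `mock_xerbla_`, `report_bad_argument`, `last_name`, and `last_info` are hypothetical names used only for this illustration, the `blasint` typedef is an assumption about how the integer width is switched under INTERFACE64, and the type of the hidden length argument (`size_t` here) depends on the Fortran compiler.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: blasint widens to 64 bits when INTERFACE64 is defined. */
#ifdef INTERFACE64
typedef int64_t blasint;
#else
typedef int blasint;
#endif

/* Hypothetical stand-in for Fortran XERBLA(SRNAME, INFO). A Fortran
 * CHARACTER argument arrives in C as a pointer plus a "hidden" length
 * passed by value after the declared arguments. */
static char last_name[32];
static blasint last_info;

static void mock_xerbla_(const char *srname, const blasint *info,
                         size_t srname_len)
{
    /* Fortran strings are not NUL-terminated, so only the hidden
     * length tells the callee how many characters are valid. */
    size_t n = srname_len < sizeof last_name - 1 ? srname_len
                                                 : sizeof last_name - 1;
    memcpy(last_name, srname, n);
    last_name[n] = '\0';
    last_info = *info;
}

/* Caller side: pass INFO by reference as blasint (not int) and append
 * the length of the routine name as the third argument. */
static void report_bad_argument(blasint info)
{
    mock_xerbla_("DGETRF", &info, strlen("DGETRF"));
}
```

Omitting the length argument, or passing a wrong one, leaves the callee reading past the routine name, which is the kind of problem the "Correct length of name string in xerbla call" entry also refers to.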
Martin Kroeker
|
9a19616a28
|
Support INTERFACE64=1
|
2019-04-27 18:55:47 +02:00 |
Martin Kroeker
|
6b41eb9c0c
|
Merge pull request #2092 from jeffbaylor/snprintf_with_MSC_VER
snprintf define consolidated to common.h
|
2019-04-23 20:12:06 +02:00 |
Martin Kroeker
|
ccfb7ead15
|
Merge pull request #2072 from martin-frbg/sum
Add (C)BLAS extension ?sum
|
2019-04-23 20:11:36 +02:00 |
Jeff Baylor
|
40e53e52d6
|
snprintf define consolidated to common.h
|
2019-04-22 17:01:34 -07:00 |
Martin Kroeker
|
744779d335
|
Merge pull request #2084 from RashmicaG/develop
Add in runtime CPU detection for POWER.
|
2019-04-14 21:40:07 +02:00 |
Rashmica Gupta
|
bcdf1d4917
|
Add in runtime CPU detection for POWER.
|
2019-04-09 14:20:16 +10:00 |
Martin Kroeker
|
e06b8438b4
|
Merge pull request #2080 from martin-frbg/issue2075
Add -lm and disable EXPRECISION support on *BSD
|
2019-04-02 21:40:58 +02:00 |
Martin Kroeker
|
9229d6859b
|
Add -lm and disable EXPRECISION support on *BSD
fixes #2075
|
2019-04-02 09:38:18 +02:00 |
Martin Kroeker
|
21d146a8de
|
Add declarations for ?sum
|
2019-03-31 22:12:23 +02:00 |
Martin Kroeker
|
7f4e36d219
|
Merge pull request #2073 from martin-frbg/issue2056-2
Detect 32bit environment on 64bit ARM hardware
|
2019-03-31 13:56:08 +02:00 |
Martin Kroeker
|
c04a729081
|
Add ?sum definitions for generic kernel
|
2019-03-31 13:55:49 +02:00 |
Martin Kroeker
|
100d94f94e
|
Add ?sum
|
2019-03-31 13:55:05 +02:00 |
Martin Kroeker
|
d17da6c6a4
|
Add cmake defaults for ?sum kernels
|
2019-03-31 11:57:01 +02:00 |
Martin Kroeker
|
1679de5e59
|
Detect 32bit environment on 64bit ARM hardware
for #2056, using same approach as #2058
|
2019-03-31 10:50:43 +02:00 |
Martin Kroeker
|
246ca29679
|
Add ZARCH implementation of ?sum
as trivial copies of the respective ?asum kernels with the ABS and vflpsb calls removed
|
2019-03-30 22:49:05 +01:00 |
Martin Kroeker
|
9d717cb5ee
|
Add x86_64 implementation of ?sum
as trivial copy of ?asum with the fabs calls removed
|
2019-03-30 22:27:04 +01:00 |
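Several entries in this log describe the ?sum kernels as copies of ?asum with the fabs calls removed. A minimal reference sketch of that relationship for the real double-precision case follows. `ref_dasum` and `ref_dsum` are hypothetical names, not the optimized kernels, and the early return for non-positive `n` or `incx` is an assumption modeled on reference BLAS ?asum behavior.

```c
#include <math.h>

/* Reference semantics of ?asum: sum of absolute values. */
static double ref_dasum(int n, const double *x, int incx)
{
    double acc = 0.0;
    if (n <= 0 || incx <= 0)
        return acc;
    for (int i = 0; i < n; i++)
        acc += fabs(x[(long)i * incx]);
    return acc;
}

/* Reference semantics of ?sum: the same loop with fabs dropped,
 * so negative entries reduce the result instead of adding to it. */
static double ref_dsum(int n, const double *x, int incx)
{
    double acc = 0.0;
    if (n <= 0 || incx <= 0)
        return acc;
    for (int i = 0; i < n; i++)
        acc += x[(long)i * incx];
    return acc;
}
```

For the vector {1, -2, 3, -4} with unit stride, the first function yields 10 and the second yields -2.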
Martin Kroeker
|
e3bc83f2a8
|
Add x86 implementation of ?sum
as trivial copy of ?asum with the fabs calls removed
|
2019-03-30 22:26:10 +01:00 |
Martin Kroeker
|
70f2a4e0d7
|
Add SPARC implementation of ?sum
as trivial copy of ?asum with the fabs replaced by fmov to preserve code structure
|
2019-03-30 22:25:06 +01:00 |
Martin Kroeker
|
706dfe263b
|
Add POWER implementation of ?sum
as trivial copy of ?asum with the fabs replaced by fmr to preserve code structure
|
2019-03-30 22:23:42 +01:00 |
Martin Kroeker
|
688fa9201c
|
Add MIPS64 implementation of ?sum
as trivial copy of ?asum with the fabs replaced by mov to preserve code structure
|
2019-03-30 22:22:15 +01:00 |
Martin Kroeker
|
cdbe0f0235
|
Add MIPS implementation of ?sum
as trivial copy of ?asum with the fabs calls removed
|
2019-03-30 22:20:14 +01:00 |
Martin Kroeker
|
f8b82bc6dc
|
Add ia64 implementation of ?sum
as trivial copy of ?asum with the fabs calls removed
|
2019-03-30 22:18:03 +01:00 |
Martin Kroeker
|
3e3ccb9011
|
Add ARM64 implementations of ?sum
as trivial copies of the respective ?asum kernels with the fabs calls removed
|
2019-03-30 22:13:36 +01:00 |
Martin Kroeker
|
94ab4e6fb2
|
Add ARM implementations of ?sum
(trivial copies of the respective ?asum with the fabs calls removed)
|
2019-03-30 22:11:38 +01:00 |
Martin Kroeker
|
c3cfc6986b
|
Add implementations of ssum/dsum and csum/zsum
as trivial copies of asum/zsasum with the fabs calls replaced by fmov to preserve code structure
|
2019-03-30 22:05:11 +01:00 |
Martin Kroeker
|
b9f4943a14
|
Add ?sum
|
2019-03-30 22:01:13 +01:00 |
Martin Kroeker
|
79cfc24a62
|
Add interface for ?sum (derived from ?asum)
|
2019-03-30 21:59:18 +01:00 |
Martin Kroeker
|
5c42287c4f
|
Add declarations for ?sum and cblas_?sum
|
2019-03-30 21:58:03 +01:00 |
Martin Kroeker
|
32c7063cb0
|
Merge pull request #2061 from martin-frbg/martin-frbg-patch-1
Disable the AVX512 DGEMM kernel (again)
|
2019-03-30 21:21:38 +01:00 |
Martin Kroeker
|
c19a449096
|
Merge pull request #2071 from martin-frbg/issue2068
Provide CBLAS interfaces to I?MIN and I?MAX
|
2019-03-30 14:54:28 +01:00 |
Martin Kroeker
|
3d1e36d4cb
|
Build CBLAS interfaces for I?MIN and I?MAX
|
2019-03-30 12:38:41 +01:00 |
Martin Kroeker
|
4f9d3e4b28
|
Expose CBLAS interfaces for I?MIN and I?MAX
|
2019-03-30 12:37:13 +01:00 |
Martin Kroeker
|
4dec151d0b
|
Merge pull request #2070 from quickwritereader/develop
power9 makefile. dgemm based on power8 kernel with following changes …
|
2019-03-29 21:46:21 +01:00 |
Martin Kroeker
|
7c51cc8527
|
Merge branch 'develop' into develop
|
2019-03-29 19:36:29 +01:00 |
AbdelRauf
|
853a18bc17
|
power9 makefile. dgemm based on the power8 kernel with the following changes: 32x unrolled 16x4 kernel and 8x4 kernel using (lxv stxv butterfly rank1 update). Improvement from 17 to 22-23 GFLOPS. dtrmm cases were added into dgemm itself.
|
2019-03-29 15:49:40 +00:00 |
Martin Kroeker
|
3ae122e2c7
|
Merge pull request #2069 from aixoss/aix-asm-change
AIX asm syntax changes needed for shared object creation
|
2019-03-25 21:34:30 +01:00 |
Ayappan P
|
b043a5962e
|
AIX asm syntax changes needed for shared object creation
|
2019-03-25 18:53:25 +05:30 |
Martin Kroeker
|
8502030e5e
|
Merge pull request #2064 from embray/cygwin/use-tls-thread-memory-cleanup
Fix for #2063
|
2019-03-19 22:12:51 +01:00 |
Erik M. Bray
|
8ba9e2a61a
|
Also call CloseHandle on each thread, as well as on the event, so as not to leak thread handles.
|
2019-03-19 11:21:44 +01:00 |
Erik M. Bray
|
4ad694eda1
|
Fix for #2063: The DllMain used in Cygwin did not run the thread memory
pool cleanup upon THREAD_DETACH, which is needed when compiled with
USE_TLS=1.
|
2019-03-19 09:26:50 +01:00 |