Rashmica Gupta
bcdf1d4917
Add in runtime CPU detection for POWER.
2019-04-09 14:20:16 +10:00
Martin Kroeker
e06b8438b4
Merge pull request #2080 from martin-frbg/issue2075
Add -lm and disable EXPRECISION support on *BSD
2019-04-02 21:40:58 +02:00
Martin Kroeker
9229d6859b
Add -lm and disable EXPRECISION support on *BSD
fixes #2075
2019-04-02 09:38:18 +02:00
Martin Kroeker
21d146a8de
Add declarations for ?sum
2019-03-31 22:12:23 +02:00
Martin Kroeker
7f4e36d219
Merge pull request #2073 from martin-frbg/issue2056-2
Detect 32bit environment on 64bit ARM hardware
2019-03-31 13:56:08 +02:00
Martin Kroeker
c04a729081
Add ?sum definitions for generic kernel
2019-03-31 13:55:49 +02:00
Martin Kroeker
100d94f94e
Add ?sum
2019-03-31 13:55:05 +02:00
Martin Kroeker
d17da6c6a4
Add cmake defaults for ?sum kernels
2019-03-31 11:57:01 +02:00
Martin Kroeker
1679de5e59
Detect 32bit environment on 64bit ARM hardware
for #2056, using the same approach as #2058
2019-03-31 10:50:43 +02:00
Martin Kroeker
246ca29679
Add ZARCH implementation of ?sum
as trivial copies of the respective ?asum kernels with the ABS and vflpsb calls removed
2019-03-30 22:49:05 +01:00
Martin Kroeker
9d717cb5ee
Add x86_64 implementation of ?sum
as trivial copy of ?asum with the fabs calls removed
2019-03-30 22:27:04 +01:00
Martin Kroeker
e3bc83f2a8
Add x86 implementation of ?sum
as trivial copy of ?asum with the fabs calls removed
2019-03-30 22:26:10 +01:00
Martin Kroeker
70f2a4e0d7
Add SPARC implementation of ?sum
as trivial copy of ?asum with the fabs replaced by fmov to preserve code structure
2019-03-30 22:25:06 +01:00
Martin Kroeker
706dfe263b
Add POWER implementation of ?sum
as trivial copy of ?asum with the fabs replaced by fmr to preserve code structure
2019-03-30 22:23:42 +01:00
Martin Kroeker
688fa9201c
Add MIPS64 implementation of ?sum
as trivial copy of ?asum with the fabs replaced by mov to preserve code structure
2019-03-30 22:22:15 +01:00
Martin Kroeker
cdbe0f0235
Add MIPS implementation of ?sum
as trivial copy of ?asum with the fabs calls removed
2019-03-30 22:20:14 +01:00
Martin Kroeker
f8b82bc6dc
Add ia64 implementation of ?sum
as trivial copy of asum with the fabs calls removed
2019-03-30 22:18:03 +01:00
Martin Kroeker
3e3ccb9011
Add ARM64 implementations of ?sum
as trivial copies of the respective ?asum kernels with the fabs calls removed
2019-03-30 22:13:36 +01:00
Martin Kroeker
94ab4e6fb2
Add ARM implementations of ?sum
(trivial copies of the respective ?asum with the fabs calls removed)
2019-03-30 22:11:38 +01:00
Martin Kroeker
c3cfc6986b
Add implementations of ssum/dsum and csum/zsum
as trivial copies of asum/zsasum with the fabs calls replaced by fmov to preserve code structure
2019-03-30 22:05:11 +01:00
Martin Kroeker
b9f4943a14
Add ?sum
2019-03-30 22:01:13 +01:00
Martin Kroeker
79cfc24a62
Add interface for ?sum (derived from ?asum)
2019-03-30 21:59:18 +01:00
Martin Kroeker
5c42287c4f
Add declarations for ?sum and cblas_?sum
2019-03-30 21:58:03 +01:00
Martin Kroeker
32c7063cb0
Merge pull request #2061 from martin-frbg/martin-frbg-patch-1
Disable the AVX512 DGEMM kernel (again)
2019-03-30 21:21:38 +01:00
Martin Kroeker
c19a449096
Merge pull request #2071 from martin-frbg/issue2068
Provide CBLAS interfaces to I?MIN and I?MAX
2019-03-30 14:54:28 +01:00
Martin Kroeker
3d1e36d4cb
Build CBLAS interfaces for I?MIN and I?MAX
2019-03-30 12:38:41 +01:00
Martin Kroeker
4f9d3e4b28
Expose CBLAS interfaces for I?MIN and I?MAX
2019-03-30 12:37:13 +01:00
Martin Kroeker
4dec151d0b
Merge pull request #2070 from quickwritereader/develop
power9 makefile. dgemm based on power8 kernel with following changes …
2019-03-29 21:46:21 +01:00
Martin Kroeker
7c51cc8527
Merge branch 'develop' into develop
2019-03-29 19:36:29 +01:00
AbdelRauf
853a18bc17
power9 makefile. dgemm based on power8 kernel with the following changes: 32x unrolled 16x4 kernel and 8x4 kernel using (lxv stxv butterfly rank1 update). improvement from 17 to 22-23 gflops. dtrmm cases were added into dgemm itself
2019-03-29 15:49:40 +00:00
Martin Kroeker
3ae122e2c7
Merge pull request #2069 from aixoss/aix-asm-change
AIX asm syntax changes needed for shared object creation
2019-03-25 21:34:30 +01:00
Ayappan P
b043a5962e
AIX asm syntax changes needed for shared object creation
2019-03-25 18:53:25 +05:30
Martin Kroeker
8502030e5e
Merge pull request #2064 from embray/cygwin/use-tls-thread-memory-cleanup
Fix for #2063
2019-03-19 22:12:51 +01:00
Erik M. Bray
8ba9e2a61a
Also call CloseHandle on each thread, as well as on the event so as to not leak thread handles.
2019-03-19 11:21:44 +01:00
Erik M. Bray
4ad694eda1
Fix for #2063: The DllMain used in Cygwin did not run the thread memory pool cleanup upon THREAD_DETACH, which is needed when compiled with USE_TLS=1.
2019-03-19 09:26:50 +01:00
Martin Kroeker
dff4a197a5
Merge pull request #2058 from xsacha/patch-3
Change 64-bit detection as explained in #2056
2019-03-16 11:57:23 +01:00
Martin Kroeker
a5425575b1
Merge pull request #2060 from embray/cygwin/readenv
Use POSIX getenv on Cygwin
2019-03-16 11:56:51 +01:00
Erik M. Bray
1006ff8a7b
Use POSIX getenv on Cygwin
The Windows-native GetEnvironmentVariable cannot be relied on, as
Cygwin does not always copy environment variables set through Cygwin
to the Windows environment block, particularly after fork().
2019-03-15 15:06:30 +01:00
Martin Kroeker
e608d4f7fe
Disable the AVX512 DGEMM kernel (again)
Due to as yet unresolved errors seen in #1955 and #2029
2019-03-13 22:10:28 +01:00
Martin Kroeker
4fc17d0d75
Trivial typo fix
as suggested in #2022
2019-03-13 19:20:23 +01:00
Sacha
c3e30b2bc2
Change 64-bit detection as explained in #2056
2019-03-13 23:21:54 +10:00
Martin Kroeker
03d7110900
Merge pull request #2042 from maomao194313/develop
add TARGET support for HiSilicon tsv110 CPUs
2019-03-12 22:57:39 +01:00
Martin Kroeker
3ce28fb81a
Merge pull request #2055 from martin-frbg/atomid
Add CPUID data for Intel Denverton (as Nehalem)
2019-03-12 22:57:07 +01:00
Martin Kroeker
04f2226ea6
Add Intel Denverton
2019-03-12 16:09:55 +01:00
Martin Kroeker
b1393c7a97
Add Intel Denverton
for #2048
2019-03-12 16:03:56 +01:00
maomao194313
7e3eb9b25d
make DYNAMIC_ARCH=1 package work on TSV110
2019-03-12 16:11:01 +08:00
maomao194313
f074d7d146
make DYNAMIC_ARCH=1 package work on TSV110.
2019-03-12 16:05:19 +08:00
Martin Kroeker
f18ab6c17b
Merge pull request #2051 from martin-frbg/issue2048
Make TARGET=GENERIC compatible with DYNAMIC_ARCH=1
2019-03-09 16:39:35 +01:00
Martin Kroeker
946ec6c3b8
Merge pull request #2050 from kencu/PowerMacFix
PowerMac 970 fixes
2019-03-09 16:39:08 +01:00
Martin Kroeker
5b95534afc
Make TARGET=GENERIC compatible with DYNAMIC_ARCH=1
for issue #2048
2019-03-09 11:21:16 +01:00