Commit Graph

3830 Commits

Author SHA1 Message Date
Rashmica Gupta bcdf1d4917 Add in runtime CPU detection for POWER. 2019-04-09 14:20:16 +10:00
Martin Kroeker e06b8438b4
Merge pull request #2080 from martin-frbg/issue2075
Add -lm and disable EXPRECISION support on *BSD
2019-04-02 21:40:58 +02:00
Martin Kroeker 9229d6859b
Add -lm and disable EXPRECISION support on *BSD
fixes #2075
2019-04-02 09:38:18 +02:00
Martin Kroeker 21d146a8de
Add declarations for ?sum 2019-03-31 22:12:23 +02:00
Martin Kroeker 7f4e36d219
Merge pull request #2073 from martin-frbg/issue2056-2
Detect 32bit environment on 64bit ARM hardware
2019-03-31 13:56:08 +02:00
Martin Kroeker c04a729081
Add ?sum definitions for generic kernel 2019-03-31 13:55:49 +02:00
Martin Kroeker 100d94f94e
Add ?sum 2019-03-31 13:55:05 +02:00
Martin Kroeker d17da6c6a4
Add cmake defaults for ?sum kernels 2019-03-31 11:57:01 +02:00
Martin Kroeker 1679de5e59
Detect 32bit environment on 64bit ARM hardware
for #2056, using same approach as #2058
2019-03-31 10:50:43 +02:00
Martin Kroeker 246ca29679
Add ZARCH implementation of ?sum
as trivial copies of the respective ?asum kernels with the ABS and vflpsb calls removed
2019-03-30 22:49:05 +01:00
Martin Kroeker 9d717cb5ee
Add x86_64 implementation of ?sum
as trivial copy of ?asum with the fabs calls removed
2019-03-30 22:27:04 +01:00
Martin Kroeker e3bc83f2a8
Add x86 implementation of ?sum
as trivial copy of ?asum with the fabs calls removed
2019-03-30 22:26:10 +01:00
Martin Kroeker 70f2a4e0d7
Add SPARC implementation of ?sum
as trivial copy of ?asum with the fabs replaced by fmov to preserve code structure
2019-03-30 22:25:06 +01:00
Martin Kroeker 706dfe263b
Add POWER implementation of ?sum
as trivial copy of ?asum with the fabs replaced by fmr to preserve code structure
2019-03-30 22:23:42 +01:00
Martin Kroeker 688fa9201c
Add MIPS64 implementation of ?sum
as trivial copy of ?asum with the fabs replaced by mov to preserve code structure
2019-03-30 22:22:15 +01:00
Martin Kroeker cdbe0f0235
Add MIPS implementation of ?sum
as trivial copy of ?asum with the fabs calls removed
2019-03-30 22:20:14 +01:00
Martin Kroeker f8b82bc6dc
Add ia64 implementation of ?sum
as trivial copy of asum with the fabs calls removed
2019-03-30 22:18:03 +01:00
Martin Kroeker 3e3ccb9011
Add ARM64 implementations of ?sum
as trivial copies of the respective ?asum kernels with the fabs calls removed
2019-03-30 22:13:36 +01:00
Martin Kroeker 94ab4e6fb2
Add ARM implementations of ?sum
(trivial copies of the respective ?asum with the fabs calls removed)
2019-03-30 22:11:38 +01:00
Martin Kroeker c3cfc6986b
Add implementations of ssum/dsum and csum/zsum
as trivial copies of asum/zsasum with the fabs calls replaced by fmov to preserve code structure
2019-03-30 22:05:11 +01:00
Martin Kroeker b9f4943a14
Add ?sum 2019-03-30 22:01:13 +01:00
Martin Kroeker 79cfc24a62
Add interface for ?sum (derived from ?asum) 2019-03-30 21:59:18 +01:00
Martin Kroeker 5c42287c4f
Add declarations for ?sum and cblas_?sum 2019-03-30 21:58:03 +01:00
Martin Kroeker 32c7063cb0
Merge pull request #2061 from martin-frbg/martin-frbg-patch-1
Disable the AVX512 DGEMM kernel (again)
2019-03-30 21:21:38 +01:00
Martin Kroeker c19a449096
Merge pull request #2071 from martin-frbg/issue2068
Provide CBLAS interfaces to I?MIN and I?MAX
2019-03-30 14:54:28 +01:00
Martin Kroeker 3d1e36d4cb
Build CBLAS interfaces for I?MIN and I?MAX 2019-03-30 12:38:41 +01:00
Martin Kroeker 4f9d3e4b28
Expose CBLAS interfaces for I?MIN and I?MAX 2019-03-30 12:37:13 +01:00
Martin Kroeker 4dec151d0b
Merge pull request #2070 from quickwritereader/develop
power9 makefile. dgemm based on power8 kernel with following changes …
2019-03-29 21:46:21 +01:00
Martin Kroeker 7c51cc8527
Merge branch 'develop' into develop 2019-03-29 19:36:29 +01:00
AbdelRauf 853a18bc17 power9 makefile. dgemm based on power8 kernel with following changes : 32x unrolled 16x4 kernel and 8x4 kernel using (lxv stxv butterfly rank1 update). improvement from 17 to 22-23gflops. dtrmm cases were added into dgemm itself 2019-03-29 15:49:40 +00:00
Martin Kroeker 3ae122e2c7
Merge pull request #2069 from aixoss/aix-asm-change
AIX asm syntax changes needed for shared object creation
2019-03-25 21:34:30 +01:00
Ayappan P b043a5962e AIX asm syntax changes needed for shared object creation 2019-03-25 18:53:25 +05:30
Martin Kroeker 8502030e5e
Merge pull request #2064 from embray/cygwin/use-tls-thread-memory-cleanup
Fix for #2063
2019-03-19 22:12:51 +01:00
Erik M. Bray 8ba9e2a61a Also call CloseHandle on each thread, as well as on the event so as to not leak thread handles. 2019-03-19 11:21:44 +01:00
Erik M. Bray 4ad694eda1 Fix for #2063: The DllMain used in Cygwin did not run the thread memory
pool cleanup upon THREAD_DETACH which is needed when compiled with
USE_TLS=1.
2019-03-19 09:26:50 +01:00
Martin Kroeker dff4a197a5
Merge pull request #2058 from xsacha/patch-3
Change 64-bit detection as explained in #2056
2019-03-16 11:57:23 +01:00
Martin Kroeker a5425575b1
Merge pull request #2060 from embray/cygwin/readenv
Use POSIX getenv on Cygwin
2019-03-16 11:56:51 +01:00
Erik M. Bray 1006ff8a7b Use POSIX getenv on Cygwin
The Windows-native GetEnvironmentVariable cannot be relied on, as
Cygwin does not always copy environment variables set through Cygwin
to the Windows environment block, particularly after fork().
2019-03-15 15:06:30 +01:00
Martin Kroeker e608d4f7fe
Disable the AVX512 DGEMM kernel (again)
Due to as yet unresolved errors seen in #1955 and #2029
2019-03-13 22:10:28 +01:00
Martin Kroeker 4fc17d0d75
Trivial typo fix
as suggested in #2022
2019-03-13 19:20:23 +01:00
Sacha c3e30b2bc2
Change 64-bit detection as explained in #2056 2019-03-13 23:21:54 +10:00
Martin Kroeker 03d7110900
Merge pull request #2042 from maomao194313/develop
add TARGET support for HiSilicon tsv110 CPUs
2019-03-12 22:57:39 +01:00
Martin Kroeker 3ce28fb81a
Merge pull request #2055 from martin-frbg/atomid
Add CPUID data for Intel Denverton (as Nehalem)
2019-03-12 22:57:07 +01:00
Martin Kroeker 04f2226ea6
Add Intel Denverton 2019-03-12 16:09:55 +01:00
Martin Kroeker b1393c7a97
Add Intel Denverton
for #2048
2019-03-12 16:03:56 +01:00
maomao194313 7e3eb9b25d
make DYNAMIC_ARCH=1 package work on TSV110 2019-03-12 16:11:01 +08:00
maomao194313 f074d7d146
make DYNAMIC_ARCH=1 package work on TSV110. 2019-03-12 16:05:19 +08:00
Martin Kroeker f18ab6c17b
Merge pull request #2051 from martin-frbg/issue2048
Make TARGET=GENERIC compatible with DYNAMIC_ARCH=1
2019-03-09 16:39:35 +01:00
Martin Kroeker 946ec6c3b8
Merge pull request #2050 from kencu/PowerMacFix
PowerMac 970 fixes
2019-03-09 16:39:08 +01:00
Martin Kroeker 5b95534afc
Make TARGET=GENERIC compatible with DYNAMIC_ARCH=1
for issue #2048
2019-03-09 11:21:16 +01:00