213 Commits

Author SHA1 Message Date
Andrew
e5752ff9b3 take out unused variables 2018-01-20 11:42:31 +01:00
Andrew
8a0b086b28 add missing bracket for old glibc (cppcheck) 2018-01-12 22:35:48 +01:00
Andrew
8aafa0473c address last warnings as seen by gcc7 2018-01-01 20:57:12 +01:00
Andrew
bfc2a88594 remove unused buffer 2017-12-22 00:55:40 +01:00
Martin Kroeker
28ae3ca76f Limit MAX_CPU to 1024 for now
Some Linux distributions (notably SuSE) have raised CPU_SETSIZE to 4096, apparently disregarding API limitations.
According to #1348, the highest value to survive array initialization (on a desktop system) is 3232, and 1024,
the more usual CPU_SETSIZE limit, was demonstrated to work fine on an actual big NUMA system.
2017-12-05 12:54:15 +01:00
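
The capping idea described in this commit can be illustrated with a minimal sketch. The names `SKETCH_MAX_CPU` and `clamp_cpu_count` are hypothetical and chosen only for illustration; the actual change may simply redefine a macro rather than clamp at run time.

```c
#include <stddef.h>

/* Hypothetical cap mirroring the commit's choice of 1024; the real
   definition in the source tree may be spelled differently. */
#define SKETCH_MAX_CPU 1024

/* Clamp a reported CPU-set size to the cap so that statically sized
   arrays indexed by CPU number stay within bounds. */
static size_t clamp_cpu_count(size_t reported)
{
    return reported > SKETCH_MAX_CPU ? SKETCH_MAX_CPU : reported;
}
```

With this sketch, a distribution that reports 4096 would be treated as 1024, while smaller values pass through unchanged.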
Martin Kroeker
07e7c36dac Handle shmem init failures in cpu affinity setup code
Failures to obtain or attach shared memory segments would lead to an exit without explanation of the exact cause.
This change introduces a more verbose error message and tries to make the code continue without setting cpu affinity.
Fixes #1351
2017-11-18 23:57:44 +01:00
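
A rough sketch of the failure handling this commit describes, assuming the System V calls `shmget` and `shmat` named in the related commit below. The helper name `try_attach_shared`, the message wording, and the cleanup on `shmat` failure are assumptions for illustration, not the project's actual code.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: returns the attached segment, or NULL when the
   caller should carry on without setting cpu affinity. */
static void *try_attach_shared(key_t key, size_t size, int *shmid_out)
{
    int id = shmget(key, size, IPC_CREAT | 0600);
    if (id == -1) {
        /* Report the exact cause instead of exiting silently. */
        fprintf(stderr, "shmget failed (%s); continuing without cpu affinity\n",
                strerror(errno));
        return NULL;
    }

    void *addr = shmat(id, NULL, 0);
    if (addr == (void *)-1) {
        fprintf(stderr, "shmat failed (%s); continuing without cpu affinity\n",
                strerror(errno));
        shmctl(id, IPC_RMID, NULL);  /* sketch-only cleanup of the unused segment */
        return NULL;
    }

    *shmid_out = id;
    return addr;
}
```

A caller would check for NULL and skip the affinity setup in that case, and otherwise detach with `shmdt` when finished.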
Martin Kroeker
2a6fef9a55 Try to handle shmget or shmat failing
Also replaces one verbatim sched_yield with the YIELDING macro for consistency, as suggested in #1351
2017-11-09 23:16:13 +01:00
Martin Kroeker
514d237257 Merge pull request #1279 from xsacha/develop
CMake improvements
2017-10-06 21:13:45 +02:00
Martin Kroeker
ba1f91f17b Convert another caller of "allocation" to LOCK_COMMAND
... as the "allocation" code jumped to now does UNLOCK_COMMAND instead of blas_unlock
2017-09-09 20:30:33 +02:00
Martin Kroeker
f460776f0f Fix thread data races 2017-09-09 19:07:06 +02:00
Martin Kroeker
e882f3d6f3 Fix thread data race in memory.c 2017-09-09 18:58:38 +02:00
Sacha Refshauge
37858d1146 Fix threading usage in CMake: s/SMP/USE_THREAD/ 2017-08-19 15:07:42 +10:00
Isuru Fernando
2f12ea017b No strncasecmp with MSVC 2017-08-08 00:07:25 +05:30
Martin Kroeker
ebb04e3265 Merge pull request #1259 from isuruf/cmake
CMake Improvements
2017-08-02 15:31:05 +02:00
Isuru Fernando
d245caa49a Support out-of-source build 2017-08-01 15:16:14 +05:30
Martin Kroeker
63cfa32691 Rework __GLIBC_PREREQ checks to avoid breaking non-glibc builds 2017-07-31 21:02:43 +02:00
Martin Kroeker
c4af196a2d Honor cgroup/cpuset limits when enumerating cpus 2017-07-25 22:47:34 +02:00
Martin Kroeker
480e697681 Revert "Honor cgroup/cpuset limits when enumerating cpus" (#1246) 2017-07-24 16:17:50 +02:00
Martin Kroeker
80373ea039 More fixes for silly misedits 2017-07-15 12:48:42 +02:00
Martin Kroeker
d12b75a6c4 Fixup braces lost in previous edit 2017-07-15 11:53:28 +02:00
Martin Kroeker
7294fb1d9d Merge branch 'develop' into cgroups 2017-07-15 10:40:42 +02:00
Martin Kroeker
731c518cff Add files via upload 2017-07-11 18:42:39 +02:00
Martin Kroeker
29fc429d9a Honor cgroup/cpuset constraints when enumerating cpus 2017-07-11 18:27:33 +02:00
Martin Kroeker
3db2adf872 Merge pull request #1230 from martin-frbg/rhel5
Add sched_getcpu implementation for pre-2.6 glibc
2017-07-09 13:16:16 +02:00
Martin Kroeker
c1cf62d2c0 Add sched_getcpu implementation for pre-2.6 glibc
Fixes #1210, compilation on RHEL5 with affinity enabled
2017-07-09 09:45:38 +02:00
Neil Shipp
34513be726 Add Microsoft Windows 10 UWP build support 2017-06-23 13:07:34 -07:00
Neil Shipp
65e56cb29d Add 64bit support for Microsoft Visual Studio 2017-06-21 13:38:22 -07:00
James Cowgill
59c97cfee4 memory: Fix buffer overflow when position == NUM_BUFFERS 2017-05-05 17:47:03 +01:00
James Cowgill
5fecfe0f42 memory: switch loop condition around in blas_memory_free
Before this commit, the "position < NUM_BUFFERS" loop condition from
blas_memory_free will be completely optimized away by GCC. This is
because the condition can only be false after undefined behavior has
already been invoked (reading past the end of an array). As a
consequence of this bug, GCC also removes the subsequent if statement
and all the code after the error label because all of it is dead.

This commit switches the loop condition around so it works as intended.
2017-05-05 16:01:58 +01:00
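
The ordering problem described in this commit message can be sketched as follows. This is one plausible reading of "switching the loop condition around", not the verbatim patch: the array name `memory`, the struct layout, the small `NUM_BUFFERS` value, and the helper `find_slot` are all assumptions made for illustration.

```c
#include <stddef.h>

#define NUM_BUFFERS 4  /* illustrative size only; the real value differs */

struct slot {
    void *addr;
};

static struct slot memory[NUM_BUFFERS];  /* hypothetical buffer table */

/* Returns the index whose addr matches free_area, or -1 if none does. */
static int find_slot(const void *free_area)
{
    int position = 0;

    /* Bounds check first: memory[position] is only read once position is
       known to be in range, so the compiler cannot assume the check is
       redundant and discard it. With the operands in the opposite order,
       the array read at position == NUM_BUFFERS is undefined behavior. */
    while (position < NUM_BUFFERS && memory[position].addr != free_area)
        position++;

    return position < NUM_BUFFERS ? position : -1;
}
```

Because `&&` short-circuits, the out-of-range read never happens, and the "not found" path after the loop remains reachable.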
Gian-Carlo Pascutto
9c884986ad Add an extra family/model combination used by AMD Steamroller (Godavari). 2017-04-19 19:15:47 +02:00
Gian-Carlo Pascutto
0cbd2d34e4 Recognize ZEN when passed as OPENBLAS_CORETYPE. 2017-04-10 20:05:16 +02:00
Gian-Carlo Pascutto
62979fd104 Fix dynamic detection for ZEN CPUs. 2017-04-10 19:08:37 +02:00
Denis Steckelmacher
c9ff735da6 Add ZEN support (tested for auto-detected static backend) 2017-03-19 15:32:50 +01:00
Martin Kroeker
ffc1d6c468 Merge pull request #1108 from ashwinyes/develop_20170203_thunderx2t99
Optimized Implementations for ThunderX2T99
2017-02-28 16:02:19 +01:00
Ashwin Sekhar T K
a86474c6f7 THUNDERX2T99: Performance fix for ZGEMM 2017-02-28 06:05:00 -08:00
Ashwin Sekhar T K
19ba133383 THUNDERX2T99: Add Optimized ZGEMM Implementation 2017-02-28 05:31:41 +00:00
Andrew
5088523786 detect apollo lake for real 2017-02-20 23:54:59 +01:00
Elliot Saba
1d8ab99e09 Add exfamily == 9 case (Kaby Lake) to dynamic arch detection 2017-02-10 15:23:55 -08:00
Martin Koehler
76c6e33e54 Enable EXCAVATOR kernels for A12-9800 2017-02-07 21:38:28 +01:00
Ashwin Sekhar T K
2757b49767 THUNDERX2T99: Add Optimized CGEMM Implementation 2017-01-30 17:44:26 +05:30
Ashwin Sekhar T K
f279ff4789 THUNDERX2T99: Add Optimized SGEMM Implementation 2017-01-16 21:44:33 +05:30
Zhang Xianyi
0863a0d4b4 Merge pull request #1061 from ashwinyes/develop_aarch64_vulcan_thunderx_patch
Add new targets for ARM64
2017-01-16 13:20:10 +08:00
Werner Saar
c1c5a63d3c prepared parameter.c for UNROLL values that are not a power of two 2017-01-11 09:50:28 +01:00
Ashwin Sekhar T K
4b55fae337 ARM64: Add Cavium THUNDERX2T99 Target 2017-01-11 11:18:40 +05:30
Ashwin Sekhar T K
0b8e876d89 VULCAN: Add optimized DGEMM implementation 2017-01-10 15:01:37 +05:30
Ashwin Sekhar T K
4713e7c47f ARM64: Add the VULCAN Target 2017-01-10 15:01:17 +05:30
jiahaipeng
1aa1e6cb54 modify blas_l1_thread.c to support multi-threading for L1 functions with a return value 2017-01-10 11:47:06 +08:00
Martin Kroeker
51aa157e64 Relocate declaration of alloc_lock outside ifdef block 2017-01-09 01:10:43 +01:00
Martin Kroeker
87c7d10b34 Fix thread data races detected by helgrind 3.12
Ref. #995; may help solve issues seen in #660 and #883
2017-01-08 23:33:51 +01:00
Martin Kroeker
0ef7841473 Update xerbla.c 2017-01-04 23:16:48 +01:00