297 Commits

Author SHA1 Message Date
Martin Kroeker
e388459a27 Merge pull request #1419 from brada4/develop
Initialize uninitialized values for repeated calls
2018-01-31 23:48:34 +01:00
Andrew
e5752ff9b3 take out unused variables 2018-01-20 11:42:31 +01:00
Andrew
8a0b086b28 add missing bracket for old glibc (cppcheck) 2018-01-12 22:35:48 +01:00
Martin Kroeker
42285d8e70 Merge pull request #1410 from brada4/develop
Address warnings #1357
2018-01-06 20:02:46 +01:00
Andrew
8aafa0473c address last warnings as seen by gcc7 2018-01-01 20:57:12 +01:00
Andrew
11a627c54e remove surplus parentheses to silence clang5 2018-01-01 20:56:26 +01:00
Martin Kroeker
cc9500db41 Merge pull request #1403 from brada4/develop
Address few more warnings
2017-12-30 14:51:34 +01:00
Andrew
bfc2a88594 remove unused buffer 2017-12-22 00:55:40 +01:00
Martin Kroeker
177b78c8b4 Issue1388 (#1389)
* Calculation of chunk range limits was ignoring num_cpu

bug introduced by me in #1262 - should fix #1388

* Calculation of range limits was ignoring num_cpu

bug introduced by me in #1262
2017-12-09 22:29:03 +01:00
Andrew
281a2b952f warning cleanup (#1380)
* dead increments in driver/level2

* dead increments in kernel/generic

* part dead increments in kernel/x86_64
2017-12-05 19:54:10 +01:00
Martin Kroeker
c49c6b237d Merge pull request #1382 from martin-frbg/dtrmv-1332
Work around errors in multithreaded dtrmv
2017-12-05 19:53:23 +01:00
Martin Kroeker
28ae3ca76f Limit MAX_CPU to 1024 for now
Some Linux distributions (notably SuSE) have raised CPU_SETSIZE to 4096, apparently disregarding API limitations.
From #1348, the highest value to survive array initialization (on a desktop system) is 3232, and 1024, which is the more usual CPU_SETSIZE limit, was demonstrated to work fine on an actual bignuma system.
2017-12-05 12:54:15 +01:00
Martin Kroeker
b414283f48 Disable gemv unrolling
as a (hopefully temporary) workaround for #1332
2017-12-03 22:41:54 +01:00
Andrew
ef95cd471f eliminate unread variable, after reiteration 3 of them (clang4) 2017-11-25 02:54:37 +01:00
Andrew
e14d50d86e eliminate Wunused-const gcc7 warning 2017-11-24 19:13:24 +01:00
Martin Kroeker
07e7c36dac Handle shmem init failures in cpu affinity setup code
Failures to obtain or attach shared memory segments would lead to an exit without explanation of the exact cause.
This change introduces a more verbose error message and tries to make the code continue without setting cpu affinity.
Fixes #1351
2017-11-18 23:57:44 +01:00
Martin Kroeker
2a6fef9a55 Try to handle shmget or shmat failing
also replaces one verbatim sched_yield with the YIELDING macro for consistency as suggested in #1351
2017-11-09 23:16:13 +01:00
Martin Kroeker
db72ad8f6a Merge pull request #1320 from timmoon10/develop
2D thread distribution for multi-threaded GEMMs
2017-10-08 23:31:33 +02:00
Martin Kroeker
514d237257 Merge pull request #1279 from xsacha/develop
CMake improvements
2017-10-06 21:13:45 +02:00
Tim Moon
30486a356c Reduce number of data partitions in n. 2017-10-04 12:37:49 -07:00
Tim Moon
9de52b489a Cleaning up and documenting multi-threaded GEMM code. 2017-10-03 16:32:08 -07:00
Tim Moon
860dcfc703 Use 2D thread distribution for small GEMMs.
Allows maximum use of available cores if one of M and N is small and the other is large.
2017-10-03 13:43:39 -07:00
Tim Moon
6aaa107865 Reducing threads for multi-threaded GEMMs on small matrices. 2017-09-27 19:25:33 -07:00
Martin Kroeker
ba1f91f17b Convert another caller of "allocation" to LOCK_COMMAND
... as the "allocation" code jumped to now does UNLOCK_COMMAND instead of blas_unlock
2017-09-09 20:30:33 +02:00
Martin Kroeker
f460776f0f Fix thread data races 2017-09-09 19:07:06 +02:00
Martin Kroeker
e882f3d6f3 Fix thread data race in memory.c 2017-09-09 18:58:38 +02:00
Sacha Refshauge
37858d1146 Fix threading usage in CMake: s/SMP/USE_THREAD/ 2017-08-19 15:07:42 +10:00
Isuru Fernando
2f12ea017b No strncasecmp with MSVC 2017-08-08 00:07:25 +05:30
Martin Kroeker
719fcc56b0 Merge pull request #1262 from martin-frbg/xmv_thread-splitting
Make sure that range limit of last thread never exceeds data size
2017-08-06 14:11:44 +02:00
Martin Kroeker
ebb04e3265 Merge pull request #1259 from isuruf/cmake
CMake Improvements
2017-08-02 15:31:05 +02:00
Martin Kroeker
0ba64cee60 Update trmv_thread.c 2017-08-02 12:03:54 +02:00
Martin Kroeker
c4e5ba1bfe Make sure that range_n of last thread never exceeds the actual data size when splitting the workload 2017-08-02 00:37:58 +02:00
Martin Kroeker
a6f533b248 Revert "Fix calculated range limit exceeding actual data size for last thread" 2017-08-01 19:28:08 +02:00
Isuru Fernando
d245caa49a Support out-of-source build 2017-08-01 15:16:14 +05:30
Martin Kroeker
e70a6b92bf Merge pull request #1257 from martin-frbg/cgroups-prereq
Rework __GLIBC_PREREQ checks to avoid breaking non-glibc builds
2017-08-01 11:23:03 +02:00
Martin Kroeker
63cfa32691 Rework __GLIBC_PREREQ checks to avoid breaking non-glibc builds 2017-07-31 21:02:43 +02:00
Martin Kroeker
585c0010a5 Fix range limit exceeding actual data size in last step 2017-07-28 00:27:02 +02:00
Martin Kroeker
857f61bc5d Fix range limit exceeding data size in last step 2017-07-28 00:21:53 +02:00
Martin Kroeker
9332042d5f Fix range exceeding actual data size in quick_divide 2017-07-28 00:13:24 +02:00
Martin Kroeker
c4af196a2d Honor cgroup/cpuset limits when enumerating cpus 2017-07-25 22:47:34 +02:00
Martin Kroeker
480e697681 Revert "Honor cgroup/cpuset limits when enumerating cpus" (#1246) 2017-07-24 16:17:50 +02:00
Martin Kroeker
80373ea039 More fixes for silly misedits 2017-07-15 12:48:42 +02:00
Martin Kroeker
d12b75a6c4 Fixup braces lost in previous edit 2017-07-15 11:53:28 +02:00
Martin Kroeker
7294fb1d9d Merge branch 'develop' into cgroups 2017-07-15 10:40:42 +02:00
Zhang Xianyi
2a7c6930ac Merge pull request #1234 from brada4/develop
Fix write past fixed size buffer
2017-07-13 20:27:37 +08:00
Andrew
529bfc36ec Fix write past fixed size buffer 2017-07-12 00:59:30 +02:00
Martin Kroeker
731c518cff Add files via upload 2017-07-11 18:42:39 +02:00
Martin Kroeker
29fc429d9a Honor cgroup/cpuset constraints when enumerating cpus 2017-07-11 18:27:33 +02:00
Martin Kroeker
3db2adf872 Merge pull request #1230 from martin-frbg/rhel5
Add sched_getcpu implementation for pre-2.6 glibc
2017-07-09 13:16:16 +02:00
Martin Kroeker
c1cf62d2c0 Add sched_getcpu implementation for pre-2.6 glibc
Fixes #1210, compilation on RHEL5 with affinity enabled
2017-07-09 09:45:38 +02:00