Commit Graph

16 Commits

Author SHA1 Message Date
Chen, Guobing 0c1c903f1e Fix OMP num specify issue
In current code, no matter what number of threads specified, all
available CPU count is used when invoking OMP, which leads to very bad
performance if the workload is small while all available CPUs are big.
Lots of time are wasted on inter-thread sync. Fix this issue by really
using the number specified by the variable 'num' from calling API.

Signed-off-by: Chen, Guobing <guobing.chen@intel.com>
2020-08-24 02:45:54 +08:00
Martin Kroeker 791e046744
Update conditional for atomics to use HAVE_C11 2020-07-18 17:05:59 +00:00
Martin Kroeker 47bf0dba8f
Add build-time option for OMP scheduler; document MULTITHREAD_THRESHOLD range (#1620)
* Allow choosing the OpenMP scheduler and add range hint for GEMM_MULTITHREAD_THRESHOLD
* Amended description of GEMM_MULTITHREAD_THRESHOLD
to reflect #742 making it track floating point operations rather than matrix size
2018-06-15 11:25:05 +02:00
zhiyong.dang 53457f222f move _Atomic define to common.h 2018-05-11 00:13:16 -07:00
Zhiyong Dang 3716267124 Change _STDC_VERSION__ to __STDC_VERSION__
Change-Id: Id3fa4e8d9eedd4ef7230df69b611e7f397301a42
2018-05-11 12:15:08 +08:00
Zhiyong Dang 1b83341d19 Fix race condition in blas_server_omp.c
Change-Id: Ic896276cd073d6b41930c7c5a29d66348cd1725d
2018-04-27 17:00:42 +08:00
Timothy Gu 6c2ead30f0 Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
Olivier Grisel 046e4013cb Revert "Refs #294. Used pthread_atfork to avoid hang after a Unix fork."
This reverts commit 3617c22a56.
2014-02-19 18:32:54 +01:00
Zhang Xianyi 3617c22a56 Refs #294. Used pthread_atfork to avoid hang after a Unix fork.
The problem is the mutex we used in blas_server. Thus, we must clear
the mutex before the fork and re-init them at parent and child process.

If you used OpenMP, GOMP has the same problem by now. Please try other OpenMP
implemantation.
2014-02-18 15:36:04 +08:00
Zhang Xianyi 2a7503e563 Refs #225. Fixed a bug in GEMM OpenMP threading. 2013-07-15 09:56:19 +08:00
Zhang Xianyi d744c9590a In OpenMP threading, preallocate the thread buffer instead of allocating the buffer every time. This patch improved the performance slightly. 2013-03-01 14:36:47 +08:00
Zhang Xianyi 3cc6ae793e Refs #174. Return sb pointer when OpenMP or Windows. 2013-02-26 00:48:21 +08:00
Xianyi Zhang 4727fe8abf Refs #47. On Loongson 3A, set DGEMM_R parameter depending on different number of threads. It would improve double precision BLAS3 on multi-threads. 2011-09-05 15:13:52 +00:00
Xianyi Zhang 82f5274828 Refs #39. It's unnecessary to include sys/mman.h file in blas_server_omp.c. 2011-06-22 01:52:20 +08:00
Xianyi Zhang 989c6f8b06 Fixed #14 the SEGFAULT bug on 64 cores. On SMP server, the number of CPUs or cores should be less than or equal to 64. 2011-04-07 14:48:10 +08:00
Xianyi Zhang 342bbc3871 Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00