OpenBLAS

Commit Graph

Author	SHA1	Message	Date
Chen, Guobing	0c1c903f1e	Fix OMP num specify issue. In the current code, no matter what number of threads is specified, all available CPUs are used when invoking OMP, which leads to very bad performance if the workload is small while the number of available CPUs is large. A lot of time is wasted on inter-thread sync. Fix this issue by actually using the number specified by the variable 'num' from the calling API. Signed-off-by: Chen, Guobing <guobing.chen@intel.com>	2020-08-24 02:45:54 +08:00
Martin Kroeker	791e046744	Update conditional for atomics to use HAVE_C11	2020-07-18 17:05:59 +00:00
Martin Kroeker	47bf0dba8f	Add build-time option for OMP scheduler; document MULTITHREAD_THRESHOLD range (#1620) * Allow choosing the OpenMP scheduler and add range hint for GEMM_MULTITHREAD_THRESHOLD * Amended description of GEMM_MULTITHREAD_THRESHOLD to reflect #742 making it track floating point operations rather than matrix size	2018-06-15 11:25:05 +02:00
zhiyong.dang	53457f222f	move _Atomic define to common.h	2018-05-11 00:13:16 -07:00
Zhiyong Dang	3716267124	Change _STDC_VERSION__ to __STDC_VERSION__ Change-Id: Id3fa4e8d9eedd4ef7230df69b611e7f397301a42	2018-05-11 12:15:08 +08:00
Zhiyong Dang	1b83341d19	Fix race condition in blas_server_omp.c Change-Id: Ic896276cd073d6b41930c7c5a29d66348cd1725d	2018-04-27 17:00:42 +08:00
Timothy Gu	6c2ead30f0	Remove all trailing whitespace except lapack-netlib Signed-off-by: Timothy Gu <timothygu99@gmail.com>	2014-06-27 12:05:18 -07:00
Olivier Grisel	046e4013cb	Revert "Refs #294. Used pthread_atfork to avoid hang after a Unix fork." This reverts commit `3617c22a56`.	2014-02-19 18:32:54 +01:00
Zhang Xianyi	3617c22a56	Refs #294. Used pthread_atfork to avoid hang after a Unix fork. The problem is the mutex we used in blas_server. Thus, we must clear the mutex before the fork and re-init it in the parent and child processes. If you use OpenMP, GOMP currently has the same problem. Please try another OpenMP implementation.	2014-02-18 15:36:04 +08:00
Zhang Xianyi	2a7503e563	Refs #225. Fixed a bug in GEMM OpenMP threading.	2013-07-15 09:56:19 +08:00
Zhang Xianyi	d744c9590a	In OpenMP threading, preallocate the thread buffer instead of allocating the buffer every time. This patch improved the performance slightly.	2013-03-01 14:36:47 +08:00
Zhang Xianyi	3cc6ae793e	Refs #174. Return sb pointer when OpenMP or Windows.	2013-02-26 00:48:21 +08:00
Xianyi Zhang	4727fe8abf	Refs #47. On Loongson 3A, set the DGEMM_R parameter depending on the number of threads. This should improve double precision BLAS3 performance with multiple threads.	2011-09-05 15:13:52 +00:00
Xianyi Zhang	82f5274828	Refs #39. It's unnecessary to include the sys/mman.h file in blas_server_omp.c.	2011-06-22 01:52:20 +08:00
Xianyi Zhang	989c6f8b06	Fixed #14 the SEGFAULT bug on 64 cores. On SMP server, the number of CPUs or cores should be less than or equal to 64.	2011-04-07 14:48:10 +08:00
Xianyi Zhang	342bbc3871	Import GotoBLAS2 1.13 BSD version codes.	2011-01-24 14:54:24 +00:00

16 Commits