shivammonaka
7102367fde
Introduced callback to Pthread, Win32 and OpenMP backend
2024-04-02 08:53:49 +05:30
Martin Kroeker
8a9d492af7
Add default for blas_omp_threads_local
2024-01-14 19:58:49 +01:00
Martin Kroeker
c6b1d8e7a3
fix improper function prototypes (empty parentheses)
2023-09-30 12:52:06 +02:00
Martin Kroeker
b34f19a365
Ensure that a premature call to set_num_threads will not overwrite unrelated memory
2023-07-19 22:19:22 +02:00
Kai T. Ohlhus
84453b924f
Support CONSISTENT_FPCSR on AARCH64
2022-09-22 00:20:40 +09:00
Martin Kroeker
30473b6a9d
add openblas_getaffinity()
2022-07-27 19:15:18 +02:00
Martin Kroeker
07fe5b19a4
typecast function pointers
2021-12-21 12:31:54 +01:00
Peter Hawkins
dbbf92c1d1
Fix race in blas_thread_shutdown.
...
blas_server_avail was read without holding server_lock. If multiple threads call blas_thread_shutdown simultaneously, for example, by calling fork(), then they can attempt to shut down multiple times. This can lead to a segmentation fault.
2021-02-18 13:46:50 -05:00
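The fix described in the entry above amounts to reading the availability flag only while the lock is held. A minimal sketch of that pattern follows; `server_avail`, `teardown_count` and `shutdown_once` are hypothetical stand-ins for illustration, not the actual OpenBLAS code.

```c
#include <pthread.h>

/* Hypothetical stand-ins for the state named in the commit message. */
static pthread_mutex_t server_lock = PTHREAD_MUTEX_INITIALIZER;
static int server_avail = 1;    /* 1 while the worker pool is running */
static int teardown_count = 0;  /* how many times teardown really ran */

/* Returns 1 if this call performed the teardown, 0 if it was already done. */
int shutdown_once(void) {
    int did_work = 0;
    pthread_mutex_lock(&server_lock);
    if (server_avail) {         /* the flag is read only under the lock */
        teardown_count++;       /* placeholder for stopping worker threads */
        server_avail = 0;
        did_work = 1;
    }
    pthread_mutex_unlock(&server_lock);
    return did_work;
}
```

Because the check and the state change happen in the same critical section, a second concurrent caller sees `server_avail == 0` and returns without repeating the teardown.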
gxw
4b548857d6
Add msa support for loongson
...
1. Using core loongson3r3 and loongson3r4 for loongson
2. Add DYNAMIC_ARCH for loongson
Change-Id: I1c6b54dbeca3a0cc31d1222af36a7e9bd6ab54c1
2020-12-09 10:28:46 +08:00
Martin Kroeker
85154c2e18
Change "HALF" and "sh" to "BFLOAT16" and "sb"
2020-10-12 00:05:05 +02:00
Martin Kroeker
357bff06b5
Add BUILD_vartype defines
2020-09-22 23:24:22 +02:00
Chen, Guobing
deaeb6c5b8
Add bfloat16 based dot and conversion with single/double
...
1. Added bfloat16 based dot as new API: shdot
2. Implemented generic kernel and cooperlake-specific (AVX512-BF16) kernel for shdot
3. Added 4 conversion APIs for bfloat16 data type <=> single/double: shstobf16 shdtobf16 sbf16tos dbf16tod
shstobf16 -- convert single float array to bfloat16 array
shdtobf16 -- convert double float array to bfloat16 array
sbf16tos -- convert bfloat16 array to single float array
dbf16tod -- convert bfloat16 array to double float array
4. Implemented generic kernels for all 4 conversion APIs, and cooperlake-specific kernel for shstobf16 and shdtobf16
5. Update level1 thread facilitate functions and macros to support multi-threading for these new APIs
6. Fix Cooperlake platform detection/specify issue when under dynamic-arch building
7. Change the typedef of bfloat16 from unsigned short to more strict uint16_t
Signed-off-by: Chen, Guobing <guobing.chen@intel.com>
2020-09-04 02:31:25 +08:00
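To illustrate the kind of conversion listed in the entry above, here is a minimal sketch of single float <=> bfloat16 conversion with bfloat16 stored as `uint16_t` (as in item 7). The function names and the round-to-nearest-even policy are assumptions made for this example; the commit message does not state which rounding the OpenBLAS kernels use.

```c
#include <stdint.h>
#include <string.h>

typedef uint16_t bf16_t;  /* hypothetical type name for this sketch */

/* Convert one float to bfloat16, rounding to nearest even (assumed policy). */
bf16_t float_to_bf16(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);            /* reinterpret without UB */
    if ((bits & 0x7fffffffu) > 0x7f800000u)    /* NaN: keep it a quiet NaN */
        return (bf16_t)((bits >> 16) | 0x0040u);
    bits += 0x7fffu + ((bits >> 16) & 1u);     /* round to nearest even */
    return (bf16_t)(bits >> 16);               /* keep the upper 16 bits */
}

/* Convert one bfloat16 back to float; this direction is exact. */
float bf16_to_float(bf16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

An array version would simply loop over the input and apply the per-element function, which is also what makes the operation easy to split across threads.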
Martin Kroeker
94bab9d1f9
Update conditional for atomics to use HAVE_C11
2020-07-18 17:03:31 +00:00
Martin Kroeker
f4248af26e
Fix compiler warnings
2020-04-28 10:43:12 +02:00
Sharvil Nanavati
7b4773b24d
Add API to set thread affinity on Linux.
...
Issue: #2545
2020-04-08 12:49:35 -07:00
Martin Kroeker
d68e4ba59b
Fix cut/paste glitch
2020-03-03 21:37:48 +01:00
Martin Kroeker
635c9e4e09
Restore initializers for mutex and conditional
2020-03-03 21:04:12 +01:00
Ali Saidi
43c2e845ab
Switch blas_server to use acq/rel semantics
...
Heavy-weight locking isn't required to pass the work queue
pointer between threads and simple atomic acquire/release
semantics can be used instead. This is especially important as
pthread_mutex_lock() isn't fair.
We've observed substantial variation in runtime because of the
unfairness of these locks, which completely goes away with
this implementation.
The locks themselves are left to provide a portable way for
idling threads to sleep/wakeup after many unsuccessful iterations
waiting.
2020-03-02 02:52:49 +00:00
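The hand-off described in the entry above can be sketched with C11 atomics: the producer publishes a work pointer with a release store and the consumer picks it up with an acquire load. The names `work_item`, `slot`, `publish` and `try_take` are hypothetical, and the sketch assumes a single consumer per slot; it is not the code from the commit.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct { int job_id; } work_item;  /* hypothetical queue entry */

static _Atomic(work_item *) slot = NULL;   /* one slot per worker (assumed) */

/* Producer: fill in the item first, then publish the pointer with release
   semantics so the consumer is guaranteed to see the completed fields. */
void publish(work_item *w, int id) {
    w->job_id = id;
    atomic_store_explicit(&slot, w, memory_order_release);
}

/* Consumer: poll a bounded number of times with acquire loads.
   Returns NULL if nothing arrived within max_spins iterations. */
work_item *try_take(int max_spins) {
    for (int i = 0; i < max_spins; i++) {
        work_item *w = atomic_load_explicit(&slot, memory_order_acquire);
        if (w != NULL) {
            /* Single consumer assumed, so a plain reset is enough here. */
            atomic_store_explicit(&slot, NULL, memory_order_relaxed);
            return w;
        }
    }
    return NULL;  /* a real worker would now sleep on a mutex/condvar */
}
```

The `NULL` return corresponds to the case the commit message mentions, where a thread gives up spinning and falls back to the retained locks to sleep until woken.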
luz.paz
daf2fec12d
Misc. typo fixes
...
Found via `codespell -q 3 -w -L ith,als,dum,nd,amin,nto,wis,ba -S ./relapack,./kernel,./lapack-netlib`
2019-04-29 17:03:56 -04:00
Erik M. Bray
38cf5d9364
ensure that threading has been initialized in the first place before calling openblas_set_num_threads
2018-10-28 21:16:52 +00:00
Martin Kroeker
28aa94bf4b
Include thread numbers in failure message from blas_thread_init
...
to aid in debugging cases like #1767
2018-09-22 14:00:15 +02:00
Zoltán Mizsei
6463bffd59
Haiku supporting patches
2018-08-02 20:49:14 +02:00
Alex Arslan
a41d241a0e
Add support for DragonFly BSD
2018-04-03 16:39:29 -07:00
Alex Arslan
8da6b6ae52
Allow building on OpenBSD
...
With this change, OpenBLAS builds and all tests pass on OpenBSD 6.2
using Clang. Tested on x86-64 only, with and without DYNAMIC_ARCH=1.
2018-04-02 10:48:22 -07:00
Martin Kroeker
f460776f0f
Fix thread data races
2017-09-09 19:07:06 +02:00
Martin Kroeker
87c7d10b34
Fix thread data races detected by helgrind 3.12
...
Ref. #995 , may possibly help solve issues seen in 660,883
2017-01-08 23:33:51 +01:00
Alex Arslan
a16ace68f5
Include system headers on FreeBSD
2016-11-16 21:58:20 -08:00
Zhang Xianyi
05196a8497
Refs #716 . Only call getenv at init function.
2016-03-09 12:50:07 -05:00
Lauri Tirkkonen
e737e32fd1
RLIMIT_NPROC doesn't exist on illumos
2016-01-22 18:55:51 +02:00
j-bo
6040858b22
Fix #673
...
Add missing header declarations when compiling for Android ARM7
2015-10-27 13:55:24 +01:00
Zhang Xianyi
70642fe4ed
Refs #668 . Raise the signal when pthread_create fails.
...
Thanks to James K. Lowden for the patch.
2015-10-26 19:02:51 -05:00
Grazvydas Ignotas
d3e2f0a1af
add missing barriers
...
should fix issue #597
2015-08-16 15:37:02 +02:00
Zhang Xianyi
2fb02626da
Update organization info.
2014-11-25 15:28:58 +08:00
Zhang Xianyi
7a8949e0ce
Merge branch 'develop' of https://github.com/TimothyGu/OpenBLAS into TimothyGu-develop
...
Conflicts:
driver/others/memory.c
2014-06-28 20:51:31 +08:00
Timothy Gu
6c2ead30f0
Remove all trailing whitespace except lapack-netlib
...
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
Jameson Nash
f41f03ab83
fix #394 . this cleans up some handles after using them, and doesn't disable ALL process privileges upon success
2014-06-27 12:16:57 -04:00
Olivier Grisel
138a841390
FIX #294 : make OpenBLAS thread-pool resilient to fork via pthread_atfork
2014-02-19 19:01:15 +01:00
Olivier Grisel
046e4013cb
Revert "Refs #294 . Used pthread_atfork to avoid hang after a Unix fork."
...
This reverts commit 3617c22a56.
2014-02-19 18:32:54 +01:00
Zhang Xianyi
3617c22a56
Refs #294 . Used pthread_atfork to avoid hang after a Unix fork.
...
The problem is the mutex we used in blas_server. Thus, we must clear
the mutex before the fork and re-init it in the parent and child processes.
If you use OpenMP, GOMP currently has the same problem. Please try another OpenMP
implementation.
2014-02-18 15:36:04 +08:00
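The general idea in the entry above, putting a mutex into a known state around `fork()`, is commonly expressed with `pthread_atfork`. The sketch below shows that generic pattern under assumed names (`pool_lock`, `install_fork_handlers`); it is not the code from this commit, which was later reverted and replaced by a different fix.

```c
#include <pthread.h>

static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;  /* hypothetical */

/* Runs in the forking thread just before fork(): take the lock so that no
   other thread holds it at the moment the process is duplicated. */
static void before_fork(void)       { pthread_mutex_lock(&pool_lock); }

/* Runs in the parent after fork(): release the lock again. */
static void after_fork_parent(void) { pthread_mutex_unlock(&pool_lock); }

/* Runs in the child after fork(): start over with a fresh mutex. */
static void after_fork_child(void)  { pthread_mutex_init(&pool_lock, NULL); }

/* Register the three handlers; returns 0 on success. */
int install_fork_handlers(void) {
    return pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}
```

Note that only the thread calling `fork()` exists in the child, so any worker threads would still have to be recreated there; the handlers only make the lock usable again.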
Zhang Xianyi
5155e3f509
Refs #174 . Fixed the overflowing buffer bug of multithreading hbmv and sbmv.
...
Instead of using thread 0 buffer, each thread uses its own sb buffer.
Thus, it can avoid overflowing thread 0 buffer.
2013-02-13 16:05:58 +08:00
Zhang Xianyi
538c764d2b
Refs #153 . Restore the original CPU affinity when calling openblas_set_num_threads(1).
...
Please read the issue on github.com for the detail.
2012-11-06 18:21:46 +08:00
Zhang Xianyi
a55821a2ec
Refs #132 . Kill the threads when unloading the library.
2012-08-11 21:33:15 +08:00
Xianyi Zhang
3c856c0c1a
Check the return value of pthread_create. Update the docs with known issue on Loongson 3A.
2011-09-06 18:27:33 +00:00
Xianyi Zhang
4727fe8abf
Refs #47 . On Loongson 3A, set DGEMM_R parameter depending on different number of threads. It would improve double precision BLAS3 on multi-threads.
2011-09-05 15:13:52 +00:00
Xianyi Zhang
128418f49b
Fixed #10 . Supported GOTO_NUM_THREADS & GOTO_THREADS_TIMEOUT environment variables.
2011-02-24 16:32:13 +08:00
Xianyi Zhang
e6c13e2b3c
changed library name to openblas and modified environment variable.
2011-01-24 17:58:05 +00:00
Xianyi Zhang
342bbc3871
Import GotoBLAS2 1.13 BSD version codes.
2011-01-24 14:54:24 +00:00