OpenBLAS

Commit Graph

Author	SHA1	Message	Date
Martin Kroeker	c6b1d8e7a3	fix improper function prototypes (empty parentheses)	2023-09-30 12:52:06 +02:00
Martin Kroeker	b34f19a365	Ensure that a premature call to set_num_threads will not overwrite unrelated memory	2023-07-19 22:19:22 +02:00
Kai T. Ohlhus	84453b924f	Support CONSISTENT_FPCSR on AARCH64	2022-09-22 00:20:40 +09:00
Martin Kroeker	30473b6a9d	add openblas_getaffinity()	2022-07-27 19:15:18 +02:00
Martin Kroeker	07fe5b19a4	typecast function pointers	2021-12-21 12:31:54 +01:00
Peter Hawkins	dbbf92c1d1	Fix race in blas_thread_shutdown. blas_server_avail was read without holding server_lock. If multiple threads call blas_thread_shutdown simultaneously, for example, by calling fork(), then they can attempt to shut down multiple times. This can lead to a segmentation fault.	2021-02-18 13:46:50 -05:00
gxw	4b548857d6	Add msa support for loongson 1. Using core loongson3r3 and loongson3r4 for loongson 2. Add DYNAMIC_ARCH for loongson Change-Id: I1c6b54dbeca3a0cc31d1222af36a7e9bd6ab54c1	2020-12-09 10:28:46 +08:00
Martin Kroeker	85154c2e18	Change "HALF" and "sh" to "BFLOAT16" and "sb"	2020-10-12 00:05:05 +02:00
Martin Kroeker	357bff06b5	Add BUILD_vartype defines	2020-09-22 23:24:22 +02:00
Chen, Guobing	deaeb6c5b8	Add bfloat16 based dot and conversion with single/double 1. Added bfloat16 based dot as new API: shdot 2. Implemented generic kernel and cooperlake-specific (AVX512-BF16) kernel for shdot 3. Added 4 conversion APIs for bfloat16 data type <=> single/double: shstobf16 shdtobf16 sbf16tos dbf16tod shstobf16 -- convert single float array to bfloat16 array shdtobf16 -- convert double float array to bfloat16 array sbf16tos -- convert bfloat16 array to single float array dbf16tod -- convert bfloat16 array to double float array 4. Implemented generic kernels for all 4 conversion APIs, and cooperlake-specific kernel for shstobf16 and shdtobf16 5. Update level1 thread facilitate functions and macros to support multi-threading for these new APIs 6. Fix Cooperlake platform detection/specify issue when under dynamic-arch building 7. Change the typedef of bfloat16 from unsigned short to more strict uint16_t Signed-off-by: Chen, Guobing <guobing.chen@intel.com>	2020-09-04 02:31:25 +08:00
Martin Kroeker	94bab9d1f9	Update conditional for atomics to use HAVE_C11	2020-07-18 17:03:31 +00:00
Martin Kroeker	f4248af26e	Fix compiler warnings	2020-04-28 10:43:12 +02:00
Sharvil Nanavati	7b4773b24d	Add API to set thread affinity on Linux. Issue: #2545	2020-04-08 12:49:35 -07:00
Martin Kroeker	d68e4ba59b	Fix cut/paste glitch	2020-03-03 21:37:48 +01:00
Martin Kroeker	635c9e4e09	Restore initializers for mutex and conditional	2020-03-03 21:04:12 +01:00
Ali Saidi	43c2e845ab	Switch blas_server to use acq/rel semantics Heavy-weight locking isn't required to pass the work queue pointer between threads and simple atomic acquire/release semantics can be used instead. This is especially important as pthread_mutex_lock() isn't fair. We've observed substantial variation in runtime because of the unfairness of these locks which completely goes away with this implementation. The locks themselves are left to provide a portable way for idling threads to sleep/wakeup after many unsuccessful iterations waiting.	2020-03-02 02:52:49 +00:00
luz.paz	daf2fec12d	Misc. typo fixes Found via `codespell -q 3 -w -L ith,als,dum,nd,amin,nto,wis,ba -S ./relapack,./kernel,./lapack-netlib`	2019-04-29 17:03:56 -04:00
Erik M. Bray	38cf5d9364	ensure that threading has been initialized in the first place before calling openblas_set_num_threads	2018-10-28 21:16:52 +00:00
Martin Kroeker	28aa94bf4b	Include thread numbers in failure message from blas_thread_init to aid in debugging cases like #1767	2018-09-22 14:00:15 +02:00
Zoltán Mizsei	6463bffd59	Haiku supporting patches	2018-08-02 20:49:14 +02:00
Alex Arslan	a41d241a0e	Add support for DragonFly BSD	2018-04-03 16:39:29 -07:00
Alex Arslan	8da6b6ae52	Allow building on OpenBSD With this change, OpenBLAS builds and all tests pass on OpenBSD 6.2 using Clang. Tested on x86-64 only, with and without DYNAMIC_ARCH=1.	2018-04-02 10:48:22 -07:00
Martin Kroeker	f460776f0f	Fix thread data races	2017-09-09 19:07:06 +02:00
Martin Kroeker	87c7d10b34	Fix thread data races detected by helgrind 3.12 Ref. #995, may possibly help solve issues seen in 660,883	2017-01-08 23:33:51 +01:00
Alex Arslan	a16ace68f5	Include system headers on FreeBSD	2016-11-16 21:58:20 -08:00
Zhang Xianyi	05196a8497	Refs #716 . Only call getenv at init function.	2016-03-09 12:50:07 -05:00
Lauri Tirkkonen	e737e32fd1	RLIMIT_NPROC doesn't exist on illumos	2016-01-22 18:55:51 +02:00
j-bo	6040858b22	Fix #673 Add lacking headers declarations when compiling for Android ARM7	2015-10-27 13:55:24 +01:00
Zhang Xianyi	70642fe4ed	Refs #668 . Raise the signal when pthread_create fails. Thank James K. Lowden for the patch.	2015-10-26 19:02:51 -05:00
Grazvydas Ignotas	d3e2f0a1af	add missing barriers should fix issue #597	2015-08-16 15:37:02 +02:00
Zhang Xianyi	2fb02626da	Update organization info.	2014-11-25 15:28:58 +08:00
Zhang Xianyi	7a8949e0ce	Merge branch 'develop' of https://github.com/TimothyGu/OpenBLAS into TimothyGu-develop Conflicts: driver/others/memory.c	2014-06-28 20:51:31 +08:00
Timothy Gu	6c2ead30f0	Remove all trailing whitespace except lapack-netlib Signed-off-by: Timothy Gu <timothygu99@gmail.com>	2014-06-27 12:05:18 -07:00
Jameson Nash	f41f03ab83	fix #394 . this cleans up some handles after using them, and doesn't disable ALL process privileges upon success	2014-06-27 12:16:57 -04:00
Olivier Grisel	138a841390	FIX #294 : make OpenBLAS thread-pool resilient to fork via pthread_atfork	2014-02-19 19:01:15 +01:00
Olivier Grisel	046e4013cb	Revert "Refs #294 . Used pthread_atfork to avoid hang after a Unix fork." This reverts commit `3617c22a56`.	2014-02-19 18:32:54 +01:00
Zhang Xianyi	3617c22a56	Refs #294 . Used pthread_atfork to avoid hang after a Unix fork. The problem is the mutex we used in blas_server. Thus, we must clear the mutex before the fork and re-init them at parent and child process. If you used OpenMP, GOMP has the same problem by now. Please try other OpenMP implementation.	2014-02-18 15:36:04 +08:00
Zhang Xianyi	5155e3f509	Refs #174 . Fixed the overflowing buffer bug of multithreading hbmv and sbmv. Instead of using thread 0 buffer, each thread uses its own sb buffer. Thus, it can avoid overflowing thread 0 buffer.	2013-02-13 16:05:58 +08:00
Zhang Xianyi	538c764d2b	Refs #153 . Restore the original CPU affinity when calling openblas_set_num_threads(1). Please read the issue on github.com for the detail.	2012-11-06 18:21:46 +08:00
Zhang Xianyi	a55821a2ec	Refs #132 . Kill the threads when unload the library.	2012-08-11 21:33:15 +08:00
Xianyi Zhang	3c856c0c1a	Check the return value of pthread_create. Update the docs with known issue on Loongson 3A.	2011-09-06 18:27:33 +00:00
Xianyi Zhang	4727fe8abf	Refs #47 . On Loongson 3A, set DGEMM_R parameter depending on different number of threads. It would improve double precision BLAS3 on multi-threads.	2011-09-05 15:13:52 +00:00
Xianyi Zhang	128418f49b	Fixed #10 . Supported GOTO_NUM_THREADS & GOTO_THREADS_TIMEOUT environment variables.	2011-02-24 16:32:13 +08:00
Xianyi Zhang	e6c13e2b3c	changed library name to openblas and modified environment variable.	2011-01-24 17:58:05 +00:00
Xianyi Zhang	342bbc3871	Import GotoBLAS2 1.13 BSD version codes.	2011-01-24 14:54:24 +00:00

45 Commits