172 Commits

Author SHA1 Message Date
Deeksha Goplani
0dc80a5c8d locks improvement 2024-05-13 22:17:23 +05:30
Martin Kroeker
f0f1ff7820 fix HUGETLB allocation for TLS mode as well 2024-05-08 00:40:36 +02:00
Martin Kroeker
dc99b61380 sort unwanted interdependencies of alloc_shm and alloc_hugetlb 2024-05-04 14:49:00 +02:00
gxw
d8c4ea8793 loongarch: Optimizing the performance of the GEMM on servers 2024-04-09 09:03:34 -04:00
Martin Kroeker
d938aed7fe reset "mem structure overflowed" state on shutdown 2024-01-23 17:15:53 +01:00
Martin Kroeker
90f890ee67 fix improper function prototypes (empty parentheses) (USE_TLS branch) 2023-09-30 23:12:36 +02:00
Martin Kroeker
c6b1d8e7a3 fix improper function prototypes (empty parentheses) 2023-09-30 12:52:06 +02:00
Martin Kroeker
7e939fb831 Fix handling of additional buffer structures in case of overflow 2023-09-19 23:33:39 +02:00
Tiziano Müller
6a611db560 memory: show correct number of max threads 2023-09-10 08:44:07 +02:00
Martin Kroeker
3326b924b3 remove status variable blas_num_threads_set; initialize openmp thread maximum on startup 2023-07-26 00:31:24 +02:00
Martin Kroeker
e5538a62cb Add suggestions to NUM_THREADS/auxiliary buffer message 2023-05-04 22:56:39 +02:00
Martin Kroeker
e298d613fa initialize status variable for openblas_set_num_threads 2023-03-08 23:43:15 +01:00
Martin Kroeker
e38ab079a0 Fix OpenMP thread counting returning places rather than cores 2023-03-08 19:17:33 +01:00
Martin Kroeker
69148ae795 Guard against sysconf returning zero processors 2022-07-06 17:22:18 +02:00
Martin Kroeker
b329e45288 Guard against omp_get_num_places returning zero 2022-01-01 00:46:23 +01:00
Martin Kroeker
c8d05aa7a5 Move the threads overflow flag under the protection of the local blas lock (#3476)
* Move accesses to the overflow flag into the scope of the blas lock
2021-12-13 08:34:52 +01:00
Martin Kroeker
4f057bffd6 Fix NULL pointer checks in blas_memory_alloc 2021-11-05 10:43:17 +01:00
Martin Kroeker
efb16fafb0 Fix miscounting of threadpool size on Linux with OMP_PROC_BIND=TRUE (#3437)
* return OMP places (if available, or SC_NPROCESSORS_CONF) for maximum thread count when built with OpenMP
2021-11-04 12:11:16 +01:00
Martin Kroeker
dd09f0173e Remove extraneous qualifiers from struct definition 2021-09-14 21:52:26 +02:00
Martin Kroeker
cd10d1c03b Fix typo 2021-08-30 14:38:28 +02:00
Martin Kroeker
2db1a99aca Clean up debug messages 2021-08-30 14:21:25 +02:00
Martin Kroeker
89fc5b8f4f Fix unmap logic 2021-08-29 19:50:24 +02:00
Martin Kroeker
7fd12a5e69 Add likely() hints for gcc 2021-08-29 13:54:51 +02:00
Martin Kroeker
2ba9a567aa Fix typo 2021-08-28 17:14:59 +02:00
Martin Kroeker
b4b952eece Add auxiliary tracking space for thread buffer frees too 2021-08-28 17:03:53 +02:00
Martin Kroeker
7d1becc575 Allocate an auxiliary struct when running out of preconfigured threads 2021-08-28 14:18:36 +02:00
Martin Kroeker
898212efcd Actually add the message to the TLS section 2021-08-02 14:50:14 +02:00
Martin Kroeker
210a1584c5 Rebase source and edit TLS version of the message as well 2021-08-02 14:19:16 +02:00
Martin Kroeker
f2a7a67f5a Improve the "tried to allocate too many buffers" error message 2021-07-31 17:23:40 +02:00
Craig Watson
4d7dfe4845 Include Haiku in processor count checks 2021-07-27 09:00:30 +00:00
River Dillon
2f6326a630 Remove <linux/unistd.h> 2021-07-10 00:36:07 -07:00
Martin Kroeker
1a3ad4b670 Fix signatures of the TLS-mode dll_callback and p_process_term functions for Win64 2021-02-22 19:40:36 +01:00
Martin Kroeker
b0bded3f2f Fix get_num_procs() in the USE_TLS branch for non-glibc systems 2021-02-18 11:14:05 +01:00
Martin Kroeker
0cc36770f1 Merge pull request #3073 from xoviat/embedded
add embedded option
2021-01-31 18:02:41 +01:00
Alex Henrie
113840da12 Fix null pointer check in blas_memory_alloc 2021-01-24 22:20:44 -07:00
xoviat
2e8d6e8690 add functions for embedded 2021-01-23 22:12:17 -06:00
xoviat
b60de4447a add cortex-m platform 2021-01-19 08:57:44 -06:00
Martin Kroeker
fd7da56965 Move definitions that are neither needed nor supported on SUNOS 2020-10-25 12:01:50 +01:00
Martin Kroeker
ac653c94f3 Merge branch 'develop' into issue2588-cmake 2020-10-11 13:57:07 +02:00
Alexander Grund
3c05f54df8 Avoid out of bounds access on invalid memory free 2020-10-01 10:48:45 +02:00
Alexander Grund
dee7c49938 Fix TABs and trailing space 2020-10-01 10:43:16 +02:00
Martin Kroeker
357bff06b5 Add BUILD_vartype defines 2020-09-22 23:24:22 +02:00
Martin Kroeker
09eb9d2584 Update conditional for atomics to HAVE_C11 2020-07-18 17:07:38 +00:00
Martin Kroeker
f4248af26e Fix compiler warnings 2020-04-28 10:43:12 +02:00
Martin Kroeker
f41600e66f Add a read barrier in the traversing of the buffer list
Needed on systems with weak memory ordering - the inferior, partially working fix from #2544 was already removed in #2551
2020-04-13 12:34:02 +02:00
Martin Kroeker
2a28448a96 Add safeguards for sufficient BUFFER_SIZE 2020-04-12 19:45:36 +02:00
Martin Kroeker
69f277f8ee Add another memory barrier for ARM and a multicore test run on ThunderX to help detect such issues (#2544)
* Add another memory barrier in memory.c to prevent races in memory slot allocation

* Add an all-core test on Drone.io's ThunderX platform and modify dgemm_tester to use all 96 cores
2020-04-08 11:04:51 +02:00
Martin Kroeker
78100b8093 Free Windows thread memory with MEM_RELEASE rather than MEM_DECOMMIT
as suggested by hjmndv in #2370
2020-01-18 15:06:39 +01:00
Martin Kroeker
1b90989662 Add NetBSD to the xBSD conditionals 2019-10-25 12:52:49 +02:00
Martin Kroeker
1776ad82c0 Add files via upload 2019-08-09 00:08:11 +02:00