Commit Graph

597 Commits

Author SHA1 Message Date
Martin Kroeker
4f057bffd6 Fix NULL pointer checks in blas_memory_alloc 2021-11-05 10:43:17 +01:00
Martin Kroeker
08f8bb66c0 Add CPUIDs for Alder Lake and other recent Intel cpus 2021-11-04 20:36:39 +01:00
Martin Kroeker
efb16fafb0 Fix miscounting of threadpool size on Linux with OMP_PROC_BIND=TRUE (#3437)
* return OMP places (if available, or SC_NPROCESSORS_CONF) for maximum thread count when built with OpenMP
2021-11-04 12:11:16 +01:00
Marius Hillenbrand
77747bc536 cpuid_zarch/hwcaps: add documentation and dump hwcaps in init
Add pointers to the definition of the hardware capability flags in glibc
and describe how they relate to the levels CPU_Z13 and CPU_Z14 for
optimized kernels.

To aid identifying available hardware capabilities and in debugging
potential build issues, dump their value in dynamic_arch_init() when
OPENBLAS_VERBOSE is set to 2 or higher.

Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2021-10-28 12:08:48 +02:00
Martin Kroeker
22a616bd8f Add model number for Tiger Lake H (mobile variant) 2021-10-27 22:17:58 +02:00
Marius Hillenbrand
44950ca173 s390x: use DYNAMIC_ARCH's cpu detection for compile-time choice
On s390x, the run-time detection for DYNAMIC_ARCH and the compile-time
choice in cpuid_zarch use different methods for identifying the
supported CPU features. To make cpuid_zarch future-proof and both code paths
easier to maintain, switch cpuid_zarch to the same mechanism as DYNAMIC_ZARCH
(i.e., derive the supported CPU features from hwcap flags) and share
code between both (in a new header cpuid_zarch.h).

Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2021-10-26 16:19:14 +02:00
Wangyang Guo
3dc6052c7e initial support for Sapphire Rapids platform 2021-10-12 01:30:40 -07:00
Rafael Cardoso Fernandes Sousa
0e8b4adf22 Remove unused commented code (#if directive) 2021-09-15 22:18:48 +00:00
Martin Kroeker
fa8bf57768 Merge pull request #3380 from martin-frbg/structwarn
Remove extraneous qualifiers from struct definition
2021-09-15 07:19:09 +02:00
Martin Kroeker
dd09f0173e Remove extraneous qualifiers from struct definition 2021-09-14 21:52:26 +02:00
Martin Kroeker
2f8220d757 Add sbgemm 2021-09-14 16:14:43 +02:00
Martin Kroeker
5f6a609253 Add sbgemv 2021-09-14 16:13:57 +02:00
Wangyang Guo
045ed5c91d sbgemm: fix build error when BFLOAT16 is disabled 2021-09-07 23:37:08 +08:00
Wangyang Guo
8356a604f0 sbgemm: cooperlake: tuning for block params 2021-09-07 21:30:46 +08:00
Martin Kroeker
cd10d1c03b Fix typo 2021-08-30 14:38:28 +02:00
Martin Kroeker
2db1a99aca Clean up debug messages 2021-08-30 14:21:25 +02:00
Martin Kroeker
89fc5b8f4f Fix unmap logic 2021-08-29 19:50:24 +02:00
Martin Kroeker
7fd12a5e69 Add likely() hints for gcc 2021-08-29 13:54:51 +02:00
Martin Kroeker
2ba9a567aa Fix typo 2021-08-28 17:14:59 +02:00
Martin Kroeker
b4b952eece Add auxiliary tracking space for thread buffer frees too 2021-08-28 17:03:53 +02:00
Martin Kroeker
7d1becc575 Allocate an auxiliary struct when running out of preconfigured threads 2021-08-28 14:18:36 +02:00
Martin Kroeker
898212efcd Actually add the message to the TLS section 2021-08-02 14:50:14 +02:00
Martin Kroeker
210a1584c5 Rebase source and edit TLS version of the message as well 2021-08-02 14:19:16 +02:00
Martin Kroeker
f2a7a67f5a Improve the "tried to allocate too many buffers" error message 2021-07-31 17:23:40 +02:00
Craig Watson
4d7dfe4845 Include Haiku in processor count checks 2021-07-27 09:00:30 +00:00
JonasZhou
0fca36c8c3 Add cpu detection support for Zhaoxin processors
Signed-off-by: JonasZhou <JonasZhou@zhaoxin.com>
2021-07-12 13:43:45 +08:00
River Dillon
2f6326a630 Remove <linux/unistd.h> 2021-07-10 00:36:07 -07:00
Martin Kroeker
8f22ac552b Add vendor string Shanghai as successor to Centaur 2021-07-08 18:28:49 +02:00
Martin Kroeker
eb2fdd3af0 Recognize newer Zhaoxin/Centaur processors as Nehalem 2021-07-08 12:23:15 +02:00
User User-User
750719528a bugz 2021-06-20 16:40:43 +02:00
User User-User
6423b282a1 dynamic_arch 2021-06-20 14:19:41 +02:00
Martin Kroeker
307c4c0786 Fix typo 2021-06-16 13:41:16 +02:00
Martin Kroeker
e83df93975 Work around another recent macro name collision with winnt.h 2021-06-16 12:32:34 +02:00
Martin Kroeker
cbfd3c87e1 Recognize Intel Ice Lake SP as Cooper Lake 2021-05-14 20:44:06 +02:00
Martin Kroeker
623d580b4c Restore __volatile__ keyword 2021-04-16 10:27:32 +02:00
Martin Kroeker
186368ddc3 Fix compilation with CLANG 2021-03-16 16:52:57 +01:00
Martin Kroeker
1a3ad4b670 Fix signatures of the TLS-mode dll_callback and p_process_term functions for Win64 2021-02-22 19:40:36 +01:00
Peter Hawkins
dbbf92c1d1 Fix race in blas_thread_shutdown.
blas_server_avail was read without holding server_lock. If multiple threads call blas_thread_shutdown simultaneously, for example, by calling fork(), then they can attempt to shut down multiple times. This can lead to a segmentation fault.
2021-02-18 13:46:50 -05:00
Martin Kroeker
cb429d6b12 Merge pull request #3110 from martin-frbg/issue3108
Fix get_num_procs() in the USE_TLS branch for non-glibc systems
2021-02-18 15:45:25 +01:00
Martin Kroeker
b0bded3f2f Fix get_num_procs() in the USE_TLS branch for non-glibc systems 2021-02-18 11:14:05 +01:00
Martin Kroeker
e4e5042e38 Recognize Intel Tiger Lake as SkylakeX 2021-02-11 20:17:11 +01:00
Martin Kroeker
0cc36770f1 Merge pull request #3073 from xoviat/embedded
add embedded option
2021-01-31 18:02:41 +01:00
Martin Kroeker
eea0c0f2ed Merge pull request #3085 from alexhenrie/memory_alloc
Fix null pointer check in blas_memory_alloc
2021-01-26 20:11:42 +01:00
Martin Kroeker
0cb9e9fc8d Remove the VORTEX support bits again for now 2021-01-25 19:02:21 +01:00
Alex Henrie
113840da12 Fix null pointer check in blas_memory_alloc 2021-01-24 22:20:44 -07:00
Martin Kroeker
deb2e66bcc Add DYNAMIC_LIST support for ARM64 2021-01-24 23:18:52 +01:00
xoviat
2e8d6e8690 add functions for embedded 2021-01-23 22:12:17 -06:00
Martin Kroeker
b94dab5250 patch to support power10 in builtin_cpu_is was backported to gcc 10.2, so allow that as well 2021-01-20 21:34:36 +01:00
Martin Kroeker
63fa3c3f8f Require gcc 11 for builtin_cpu_is(power10)
fixes #3074
2021-01-20 15:41:04 +01:00
xoviat
b60de4447a add cortex-m platform 2021-01-19 08:57:44 -06:00