Wangyang Guo
8356a604f0
sbgemm: cooperlake: tuning for block params
2021-09-07 21:30:46 +08:00
Martin Kroeker
cd10d1c03b
Fix typo
2021-08-30 14:38:28 +02:00
Martin Kroeker
2db1a99aca
Clean up debug messages
2021-08-30 14:21:25 +02:00
Martin Kroeker
89fc5b8f4f
Fix unmap logic
2021-08-29 19:50:24 +02:00
Martin Kroeker
7fd12a5e69
Add likely() hints for gcc
2021-08-29 13:54:51 +02:00
Martin Kroeker
2ba9a567aa
Fix typo
2021-08-28 17:14:59 +02:00
Martin Kroeker
b4b952eece
Add auxiliary tracking space for thread buffer frees too
2021-08-28 17:03:53 +02:00
Martin Kroeker
7d1becc575
Allocate an auxiliary struct when running out of preconfigured threads
2021-08-28 14:18:36 +02:00
Martin Kroeker
898212efcd
Actually add the message to the TLS section
2021-08-02 14:50:14 +02:00
Martin Kroeker
210a1584c5
Rebase source and edit TLS version of the message as well
2021-08-02 14:19:16 +02:00
Martin Kroeker
f2a7a67f5a
Improve the "tried to allocate too many buffers" error message
2021-07-31 17:23:40 +02:00
Craig Watson
4d7dfe4845
Include Haiku in processor count checks
2021-07-27 09:00:30 +00:00
JonasZhou
0fca36c8c3
Add cpu detection support for Zhaoxin processors
Signed-off-by: JonasZhou <JonasZhou@zhaoxin.com>
2021-07-12 13:43:45 +08:00
River Dillon
2f6326a630
Remove <linux/unistd.h>
2021-07-10 00:36:07 -07:00
Martin Kroeker
8f22ac552b
Add vendor string Shanghai as successor to Centaur
2021-07-08 18:28:49 +02:00
Martin Kroeker
eb2fdd3af0
Recognize newer Zhaoxin/Centaur processors as Nehalem
2021-07-08 12:23:15 +02:00
User User-User
750719528a
bugz
2021-06-20 16:40:43 +02:00
User User-User
6423b282a1
dynamic_arch
2021-06-20 14:19:41 +02:00
Martin Kroeker
307c4c0786
Fix typo
2021-06-16 13:41:16 +02:00
Martin Kroeker
e83df93975
Work around another recent macro name collision with winnt.h
2021-06-16 12:32:34 +02:00
Martin Kroeker
cbfd3c87e1
Recognize Intel Ice Lake SP as Cooper Lake
2021-05-14 20:44:06 +02:00
Martin Kroeker
623d580b4c
Restore __volatile__ keyword
2021-04-16 10:27:32 +02:00
Martin Kroeker
186368ddc3
Fix compilation with CLANG
2021-03-16 16:52:57 +01:00
Martin Kroeker
1a3ad4b670
Fix signatures of the TLS-mode dll_callback and p_process_term functions for Win64
2021-02-22 19:40:36 +01:00
Peter Hawkins
dbbf92c1d1
Fix race in blas_thread_shutdown.
blas_server_avail was read without holding server_lock. If multiple threads call blas_thread_shutdown simultaneously, for example, by calling fork(), then they can attempt to shut down multiple times. This can lead to a segmentation fault.
2021-02-18 13:46:50 -05:00
Martin Kroeker
cb429d6b12
Merge pull request #3110 from martin-frbg/issue3108
Fix get_num_procs() in the USE_TLS branch for non-glibc systems
2021-02-18 15:45:25 +01:00
Martin Kroeker
b0bded3f2f
Fix get_num_procs() in the USE_TLS branch for non-glibc systems
2021-02-18 11:14:05 +01:00
Martin Kroeker
e4e5042e38
Recognize Intel Tiger Lake as SkylakeX
2021-02-11 20:17:11 +01:00
Martin Kroeker
0cc36770f1
Merge pull request #3073 from xoviat/embedded
add embedded option
2021-01-31 18:02:41 +01:00
Martin Kroeker
eea0c0f2ed
Merge pull request #3085 from alexhenrie/memory_alloc
Fix null pointer check in blas_memory_alloc
2021-01-26 20:11:42 +01:00
Martin Kroeker
0cb9e9fc8d
Remove the VORTEX support bits again for now
2021-01-25 19:02:21 +01:00
Alex Henrie
113840da12
Fix null pointer check in blas_memory_alloc
2021-01-24 22:20:44 -07:00
Martin Kroeker
deb2e66bcc
Add DYNAMIC_LIST support for ARM64
2021-01-24 23:18:52 +01:00
xoviat
2e8d6e8690
add functions for embedded
2021-01-23 22:12:17 -06:00
Martin Kroeker
b94dab5250
patch to support power10 in builtin_cpu_is was backported to gcc 10.2, so allow that as well
2021-01-20 21:34:36 +01:00
Martin Kroeker
63fa3c3f8f
Require gcc 11 for builtin_cpu_is(power10)
fixes #3074
2021-01-20 15:41:04 +01:00
xoviat
b60de4447a
add cortex-m platform
2021-01-19 08:57:44 -06:00
Martin Kroeker
2c445be8ba
Merge pull request #3051 from martin-frbg/rocketlake
Add CPUID information for Intel Rocket Lake
2021-01-14 15:56:25 +01:00
Martin Kroeker
6fe0f1fab9
Label get_cpu_ftr as volatile to keep gcc from rearranging the code
2021-01-11 19:05:29 +01:00
Martin Kroeker
17c16f2a71
Implement builtin_cpu_is and limit cpu choices to P8 and P9 for NVIDIA compilers
2020-12-19 23:21:22 +01:00
Martin Kroeker
865676682d
Add Intel Rocket Lake
2020-12-14 22:40:23 +01:00
Martin Kroeker
6232237dba
Make fallback from P10 to P9 conditional on suitable compiler
2020-12-11 23:41:17 +01:00
Martin Kroeker
18d8a67485
Merge pull request #2994 from antonblanchard/power10-fixes
Power10 fixes
2020-12-11 23:37:30 +01:00
Martin Kroeker
83de62c20d
Merge pull request #3026 from martin-frbg/revert747
Revert PR747 - SYRK parameter changes for Haswell and related targets
2020-12-10 16:29:41 +01:00
gxw
4b548857d6
Add msa support for loongson
1. Using core loongson3r3 and loongson3r4 for loongson
2. Add DYNAMIC_ARCH for loongson
Change-Id: I1c6b54dbeca3a0cc31d1222af36a7e9bd6ab54c1
2020-12-09 10:28:46 +08:00
Martin Kroeker
a554712439
remove extra/intermediate size step for min_jj introduced in PR747
2020-12-08 21:01:36 +01:00
Martin Kroeker
5d26223f4a
remove extra/intermediate size step of min_jj from PR747
2020-12-08 20:59:56 +01:00
Martin Kroeker
bc5b1ddf0d
Merge pull request #3004 from martin-frbg/bsd_getauxval
ARM64 DYNAMIC_ARCH build fix for BSD/OSX
2020-11-23 08:35:12 +01:00
Martin Kroeker
e7bf8ced6c
Build fix for systems that do not support getauxval
2020-11-22 20:20:28 +01:00
Martin Kroeker
5fa305172a
Use ifeq instead of ifdef for user-definable options
2020-11-22 16:29:56 +01:00