Martin Kroeker
bf1430f7d7
Merge pull request #2208 from martin-frbg/munmap-debug
...
Provide more information on mmap/munmap failure
2019-08-09 07:55:35 +02:00
Martin Kroeker
1776ad82c0
Add files via upload
2019-08-09 00:08:11 +02:00
Martin Kroeker
4e2f81cfa1
Provide more information on mmap/munmap failure
...
for #2207
2019-08-08 23:15:35 +02:00
Martin Kroeker
3d36c45116
Add CPUID identification of Intel Ice Lake
2019-08-01 22:52:35 +02:00
Martin Kroeker
21d05a4835
Merge pull request #2140 from martin-frbg/pgi19
...
Do not try ancient PGI hacks with recent versions of that compiler
2019-05-26 12:39:20 +02:00
Martin Kroeker
1778fd4219
Do not try ancient PGI hacks with recent versions of that compiler
...
should fix #2139
2019-05-22 13:48:27 +02:00
Martin Kroeker
86dda5c2fa
Add option USE_LOCKING for SMP-like locking in USE_THREAD=0 builds
2019-05-15 23:21:20 +02:00
Martin Kroeker
5cabda79d0
Merge pull request #2117 from martin-frbg/issue2114
...
Fix errors in cpu affinity setup with glibc 2.6
2019-05-07 18:18:16 +02:00
Martin Kroeker
a6a8cc2b7f
Fix errors in cpu enumeration with glibc 2.6
...
for #2114
2019-05-07 13:34:52 +02:00
Martin Kroeker
a387a23518
Merge pull request #2101 from luzpaz/misc-typos
...
Misc. typo fixes in comments and documentation
2019-05-04 22:28:29 +02:00
Martin Kroeker
b43c8382c8
Correct argument of CPU_ISSET for glibc <2.5
...
fixes #2104
2019-05-01 10:46:46 +02:00
luz.paz
daf2fec12d
Misc. typo fixes
...
Found via `codespell -q 3 -w -L ith,als,dum,nd,amin,nto,wis,ba -S ./relapack,./kernel,./lapack-netlib`
2019-04-29 17:03:56 -04:00
Jeff Baylor
40e53e52d6
snprintf define consolidated to common.h
2019-04-22 17:01:34 -07:00
Rashmica Gupta
bcdf1d4917
Add in runtime CPU detection for POWER.
2019-04-09 14:20:16 +10:00
Erik M. Bray
8ba9e2a61a
Also call CloseHandle on each thread, as well as on the event so as to not leak thread handles.
2019-03-19 11:21:44 +01:00
Erik M. Bray
4ad694eda1
Fix for #2063 : The DllMain used in Cygwin did not run the thread memory
...
pool cleanup upon THREAD_DETACH which is needed when compiled with
USE_TLS=1.
2019-03-19 09:26:50 +01:00
Martin Kroeker
3ce28fb81a
Merge pull request #2055 from martin-frbg/atomid
...
Add CPUID data for Intel Denverton (as Nehalem)
2019-03-12 22:57:07 +01:00
Martin Kroeker
04f2226ea6
Add Intel Denverton
2019-03-12 16:09:55 +01:00
Martin Kroeker
4741ce803b
Merge pull request #2045 from martin-frbg/2033-3
...
Do not compile in AVX512 check if AVX support is disabled
2019-03-06 22:40:26 +01:00
Martin Kroeker
11cfd0bd75
Do not compile in AVX512 check if AVX support is disabled
...
The xgetbv function depends on NO_AVX being undefined; we could change that too, but that combination is unlikely to work anyway
2019-03-05 16:04:25 +01:00
Martin Kroeker
d7b2c53c0b
Merge pull request #2039 from brada4/meminit
...
Address warning in memory.c
2019-03-05 12:11:15 +01:00
Martin Kroeker
6c83b878f6
Merge pull request #2040 from martin-frbg/locks2002
...
Restore locking optimizations for OpenMP case
2019-03-04 15:07:14 +01:00
Martin Kroeker
af480b02a4
Restore locking optimizations for OpenMP case
...
restore another accidentally dropped part of #1468 that was missed in #2004 to address performance regression reported in #1461
2019-03-03 14:17:07 +01:00
Andrew
e4a79be6bb
Address warning introduced with #1814 et al.
2019-03-03 09:05:11 +02:00
Martin Kroeker
03a2bf2602
Fix potential memory leak in cpu enumeration on Linux (#2008)
...
* Fix potential memory leak in cpu enumeration with glibc
An early return after a failed call to sched_getaffinity would leak the previously allocated cpu_set_t. A wrong calculation of the size argument in that call increased the likelihood of that failure. Fixes #2003
2019-02-10 23:24:45 +01:00
Martin Kroeker
69edc5bbe7
Restore dropped patches in the non-TLS branch of memory.c (#2004)
...
* Restore dropped patches in the non-TLS branch of memory.c
As discovered in #2002, the reintroduction of the "original" non-TLS version of memory.c as an alternate branch had inadvertently used ba1f91f rather than a8002e2, thereby dropping the commits for #1450, #1468, #1501, #1504 and #1520.
2019-02-07 20:06:13 +01:00
caiyu
29dc72889f
Add support for Hygon Dhyana
2019-01-16 14:25:19 +08:00
Martin Kroeker
dbc9a060ef
Fix missing braces in support_av() call
2019-01-14 22:41:31 +01:00
Martin Kroeker
21c0f2af7b
Merge pull request #1957 from martin-frbg/issue1954
...
Move TLS key deletion to openblas_quit
2019-01-10 12:04:08 +01:00
Martin Kroeker
ad2c386d6a
Move TLS key deletion to openblas_quit
...
fixes #1954 (as suggested by thrasibule in that issue)
2019-01-10 00:32:50 +01:00
Martin Kroeker
31ed19e8b9
Add message for SkylakeX and KNL fallbacks to Haswell
2019-01-05 19:41:13 +01:00
Martin Kroeker
e1574fa2b4
Add xcr0 (os support) check
2019-01-05 18:08:02 +01:00
Martin Kroeker
ae1d1f74f7
Query AVX2 and AVX512 capability for runtime cpu selection
2019-01-05 16:55:33 +01:00
Martin Kroeker
bba1e67269
Delete the pthread key on cleanup in TLS mode
...
to avoid a crash when OpenBLAS was loaded via dlopen and libc tries to clean up the leaked TLS after dlclose
Fixes #1720
2018-12-29 21:59:31 +01:00
Martin Kroeker
0bf6d74e5f
Fix typo in previous commit for arm dynamic arch
2018-12-07 19:37:33 +01:00
Martin Kroeker
2b355592e3
Make sure to use the arm version of dynamic.c in ARM64 DYNAMIC_ARCH
...
cf. #1908
2018-12-07 16:25:55 +01:00
Andrew
2601cd58ab
Remove surplus locking code, only enabled with x86, disabled or never enabled on all others
2018-11-30 11:38:19 +01:00
Martin Kroeker
97d7298973
call it OpenBLAS not just version
2018-11-29 11:52:08 +01:00
Martin Kroeker
de0d0ed52f
Improve formatting of config output
2018-11-29 11:28:19 +01:00
Martin Kroeker
816775e309
Add version information to openblas_get_config output
2018-11-29 00:06:44 +01:00
Martin Kroeker
f5595d0262
Merge pull request #1843 from martin-frbg/aix_numprocs
...
Add get_num_procs implementation for AIX
2018-10-31 21:25:15 +01:00
Martin Kroeker
326d394a0f
Add get_num_procs implementation for AIX
...
(and copy HAIKU implementation to the non-TLS version of the code as well)
2018-10-31 18:38:22 +01:00
Erik M. Bray
38cf5d9364
ensure that threading has been initialized in the first place before calling openblas_set_num_threads
2018-10-28 21:16:52 +00:00
Ashwin Sekhar T K
d5aeff636f
ARM64: Enable DYNAMIC_ARCH
...
Enable the DYNAMIC_ARCH feature on ARM64. This patch uses the cpuid
feature in the Linux kernel to detect the core type at runtime
(https://www.kernel.org/doc/Documentation/arm64/cpu-feature-registers.txt).
If this feature is missing in the kernel, then the user should use the
OPENBLAS_CORETYPE env variable to select the desired core type.
2018-10-22 01:49:35 -07:00
Ashwin Sekhar T K
d50abc8903
ARM64: Move parameters from parameter.c to param.h
...
Remove the runtime setting of P, Q, R parameters for
targets ARMV8, THUNDERX2T99. Instead set them as constants
in param.h at compile time.
2018-10-22 01:45:51 -07:00
Ashwin Sekhar T K
21f46a1cf2
ARM64: Use THUNDERX2T99 Neon Kernels for ARMV8
...
Currently the generic ARMV8 target uses C implementations
for many routines. Replace these with the neon implementations
written for THUNDERX2T99 target which are up to 6x faster for
certain routines.
2018-10-17 10:44:37 -07:00
Andrew
3439158dea
address #1782 2nd loop
2018-10-03 21:20:50 +02:00
Martin Kroeker
28aa94bf4b
Include thread numbers in failure message from blas_thread_init
...
to aid in debugging cases like #1767
2018-09-22 14:00:15 +02:00
Martin Kroeker
1ad1e79062
Catch inadvertent USE_TLS=0 declaration
...
for #1766
2018-09-19 18:03:43 +02:00
Martin Kroeker
b402626509
Do not use the new TLS code for non-threaded builds even if USE_TLS is set
...
Workaround for #1761 as that exposed a problem in the new code (which was intended to speed up multithreaded code only anyway).
2018-09-16 12:43:36 +02:00