Martin Kroeker
8c10f0abba
Merge pull request #3794 from bartoldeman/benchmark-align-malloc
...
Benchmarks: align malloc'ed buffers.
2022-10-21 16:13:58 +02:00
Bart Oldeman
9e6b060bf3
Fix comment.
...
It stores the pointer, not an offset (that would be an alternative approach).
2022-10-20 20:11:09 -04:00
Bart Oldeman
9959a60873
Benchmarks: align malloc'ed buffers.
...
Benchmarks should allocate with cacheline (often 64 bytes) alignment
to avoid unreliable timings. This technique, storing the offset in the
byte before the pointer, avoids C11's aligned_alloc and so stays
compatible with older compilers.
For example, glibc's x86_64 malloc returns 16-byte aligned buffers, which is
not sufficient for AVX/AVX2 (32-byte preferred) or AVX512 (64-byte).
2022-10-20 13:28:20 -04:00
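A minimal sketch of the over-allocate-and-align approach described in 9959a60873, written from the commit description only and not copied from the benchmark sources. Following the later comment fix (9e6b060bf3), it stores the original malloc pointer just before the aligned address rather than an offset. The names aligned_malloc_sketch, aligned_free_sketch and CACHELINE are hypothetical, and the 64-byte value is the "often 64 bytes" figure from the message, assumed here as a constant.

```c
#include <stdint.h>
#include <stdlib.h>

#define CACHELINE 64 /* assumed cacheline size; must be a power of two */

/* Return a CACHELINE-aligned buffer of at least `size` bytes, or NULL. */
static void *aligned_malloc_sketch(size_t size) {
    /* Over-allocate: alignment slack plus room for one stored pointer. */
    void *raw = malloc(size + CACHELINE + sizeof(void *));
    if (raw == NULL) return NULL;

    /* Skip the pointer slot, then round up to the next CACHELINE boundary. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + CACHELINE - 1) & ~(uintptr_t)(CACHELINE - 1);

    /* Remember the pointer malloc returned, directly before the buffer. */
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Free a buffer obtained from aligned_malloc_sketch(). */
static void aligned_free_sketch(void *p) {
    if (p != NULL) free(((void **)p)[-1]);
}
```

Buffers obtained this way must be released with aligned_free_sketch(), never with free() directly, because the aligned address is generally not the one malloc returned.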
Martin Kroeker
ad424fce08
Merge pull request #3791 from martin-frbg/issue3790
...
Fix pkgconfig file generation for INTERFACE64 builds
2022-10-19 07:11:33 +02:00
Martin Kroeker
5f72415f10
Suffix the pkgconfig file itself in INTERFACE64 builds
2022-10-18 20:29:24 +02:00
Martin Kroeker
747ade5adf
fix INTERFACE64/USE64BITINT reporting
2022-10-18 17:28:07 +02:00
Martin Kroeker
8bacea1254
Pass libsuffix to openblas.pc and fix passing of INTERFACE64/USE64BITINT flag
2022-10-18 16:18:29 +02:00
Martin Kroeker
b2523471c9
Add libsuffix support
2022-10-18 16:16:26 +02:00
Martin Kroeker
11b2570c13
Merge pull request #3786 from martin-frbg/issue3784
...
Disable the gfortran tree vectorizer for lapack-netlib
2022-10-13 18:34:28 +02:00
Martin Kroeker
ab6009b0b6
Merge pull request #3773 from staticfloat/sf/openblas_default_num_threads
...
Add `OPENBLAS_DEFAULT_NUM_THREADS`
2022-10-13 14:15:14 +02:00
Martin Kroeker
32566bfb44
Disable the gfortran tree vectorizer for netlib LAPACK
2022-10-13 14:04:25 +02:00
Martin Kroeker
57809526c4
Disable the gfortran tree vectorizer for lapack-netlib
2022-10-13 09:12:23 +02:00
Martin Kroeker
eece0dfd14
Merge pull request #3781 from martin-frbg/issue3779
...
Fix building with only a subset of variable types on Windows
2022-10-01 19:26:09 +02:00
Martin Kroeker
db50ab4a72
Add BUILD_vartype defines
2022-10-01 15:14:51 +02:00
Martin Kroeker
a84a8a7096
Merge pull request #3778 from martin-frbg/issue3775
...
Fix misdetection of gfortran on Cray systems
2022-10-01 15:12:40 +02:00
Martin Kroeker
79d842047a
Move Cray case after GNU as Cray builds of gfortran have both names in the version string
2022-09-30 11:58:15 +02:00
Martin Kroeker
5e78493d95
Move Cray case after GNU as Cray builds of gfortran have both names in the version string
2022-09-30 11:55:56 +02:00
Elliot Saba
d2ce93179f
Add `OPENBLAS_DEFAULT_NUM_THREADS`
...
This allows Julia to set a default number of threads (usually `1`) to be
used when no other thread counts are specified [0], to short-circuit the
default OpenBLAS thread initialization routine that spins up a different
number of threads than Julia would otherwise choose.
The reason to add a new environment variable is that we want to be able
to configure OpenBLAS to avoid performing its initial memory
allocation/thread startup, as that can consume significant amounts of
memory, but we still want to be sensitive to legacy codebases that set
things like `OMP_NUM_THREADS` or `GOTOBLAS_NUM_THREADS`. Creating a new
environment variable that is openblas-specific and is not already
publicly used to control the overall number of threads of programs like
Julia seems to be the best way forward.
[0] https://github.com/JuliaLang/julia/pull/46844
2022-09-30 01:21:44 +00:00
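A rough sketch of the fallback behaviour described in d2ce93179f: explicitly set thread-count variables win, and `OPENBLAS_DEFAULT_NUM_THREADS` is consulted only when none of them yields a usable value. This is an illustration, not the OpenBLAS implementation. The helper names (parse_positive, choose_num_threads, num_threads_from_env), the exact list and order of variables checked, and the hardware-count fallback are all assumptions.

```c
#include <stdlib.h>

/* Parse a positive integer; return 0 if the string is unset or invalid. */
static int parse_positive(const char *s) {
    if (s == NULL) return 0;
    char *end;
    long v = strtol(s, &end, 10);
    if (end == s || v <= 0 || v > (1 << 20)) return 0;
    return (int)v;
}

/* Pick a thread count: the first valid explicit setting wins, then the
 * default setting, then the hardware thread count supplied by the caller. */
static int choose_num_threads(const char *const *explicit_vals, size_t n,
                              const char *default_val, int hw_threads) {
    for (size_t i = 0; i < n; i++) {
        int v = parse_positive(explicit_vals[i]);
        if (v > 0) return v;
    }
    int d = parse_positive(default_val);
    return d > 0 ? d : hw_threads;
}

/* Hypothetical wiring to the environment; the order here is an assumption. */
static int num_threads_from_env(int hw_threads) {
    const char *explicit_vals[] = {
        getenv("OPENBLAS_NUM_THREADS"),
        getenv("GOTO_NUM_THREADS"),
        getenv("OMP_NUM_THREADS"),
    };
    return choose_num_threads(explicit_vals, 3,
                              getenv("OPENBLAS_DEFAULT_NUM_THREADS"),
                              hw_threads);
}
```

Under these assumptions, a host program such as Julia could export `OPENBLAS_DEFAULT_NUM_THREADS=1` before loading the library, and a user's `OMP_NUM_THREADS=4` would still take precedence.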
Martin Kroeker
8e851160d7
Merge pull request #3772 from siko1056/develop
...
Support CONSISTENT_FPCSR on aarch64 systems
2022-09-29 20:22:50 +02:00
Martin Kroeker
cf132deb14
Merge pull request #3774 from sashashura/patch-1
...
GitHub Workflows security hardening
2022-09-29 18:49:50 +02:00
Martin Kroeker
6077d81161
Merge pull request #3777 from martin-frbg/fixmips64generic2
...
Fix MIPS64_GENERIC copyobj declarations for DYNAMIC_ARCH
2022-09-29 13:50:59 +02:00
Martin Kroeker
f6f35a4288
fix copyobj declarations to work with DYNAMIC_ARCH
2022-09-29 08:47:14 +02:00
Alex
c726604319
build: harden dynamic_arch.yml permissions
...
Signed-off-by: Alex <aleksandrosansan@gmail.com>
2022-09-26 13:48:11 +02:00
Alex
4de8e1b8f9
build: harden mips64.yml permissions
...
Signed-off-by: Alex <aleksandrosansan@gmail.com>
2022-09-26 13:47:15 +02:00
Alex
11cd108095
build: harden nightly-Homebrew-build.yml permissions
...
Signed-off-by: Alex <aleksandrosansan@gmail.com>
2022-09-26 13:46:34 +02:00
Kai T. Ohlhus
c2892f0e31
Makefile.rule: update CONSISTENT_FPCSR documentation
2022-09-22 00:25:13 +09:00
Kai T. Ohlhus
84453b924f
Support CONSISTENT_FPCSR on AARCH64
2022-09-22 00:20:40 +09:00
Martin Kroeker
667d0e0b48
Merge pull request #3771 from martin-frbg/fixmips64generic
...
Add KERNEL file for MIPS64_GENERIC as a copy of GENERIC
2022-09-19 18:58:14 +02:00
Martin Kroeker
b1d69fb3ac
Add MIPS64_GENERIC as a copy of GENERIC
2022-09-17 23:52:32 +02:00
Martin Kroeker
63d063cb6d
Merge pull request #3769 from XiWeiGu/mips64-test
...
[WIP,Testing]: Add test for mips64
2022-09-17 23:48:53 +02:00
gxw
edea1bcfaf
MIPS64: Fixed failed utest dsdot:dsdot_n_1 when TARGET=I6500
2022-09-17 16:43:22 +08:00
gxw
548a11b9d9
[WIP,Testing]: Add test for mips64
2022-09-16 09:23:01 +08:00
Martin Kroeker
47120f20ca
Merge pull request #3768 from martin-frbg/fixwarnings
...
Fix some warnings in x86_64 kernels
2022-09-15 13:26:21 +02:00
Martin Kroeker
101a2c77c3
Fix warnings
2022-09-15 09:19:19 +02:00
Martin Kroeker
7ee3cab4ff
Merge pull request #3767 from martin-frbg/decl_adaptive
...
Fix missing external declaration of openblas_omp_adaptive_env()
2022-09-15 07:20:07 +02:00
Martin Kroeker
9402df5604
Fix missing external declaration
2022-09-14 21:44:34 +02:00
Martin Kroeker
dd846e72ed
Merge pull request #3766 from martin-frbg/issue3640
...
Add (minimal) initial support for processing with the Emscripten JavaScript converter
2022-09-14 20:03:57 +02:00
Martin Kroeker
b285307e18
Add a kludge for the Emscripten js converter
2022-09-14 17:05:24 +02:00
Martin Kroeker
9773a9d6b3
undefine YIELDING for the Emscripten js converter
2022-09-14 17:04:11 +02:00
Martin Kroeker
dc856de3af
Merge pull request #3765 from martin-frbg/f2cpointer
...
Fix pointer/integer argument mismatch in the f2c-translated LAPACK
2022-09-14 15:51:49 +02:00
Martin Kroeker
91110f92d2
fix missing return type in function declaration
2022-09-14 14:03:31 +02:00
Martin Kroeker
515cf26929
Fix pointer/integer argument mismatch in calls to pow()
2022-09-14 11:48:36 +02:00
Martin Kroeker
8273ab6ee3
Merge pull request #3764 from martin-frbg/issue3757
...
Fix compilation of Haswell/Zen DYNAMIC_ARCH targets with Apple clang
2022-09-14 08:42:53 +02:00
Martin Kroeker
a0a4f7c447
Add -mfma to -mavx2 for clang, and add AVX2 declaration for Zen in DYNAMIC_ARCH builds
2022-09-13 22:47:00 +02:00
Martin Kroeker
23d59baaf1
Add -mfma to -mavx2 for Apple clang, and set AVX2 options for Zen as well
2022-09-13 22:39:27 +02:00
Martin Kroeker
85758aba67
Merge pull request #3763 from XiWeiGu/issue3761
...
MIPS64: Using the macro MTC rather than MTC1
2022-09-13 20:12:22 +02:00
gxw
365936ae1b
MIPS64: Using the macro MTC rather than MTC1
2022-09-13 16:39:40 +08:00
Martin Kroeker
fab84910ea
Merge pull request #3758 from martin-frbg/issue3755
...
Remove excessive quoting of arguments in Makefile.prebuild again
2022-09-07 14:57:22 +02:00
Martin Kroeker
389e378063
Remove excessive quoting of arguments from PR3722
2022-09-07 09:01:03 +02:00
Martin Kroeker
51efcfcd29
Merge pull request #3753 from martin-frbg/azure-macos
...
Move all Apple jobs on Azure to macos-11 following deprecation
2022-09-03 15:02:07 +02:00