Commit Graph

525 Commits

Author SHA1 Message Date
Chip-Kerchner 2d0b233425 Fix missing parens. 2023-10-03 10:26:14 -05:00
Chip-Kerchner a8c90eb3ed Added cpu_is 2023-10-03 10:24:04 -05:00
Chip-Kerchner b677d0d5fd Adding missing endif 2023-10-02 13:09:12 -05:00
Chip-Kerchner e5dc376912 Remove duplicate defines. 2023-10-02 12:48:47 -05:00
Chip-Kerchner 10210748de Revert PGI changes. 2023-10-02 12:44:07 -05:00
Chip-Kerchner a922a07e61 Cleanup white spaces. 2023-10-02 12:24:30 -05:00
Chip-Kerchner 12130ee961 Remove tab. 2023-10-02 12:19:22 -05:00
Chip-Kerchner eb738d9929 Minor changes. 2023-10-02 12:14:46 -05:00
Chip-Kerchner 48da98b2a7 Merge remote-tracking branch 'origin/develop' into XLC-AIX 2023-10-02 12:01:33 -05:00
Chip-Kerchner 3b1150fcee Fix CPU identification to work on AIX. 2023-10-02 12:00:48 -05:00
Martin Kroeker 90f890ee67 fix improper function prototypes (empty parentheses) (USE_TLS branch) 2023-09-30 23:12:36 +02:00
Martin Kroeker cf2174fb69 fix improper function prototypes (empty parentheses) 2023-09-30 17:04:39 +02:00
Martin Kroeker c6b1d8e7a3 fix improper function prototypes (empty parentheses) 2023-09-30 12:52:06 +02:00
Martin Kroeker 7e939fb831 Fix handling of additional buffer structures in case of overflow 2023-09-19 23:33:39 +02:00
Tiziano Müller 6a611db560 memory: show correct number of max threads 2023-09-10 08:44:07 +02:00
Martin Kroeker c2f4bdbbb4 Merge pull request #4163 from martin-frbg/issue4017
Rework OpenMP thread count limit handling
2023-07-31 17:58:51 +02:00
Martin Kroeker 9ff84dc3f2 remove unused status variable 2023-07-26 10:02:44 +02:00
Martin Kroeker 3326b924b3 remove status variable blas_num_threads_set; initialize openmp thread maximum on startup 2023-07-26 00:31:24 +02:00
Chris Sidebottom f971ef55f2 Add ARMV8SVE to AArch64 Dynamic Dispatch
In order to enable support for future cores which have similar tunings
(in this case I'm doing this for the Arm(R) Neoverse(TM) V2 core), this
generically detects SVE support and enables it. This should better manage
the size and complexity of dynamic dispatch rather than just copy-pasting
the same parameters.

To make `ARMV8SVE` more representative of the common 128-bit SVE case,
I've split it and similar parameters from A64FX which has the wider
512-bit SVE.
2023-07-25 18:35:15 +01:00
Martin Kroeker 3bdcf3259d Merge branch 'xianyi:develop' into issue4101 2023-07-20 08:23:20 +02:00
Martin Kroeker b34f19a365 Ensure that a premature call to set_num_threads will not overwrite unrelated memory 2023-07-19 22:19:22 +02:00
Martin Kroeker 66904f8148 Ensure that a premature call will not overwrite unrelated memory 2023-07-19 22:14:34 +02:00
Martin Kroeker 5c58994eb2 Add fallback warning 2023-07-19 18:27:41 +02:00
Martin Kroeker ca7199f249 Treat newer Neoverse as N1 if SVE unavailable (may be disabled in container/cloud env) 2023-07-19 14:48:42 +02:00
Martin Kroeker 616fdea82a Revert "Improve Windows threading performance scaling" 2023-06-28 09:45:17 +02:00
Mark Seminatore d6991dd230 fix missing #endif 2023-06-24 15:43:32 -07:00
Mark Seminatore 7783a9af02 attempt to fix old mingw gcc issue 2023-06-24 14:35:11 -07:00
Mark Seminatore 8caabc5982 fix #4063 remove unused pool_lock 2023-06-23 19:45:16 -07:00
Mark Seminatore d301649430 fix #4063 threading perf issues on Windows 2023-06-23 19:42:27 -07:00
Honglin Zhu 9e80a194d6 Fix dynamic_list build and gcc version check error 2023-05-21 19:52:58 +08:00
Honglin Zhu 0b83088887 spr dynamic arch support 2023-05-19 10:48:18 +08:00
Martin Kroeker e5538a62cb Add suggestions to NUM_THREADS/auxiliary buffer message 2023-05-04 22:56:39 +02:00
Martin Kroeker 579bc86671 remove call to omp_set_num_threads 2023-03-21 20:58:56 +01:00
Martin Kroeker e298d613fa initialize status variable for openblas_set_num_threads 2023-03-08 23:43:15 +01:00
Martin Kroeker 05aa88268f add status variable for openblas_set_num_threads 2023-03-08 23:41:57 +01:00
Martin Kroeker e38ab079a0 Fix OpenMP thread counting returning places rather than cores 2023-03-08 19:17:33 +01:00
Martin Kroeker d4868babbc Fix typos 2022-12-29 23:07:55 +01:00
Martin Kroeker 18c99d3e63 Update dynamic_arm64.c 2022-12-25 13:31:38 +01:00
Martin Kroeker 186a310f92 Update dynamic_arm64.c 2022-12-25 12:22:48 +01:00
Martin Kroeker da6e426b13 fix Cooperlake not selectable via environment variable 2022-11-03 18:13:35 +01:00
Martin Kroeker ab6009b0b6 Merge pull request #3773 from staticfloat/sf/openblas_default_num_threads
Add `OPENBLAS_DEFAULT_NUM_THREADS`
2022-10-13 14:15:14 +02:00
Martin Kroeker db50ab4a72 Add BUILD_vartype defines 2022-10-01 15:14:51 +02:00
Elliot Saba d2ce93179f Add `OPENBLAS_DEFAULT_NUM_THREADS`
This allows Julia to set a default number of threads (usually `1`) to be
used when no other thread counts are specified [0], to short-circuit the
default OpenBLAS thread initialization routine that spins up a different
number of threads than Julia would otherwise choose.

The reason to add a new environment variable is that we want to be able
to configure OpenBLAS to avoid performing its initial memory
allocation/thread startup, as that can consume significant amounts of
memory, but we still want to be sensitive to legacy codebases that set
things like `OMP_NUM_THREADS` or `GOTOBLAS_NUM_THREADS`.  Creating a new
environment variable that is openblas-specific and is not already
publicly used to control the overall number of threads of programs like
Julia seems to be the best way forward.

[0] https://github.com/JuliaLang/julia/pull/46844
2022-09-30 01:21:44 +00:00
Kai T. Ohlhus 84453b924f Support CONSISTENT_FPCSR on AARCH64 2022-09-22 00:20:40 +09:00
Martin Kroeker 9402df5604 Fix missing external declaration 2022-09-14 21:44:34 +02:00
Martin Kroeker bd30120ba7 Merge pull request #3720 from FlyGoat/mips64
Make it work on general MIPS64 processors
2022-08-19 20:24:27 +02:00
Jiaxun Yang fae9368f14 Implement DYNAMIC_LIST for MIPS64
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
2022-08-12 13:13:31 +01:00
Jiaxun Yang a50b29c540 Provide a fallback MIPS64_GENERIC target
It is really dangerous to fall back to Loongson core on other
MIPS64 processors.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
2022-08-12 13:13:28 +01:00
Jiaxun Yang b633eb79f2 Use $at as temporary register for mips/loongson CPUCFG read
Some compilers (namely LLVM) are not happy with clobbering
registers in inline assembly.
Use $at as temporary register and explicitly use noat
hint.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
2022-08-07 13:22:32 +01:00
Martin Kroeker 19fefd100e Merge pull request #3703 from martin-frbg/omp_adaptive
Add env variable OMP_ADAPTIVE to control OMP threadpool behaviour
2022-08-03 15:38:39 +02:00