705 Commits

Author SHA1 Message Date
Martin Kroeker e7a895e714 Add Apple M as NeoverseN1 2023-12-25 12:36:05 +01:00
Chris Sidebottom dc20a78188 Use functionally equivalent dynamic targets
Similar to `drivers/other/dynamic.c`, I've looked for functionally
equivalent targets and mapped them in the default DYNAMIC_ARCH build.
Users can still build specific cores using DYNAMIC_LIST.
2023-12-23 12:45:27 +00:00
Mark Seminatore edac80d7e8 some cleanup, dynamically scale threads, add missing WIN_CASE defn 2023-12-07 14:59:27 -08:00
Mark Seminatore 4ebf814b42 fix bug failing to mark task as finished. 2023-12-05 23:28:37 -08:00
Mark Seminatore 5f51811728 try at new threading model 2023-12-05 22:43:36 -08:00
Shiyou Yin 1310a0931b loongarch: Refine build control for loongarch64.
1. Use getauxval instead of cpucfg to test hardware capability.
2. Remove unnecessary code and option for compiler check in c_check.
2023-11-28 20:23:55 +08:00
Chip-Kerchner d99aad8ee3 Fix older version of gcc - missing __has_builtin, cpuid and no support of P10. 2023-11-14 11:07:08 -06:00
Martin Kroeker 9b5f8eb33a Fix empty function prototypes 2023-11-12 19:35:53 +01:00
Martin Kroeker 9324520d0e typo fix 2023-11-11 23:14:58 +01:00
Martin Kroeker ff6437f2d7 Add workaround for omp_get_max_threads hanging on FreeBSD with libomp from LLVM14 2023-11-11 21:30:32 +01:00
Chip-Kerchner 4eecccd49b Fix __builtin_cpu_is for AIX. 2023-11-08 07:12:21 -06:00
Chip-Kerchner 5e31c57083 Only define __builtin_cpu_is and __builtin_cpu_supports if not present. 2023-11-07 20:58:34 -06:00
Chip-Kerchner 7dcb2d67f2 Have POWER7 return arch=POWER6. 2023-11-01 15:23:28 -05:00
Chip-Kerchner c8882bd9d8 Remove POWER7 from cpu list. 2023-11-01 14:53:55 -05:00
Chip Kerchner badfb2e60f Merge branch 'develop' into XLC-AIX 2023-10-26 09:19:31 -05:00
Martin Kroeker e12aaed13d Fix unwanted fallthrough from Intel Family 6 to 15 in case of identification failure 2023-10-18 16:28:54 +02:00
Chip-Kerchner 880af052dd Fix dynamic dispatch P9 for clang. 2023-10-06 13:41:49 -05:00
Chip-Kerchner 3655632611 Another small change. 2023-10-06 13:11:40 -05:00
Chip-Kerchner 36e08f6994 One more small change. 2023-10-06 13:08:41 -05:00
Chip-Kerchner 298bf1f240 Reduce differences. 2023-10-06 12:50:28 -05:00
Chip-Kerchner 71c6689af4 Fix dynamic dispatch to work for clang. 2023-10-06 12:20:40 -05:00
Chip-Kerchner c60f9d9c08 Add missing CPU_POWER5. 2023-10-06 09:49:17 -05:00
Chip Kerchner 3cc72a3797 Only include cpu_id and cpu_supports in AIX and fix parameter types. 2023-10-04 09:54:37 -05:00
Chip-Kerchner 09212f84bf Fix default case for cpu_is. 2023-10-03 12:23:21 -05:00
Chip-Kerchner 2d0b233425 Fix missing parens. 2023-10-03 10:26:14 -05:00
Chip-Kerchner a8c90eb3ed Added cpu_is 2023-10-03 10:24:04 -05:00
Chip-Kerchner b677d0d5fd Adding missing endif 2023-10-02 13:09:12 -05:00
Chip-Kerchner e5dc376912 Remove duplicate defines. 2023-10-02 12:48:47 -05:00
Chip-Kerchner 10210748de Revert PGI changes. 2023-10-02 12:44:07 -05:00
Chip-Kerchner a922a07e61 Cleanup white spaces. 2023-10-02 12:24:30 -05:00
Chip-Kerchner 12130ee961 Remove tab. 2023-10-02 12:19:22 -05:00
Chip-Kerchner eb738d9929 Minor changes. 2023-10-02 12:14:46 -05:00
Chip-Kerchner 48da98b2a7 Merge remote-tracking branch 'origin/develop' into XLC-AIX 2023-10-02 12:01:33 -05:00
Chip-Kerchner 3b1150fcee Fix CPU identification to work on AIX. 2023-10-02 12:00:48 -05:00
Martin Kroeker 90f890ee67 fix improper function prototypes (empty parentheses) (USE_TLS branch) 2023-09-30 23:12:36 +02:00
Martin Kroeker cf2174fb69 fix improper function prototypes (empty parentheses) 2023-09-30 17:04:39 +02:00
Martin Kroeker c6b1d8e7a3 fix improper function prototypes (empty parentheses) 2023-09-30 12:52:06 +02:00
Martin Kroeker c4bd4a2e5d fix improper function prototypes (empty parentheses) 2023-09-30 12:49:24 +02:00
Martin Kroeker 7e939fb831 Fix handling of additional buffer structures in case of overflow 2023-09-19 23:33:39 +02:00
Tiziano Müller 6a611db560 memory: show correct number of max threads 2023-09-10 08:44:07 +02:00
Martin Kroeker c2f4bdbbb4 Merge pull request #4163 from martin-frbg/issue4017
Rework OpenMP thread count limit handling
2023-07-31 17:58:51 +02:00
Martin Kroeker 9ff84dc3f2 remove unused status variable 2023-07-26 10:02:44 +02:00
Martin Kroeker 3326b924b3 remove status variable blas_num_threads_set; initialize openmp thread maximum on startup 2023-07-26 00:31:24 +02:00
Chris Sidebottom f971ef55f2 Add ARMV8SVE to AArch64 Dynamic Dispatch
In order to enable support for future cores which have similar tunings
(in this case I'm doing this for the Arm(R) Neoverse(TM) V2 core), this generically detects SVE support and enables it. This should better manage the size and complexity of dynamic dispatch rather than just copy pasting the same parameters.

To make `ARMV8SVE` more representative of the common 128-bit SVE case,
I've split it and similar parameters from A64FX which has the wider
512-bit SVE.
2023-07-25 18:35:15 +01:00
Martin Kroeker 3bdcf3259d Merge branch 'xianyi:develop' into issue4101 2023-07-20 08:23:20 +02:00
Martin Kroeker b34f19a365 Ensure that a premature call to set_num_threads will not overwrite unrelated memory 2023-07-19 22:19:22 +02:00
Martin Kroeker 66904f8148 Ensure that a premature call will not overwrite unrelated memory 2023-07-19 22:14:34 +02:00
Martin Kroeker 5c58994eb2 Add fallback warning 2023-07-19 18:27:41 +02:00
Martin Kroeker ca7199f249 Treat newer Neoverse as N1 if SVE unavailable (may be disabled in container/cloud env) 2023-07-19 14:48:42 +02:00
Martin Kroeker 616fdea82a Revert "Improve Windows threading performance scaling" 2023-06-28 09:45:17 +02:00