OpenBLAS

Author	SHA1	Message	Date
Honglin Zhu	9e80a194d6	Fix dynamic_list build and gcc version check error	2023-05-21 19:52:58 +08:00
Honglin Zhu	0b83088887	spr dynamic arch support	2023-05-19 10:48:18 +08:00
Martin Kroeker	da6e426b13	fix Cooperlake not selectable via environment variable	2022-11-03 18:13:35 +01:00
Martin Kroeker	2c62096fce	Expand cpu mapping for future Zen cpus and use feature-based fallback for unknown AMD family codes	2022-05-18 15:35:30 +02:00
Adam Niederer	69f2ac4ea2	Fix broken elif in dynamic.c This fixes compilation in the following case: $(MAKE) USE_OPENMP=1 USE_THREAD=1 NO_LAPACK=0 DYNAMIC_ARCH=1 \ DYNAMIC_LIST="HASWELL SKYLAKEX ATOM COOPERLAKE SAPPHIRERAPIDS ZEN"	2022-03-17 20:04:37 -04:00
JonasZhou	2d0ad89b0d	Support Zhaoxin/Centaur kh40000 as ZEN Signed-off-by: JonasZhou <JonasZhou@zhaoxin.com>	2022-03-10 15:08:38 +08:00
Martin Kroeker	fa3e9f25e6	Support AVX512-enabled Alder Lake	2022-02-07 00:00:56 +01:00
Martin Kroeker	6ed52576f8	Add feature-based fallback for unknown x86_64 cpus	2021-12-16 22:02:49 +01:00
Martin Kroeker	08f8bb66c0	Add CPUIDs for Alder Lake and other recent Intel cpus	2021-11-04 20:36:39 +01:00
Martin Kroeker	22a616bd8f	Add model number for Tiger Lake H (mobile variant)	2021-10-27 22:17:58 +02:00
JonasZhou	0fca36c8c3	Add cpu detection support for Zhaoxin processors Signed-off-by: JonasZhou <JonasZhou@zhaoxin.com>	2021-07-12 13:43:45 +08:00
Martin Kroeker	8f22ac552b	Add vendor string Shanghai as successor to Centaur	2021-07-08 18:28:49 +02:00
Martin Kroeker	eb2fdd3af0	Recognize newer Zhaoxin/Centaur processors as Nehalem	2021-07-08 12:23:15 +02:00
Martin Kroeker	cbfd3c87e1	Recognize Intel Ice Lake SP as Cooper Lake	2021-05-14 20:44:06 +02:00
Martin Kroeker	e4e5042e38	Recognize Intel Tiger Lake as SkylakeX	2021-02-11 20:17:11 +01:00
Martin Kroeker	865676682d	Add Intel Rocket Lake	2020-12-14 22:40:23 +01:00
Guillaume Horel	1f564d729b	fix avx2 detection reword commits to make it clearer	2020-10-31 10:00:48 -04:00
Chen, Guobing	deaeb6c5b8	Add bfloat16 based dot and conversion with single/double 1. Added bfloat16 based dot as new API: shdot 2. Implemented generic kernel and cooperlake-specific (AVX512-BF16) kernel for shdot 3. Added 4 conversion APIs for bfloat16 data type <=> single/double: shstobf16 shdtobf16 sbf16tos dbf16tod shstobf16 -- convert single float array to bfloat16 array shdtobf16 -- convert double float array to bfloat16 array sbf16tos -- convert bfloat16 array to single float array dbf16tod -- convert bfloat16 array to double float array 4. Implemented generic kernels for all 4 conversion APIs, and cooperlake-specific kernel for shstobf16 and shdtobf16 5. Update level1 thread facilitate functions and macros to support multi-threading for these new APIs 6. Fix Cooperlake platform detection/specify issue when under dynamic-arch building 7. Change the typedef of bfloat16 from unsigned short to more strict uint16_t Signed-off-by: Chen, Guobing <guobing.chen@intel.com>	2020-09-04 02:31:25 +08:00
Martin Kroeker	12918358aa	Add AMD Renoir/Matisse and preliminary support for Zen3 as Zen2 also support AMD family 22 Jaguar/Puma as Bobcat	2020-07-28 13:53:17 +00:00
Martin Kroeker	584ef8d4ae	Add support for Comet Lake H & S	2020-06-27 14:36:37 +02:00
Matthew Treinish	f37e941d52	Add support to driver/others/dynamic.c too	2020-06-25 11:56:49 -04:00
User User-User	e6b9275034	address vs2019 C4293	2020-06-24 09:12:23 +03:00
Martin Kroeker	007d9f97d7	Make gotoblas_corename report the name of the selected TARGET rather than its aliases	2020-06-13 19:25:28 +02:00
Martin Kroeker	3518617f5b	Add Intel Goldmont+ cpuid was originally in #2228 but that PR had misplaced the file in the toplevel directory	2019-12-03 08:32:29 +01:00
Martin Kroeker	f95989cbc1	Fix AVX512 capability test (always returning zero) from #2322	2019-11-23 22:38:07 +01:00
Martin Kroeker	3d36c45116	Add CPUID identification of Intel Ice Lake	2019-08-01 22:52:35 +02:00
Martin Kroeker	3ce28fb81a	Merge pull request #2055 from martin-frbg/atomid Add CPUID data for Intel Denverton (as Nehalem)	2019-03-12 22:57:07 +01:00
Martin Kroeker	04f2226ea6	Add Intel Denverton	2019-03-12 16:09:55 +01:00
Martin Kroeker	11cfd0bd75	Do not compile in AVX512 check if AVX support is disabled. The xgetbv function depends on NO_AVX being undefined - we could change that too, but that combo is unlikely to work anyway	2019-03-05 16:04:25 +01:00
caiyu	29dc72889f	Add support for Hygon Dhyana	2019-01-16 14:25:19 +08:00
Martin Kroeker	dbc9a060ef	Fix missing braces in support_av() call	2019-01-14 22:41:31 +01:00
Martin Kroeker	31ed19e8b9	Add message for SkylakeX and KNL fallbacks to Haswell	2019-01-05 19:41:13 +01:00
Martin Kroeker	e1574fa2b4	Add xcr0 (os support) check	2019-01-05 18:08:02 +01:00
Martin Kroeker	ae1d1f74f7	Query AVX2 and AVX512 capability for runtime cpu selection	2019-01-05 16:55:33 +01:00
Martin Kroeker	504310eeb9	Merge pull request #1665 from martin-frbg/cpuid-ryzen2 Add cpuid for AMD Ryzen 2	2018-07-04 08:19:40 +02:00
Martin Kroeker	d0ec4325cf	Add cpuid for AMD Ryzen 2	2018-07-03 21:03:24 +02:00
Martin Kroeker	9d15a3bd16	Fix typo that broke compilation with DYNAMIC_ARCH and NO_AVX2 fixes 1659	2018-07-02 14:40:41 +02:00
Martin Kroeker	750162a05f	Try gradual fallback for cores not in the dynamic core list	2018-06-25 21:02:31 +02:00
Martin Kroeker	1833a67071	Add support for a user-defined list of dynamic targets	2018-06-23 19:42:15 +02:00
Martin Kroeker	63f7395fb4	Move some DYNAMIC_ARCH targets to new DYNAMIC_OLDER option	2018-06-09 16:31:38 +02:00
Martin Kroeker	38ad05bd04	Extend loop range to find SkylakeX in force_coretype	2018-06-05 10:26:49 +02:00
Martin Kroeker	8be027e4c6	Update dynamic.c	2018-06-04 14:36:39 +02:00
Martin Kroeker	ac7b6e3e9a	Fix misplaced endif	2018-06-04 08:23:40 +02:00
Martin Kroeker	ef626c6824	typo fix	2018-06-04 00:13:19 +02:00
Martin Kroeker	5a51cf4576	Separate Skylake X from Skylake	2018-06-03 23:41:33 +02:00
Arjan van de Ven	99c7bba8e4	Initial support for SkylakeX / AVX512 This patch adds the basic infrastructure for adding the SkylakeX (Intel Skylake server) target. The SkylakeX target will use the AVX512 (AVX512VL level) instruction set, which brings 2 basic things: 1) 512 bit wide SIMD (2x width of AVX2) 2) 32 SIMD registers (2x the number on AVX2) This initial patch only contains a trivial transformation of the Haswell SGEMM kernel to AVX512VL; more will follow later but this patch aims to get the infrastructure in place for this "later". Full performance tuning has not been done yet; with more registers and wider SIMD it's in theory possible to retune the kernels but even without that there's an interesting enough performance increase (30-40% range) with just this change.	2018-06-03 07:58:52 +00:00
Isuru Fernando	2f12ea017b	No strncasecmp with MSVC	2017-08-08 00:07:25 +05:30
Gian-Carlo Pascutto	9c884986ad	Add an extra family/model combination used by AMD Steamroller (Godavari).	2017-04-19 19:15:47 +02:00
Gian-Carlo Pascutto	0cbd2d34e4	Recognize ZEN when passed as OPENBLAS_CORETYPE.	2017-04-10 20:05:16 +02:00
Gian-Carlo Pascutto	62979fd104	Fix dynamic detection for ZEN CPUs.	2017-04-10 19:08:37 +02:00

96 Commits