JonasZhou
0fca36c8c3
Add cpu detection support for Zhaoxin processors
...
Signed-off-by: JonasZhou <JonasZhou@zhaoxin.com>
2021-07-12 13:43:45 +08:00
Martin Kroeker
8f22ac552b
Add vendor string Shanghai as successor to Centaur
2021-07-08 18:28:49 +02:00
Martin Kroeker
eb2fdd3af0
Recognize newer Zhaoxin/Centaur processors as Nehalem
2021-07-08 12:23:15 +02:00
Martin Kroeker
cbfd3c87e1
Recognize Intel Ice Lake SP as Cooper Lake
2021-05-14 20:44:06 +02:00
Martin Kroeker
e4e5042e38
Recognize Intel Tiger Lake as SkylakeX
2021-02-11 20:17:11 +01:00
Martin Kroeker
865676682d
Add Intel Rocket Lake
2020-12-14 22:40:23 +01:00
Guillaume Horel
1f564d729b
fix avx2 detection
...
reword commits to make it clearer
2020-10-31 10:00:48 -04:00
Chen, Guobing
deaeb6c5b8
Add bfloat16 based dot and conversion with single/double
...
1. Added bfloat16 based dot as new API: shdot
2. Implemented generic kernel and cooperlake-specific (AVX512-BF16) kernel for shdot
3. Added 4 conversion APIs for bfloat16 data type <=> single/double: shstobf16 shdtobf16 sbf16tos dbf16tod
shstobf16 -- convert single float array to bfloat16 array
shdtobf16 -- convert double float array to bfloat16 array
sbf16tos -- convert bfloat16 array to single float array
dbf16tod -- convert bfloat16 array to double float array
4. Implemented generic kernels for all 4 conversion APIs, and cooperlake-specific kernel for shstobf16 and shdtobf16
5. Update level1 threading helper functions and macros to support multi-threading for these new APIs
6. Fix Cooperlake platform detection/specification issue in dynamic-arch builds
7. Change the typedef of bfloat16 from unsigned short to the stricter uint16_t
Signed-off-by: Chen, Guobing <guobing.chen@intel.com>
2020-09-04 02:31:25 +08:00
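
For readers unfamiliar with the bfloat16 format behind the conversion APIs in the commit above, the following is a minimal sketch of a scalar single <=> bfloat16 conversion. It is not taken from the OpenBLAS kernels: the names `bf16_t`, `float_to_bf16` and `bf16_to_float` are hypothetical, and round-to-nearest-even is an assumption about the rounding mode.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical type name; the commit only says bfloat16 is a uint16_t typedef. */
typedef uint16_t bf16_t;

/* Convert one float to bfloat16 by keeping the upper 16 bits of its
 * IEEE-754 bit pattern. Rounding mode (nearest-even) is an assumption. */
static bf16_t float_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);      /* type-pun without aliasing issues */
    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        /* NaN: truncate and set a mantissa bit so it stays a NaN */
        return (bf16_t)((bits >> 16) | 0x0040u);
    }
    /* Add 0x7fff plus the lowest kept bit: rounds halfway cases to even */
    uint32_t rounding = 0x7fffu + ((bits >> 16) & 1u);
    return (bf16_t)((bits + rounding) >> 16);
}

/* Widen bfloat16 back to float: the 16 stored bits become the upper half. */
static float bf16_to_float(bf16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

The widening direction is exact, while the narrowing direction loses the low 16 mantissa bits, which is why only the latter needs a rounding rule.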
Martin Kroeker
12918358aa
Add AMD Renoir/Matisse and preliminary support for Zen3 as Zen2
...
also support AMD family 22 Jaguar/Puma as Bobcat
2020-07-28 13:53:17 +00:00
Martin Kroeker
584ef8d4ae
Add support for Comet Lake H & S
2020-06-27 14:36:37 +02:00
Matthew Treinish
f37e941d52
Add support to driver/others/dynamic.c too
2020-06-25 11:56:49 -04:00
User User-User
e6b9275034
address vs2019 C4293
2020-06-24 09:12:23 +03:00
Martin Kroeker
007d9f97d7
Make gotoblas_corename report the name of the selected TARGET rather than its aliases
2020-06-13 19:25:28 +02:00
Martin Kroeker
3518617f5b
Add Intel Goldmont+ cpuid
...
was originally in #2228 but that PR had misplaced the file in the toplevel directory
2019-12-03 08:32:29 +01:00
Martin Kroeker
f95989cbc1
Fix AVX512 capability test (always returning zero)
...
from #2322
2019-11-23 22:38:07 +01:00
Martin Kroeker
3d36c45116
Add CPUID identification of Intel Ice Lake
2019-08-01 22:52:35 +02:00
Martin Kroeker
3ce28fb81a
Merge pull request #2055 from martin-frbg/atomid
...
Add CPUID data for Intel Denverton (as Nehalem)
2019-03-12 22:57:07 +01:00
Martin Kroeker
04f2226ea6
Add Intel Denverton
2019-03-12 16:09:55 +01:00
Martin Kroeker
11cfd0bd75
Do not compile in AVX512 check if AVX support is disabled
...
The xgetbv function depends on NO_AVX being undefined; we could change that too, but that combination is unlikely to work anyway
2019-03-05 16:04:25 +01:00
caiyu
29dc72889f
Add support for Hygon Dhyana
2019-01-16 14:25:19 +08:00
Martin Kroeker
dbc9a060ef
Fix missing braces in support_avx() call
2019-01-14 22:41:31 +01:00
Martin Kroeker
31ed19e8b9
Add message for SkylakeX and KNL fallbacks to Haswell
2019-01-05 19:41:13 +01:00
Martin Kroeker
e1574fa2b4
Add xcr0 (os support) check
2019-01-05 18:08:02 +01:00
Martin Kroeker
ae1d1f74f7
Query AVX2 and AVX512 capability for runtime cpu selection
2019-01-05 16:55:33 +01:00
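
The three commits above combine a CPUID feature query with an XCR0 (OS support) check. The sketch below illustrates that combination as pure functions over register values that the caller has already read; it is not the code from `dynamic.c`, and the function and macro names are hypothetical. The bit positions are the architecturally defined ones.

```c
#include <stdint.h>

/* CPUID leaf 1, ECX */
#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX     (1u << 28)
/* CPUID leaf 7, subleaf 0, EBX */
#define CPUID7_EBX_AVX2    (1u << 5)
#define CPUID7_EBX_AVX512F (1u << 16)
/* XCR0 state components the OS must save and restore */
#define XCR0_AVX_STATE     0x06u       /* SSE + AVX (XMM, YMM) */
#define XCR0_AVX512_STATE  0xe6u       /* plus opmask, ZMM_Hi256, Hi16_ZMM */

/* Hypothetical helper: nonzero if the OS preserves all state bits in mask. */
static int os_saves_state(uint32_t cpuid1_ecx, uint64_t xcr0, uint64_t mask) {
    if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE)) return 0; /* XCR0 not readable */
    return (xcr0 & mask) == mask;
}

/* AVX2 is usable only if the CPU reports it and the OS saves YMM state. */
static int can_use_avx2(uint32_t cpuid1_ecx, uint32_t cpuid7_ebx, uint64_t xcr0) {
    return (cpuid1_ecx & CPUID1_ECX_AVX) != 0
        && (cpuid7_ebx & CPUID7_EBX_AVX2) != 0
        && os_saves_state(cpuid1_ecx, xcr0, XCR0_AVX_STATE);
}

/* AVX512F additionally requires the OS to save opmask and ZMM state. */
static int can_use_avx512(uint32_t cpuid1_ecx, uint32_t cpuid7_ebx, uint64_t xcr0) {
    return (cpuid7_ebx & CPUID7_EBX_AVX512F) != 0
        && os_saves_state(cpuid1_ecx, xcr0, XCR0_AVX512_STATE);
}
```

In a real build the `xcr0` argument would come from executing `xgetbv` with ECX set to 0, which is only legal once the OSXSAVE bit has been confirmed.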
Martin Kroeker
504310eeb9
Merge pull request #1665 from martin-frbg/cpuid-ryzen2
...
Add cpuid for AMD Ryzen 2
2018-07-04 08:19:40 +02:00
Martin Kroeker
d0ec4325cf
Add cpuid for AMD Ryzen 2
2018-07-03 21:03:24 +02:00
Martin Kroeker
9d15a3bd16
Fix typo that broke compilation with DYNAMIC_ARCH and NO_AVX2
...
fixes 1659
2018-07-02 14:40:41 +02:00
Martin Kroeker
750162a05f
Try gradual fallback for cores not in the dynamic core list
2018-06-25 21:02:31 +02:00
Martin Kroeker
1833a67071
Add support for a user-defined list of dynamic targets
2018-06-23 19:42:15 +02:00
Martin Kroeker
63f7395fb4
Move some DYNAMIC_ARCH targets to new DYNAMIC_OLDER option
2018-06-09 16:31:38 +02:00
Martin Kroeker
38ad05bd04
Extend loop range to find SkylakeX in force_coretype
2018-06-05 10:26:49 +02:00
Martin Kroeker
8be027e4c6
Update dynamic.c
2018-06-04 14:36:39 +02:00
Martin Kroeker
ac7b6e3e9a
Fix misplaced endif
2018-06-04 08:23:40 +02:00
Martin Kroeker
ef626c6824
typo fix
2018-06-04 00:13:19 +02:00
Martin Kroeker
5a51cf4576
Separate Skylake X from Skylake
2018-06-03 23:41:33 +02:00
Arjan van de Ven
99c7bba8e4
Initial support for SkylakeX / AVX512
...
This patch adds the basic infrastructure for adding the SkylakeX (Intel Skylake server)
target. The SkylakeX target will use the AVX512 (AVX512VL level) instruction set,
which brings 2 basic things:
1) 512 bit wide SIMD (2x width of AVX2)
2) 32 SIMD registers (2x the number on AVX2)
This initial patch only contains a trivial transformation of the Haswell SGEMM kernel
to AVX512VL; more will follow later but this patch aims to get the infrastructure
in place for this "later".
Full performance tuning has not been done yet; with more registers and wider SIMD
it's in theory possible to retune the kernels but even without that there's an
interesting enough performance increase (30-40% range) with just this change.
2018-06-03 07:58:52 +00:00
Isuru Fernando
2f12ea017b
No strncasecmp with MSVC
2017-08-08 00:07:25 +05:30
Gian-Carlo Pascutto
9c884986ad
Add an extra family/model combination used by AMD Steamroller (Godavari).
2017-04-19 19:15:47 +02:00
Gian-Carlo Pascutto
0cbd2d34e4
Recognize ZEN when passed as OPENBLAS_CORETYPE.
2017-04-10 20:05:16 +02:00
Gian-Carlo Pascutto
62979fd104
Fix dynamic detection for ZEN CPUs.
2017-04-10 19:08:37 +02:00
Denis Steckelmacher
c9ff735da6
Add ZEN support (tested for auto-detected static backend)
2017-03-19 15:32:50 +01:00
Andrew
5088523786
detect apollo lake for real
2017-02-20 23:54:59 +01:00
Elliot Saba
1d8ab99e09
Add `exfamily == 9` case (Kaby Lake) to dynamic arch detection
2017-02-10 15:23:55 -08:00
Martin Koehler
76c6e33e54
Enable EXCAVATOR kernels for A12-9800
2017-02-07 21:38:28 +01:00
Martin Kroeker
596ead0f8d
Add files via upload
2016-11-06 23:26:39 +01:00
Martin Kroeker
8a8f3932eb
Update dynamic.c
...
Add Bay Trail "Pentium N3520" atom
2016-10-16 22:40:00 +02:00
Martin Kroeker
7de829f713
Update dynamic.c
...
Add Braswell (extended model 4, model 12) N3150 as Nehalem
2016-07-14 12:22:55 +02:00
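
Several entries in this log identify a processor by its "extended model" and "model" fields, as in the Braswell commit above. As an illustration only, the sketch below shows how those fields are laid out in the EAX value returned by CPUID leaf 1. The struct and function names are hypothetical and are not the ones used in `dynamic.c`.

```c
#include <stdint.h>

/* Hypothetical container for the fields of CPUID leaf 1, EAX. */
typedef struct {
    unsigned family;    /* bits 11..8  */
    unsigned model;     /* bits 7..4   */
    unsigned exfamily;  /* bits 27..20 */
    unsigned exmodel;   /* bits 19..16 */
} cpu_signature;

/* Split the raw EAX value into its family/model fields. */
static cpu_signature decode_signature(uint32_t eax) {
    cpu_signature s;
    s.family   = (eax >> 8)  & 0x0fu;
    s.model    = (eax >> 4)  & 0x0fu;
    s.exfamily = (eax >> 20) & 0xffu;
    s.exmodel  = (eax >> 16) & 0x0fu;
    return s;
}

/* Combined model number as usually quoted in vendor documentation.
 * The extended model only participates when family is 6 or 15. */
static unsigned display_model(cpu_signature s) {
    if (s.family == 6 || s.family == 15)
        return (s.exmodel << 4) | s.model;
    return s.model;
}
```

For example, a signature of `0x000406C3` decodes to family 6, extended model 4, model 12, which matches the "extended model 4, model 12" combination named in the Braswell entry.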
Werner Saar
2b967590a0
bugfix in dynamic.c
2016-04-25 09:08:38 +02:00
Zhang Xianyi
1edf30b790
Change Opteron(SSE3) to Opteron_SSE3 in the dynamic core name.
2016-03-01 20:13:08 +08:00
Martin Kroeker
935356c34f
Update dynamic.c and cpuid_x86.c for Intel Avoton.
...
Second part of "support Intel Avoton via Nehalem kernel"
2016-02-02 13:42:55 -05:00