76 Commits

Author SHA1 Message Date
Martin Kroeker 83b5c6b92d
Fix compilation with NO_AVX=1 set
fixes #1974
2019-01-20 12:18:53 +01:00
caiyu 29dc72889f Add support for Hygon Dhyana 2019-01-16 14:25:19 +08:00
Martin Kroeker 00401489c2
Fix missing braces in support_avx() 2019-01-14 22:38:32 +01:00
Martin Kroeker 68eb3146ce
Add xcr0 (os support) check 2019-01-05 18:07:14 +01:00
Martin Kroeker 0afaae4b23
Query AVX2 and AVX512VL capability in x86 cpu detection 2019-01-05 16:58:56 +01:00
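The two detection commits above pair a CPUID feature query with a check that the operating system has enabled the matching register state in XCR0. The sketch below shows that pairing; it is not the code in cpuid_x86.c. All function and macro names are hypothetical, the bit positions are the ones documented for CPUID leaf 1, CPUID leaf 7 subleaf 0, and XCR0, and the host query assumes GCC or Clang on x86 with an assembler that knows the xgetbv mnemonic.

```c
#include <stdint.h>

/* Hypothetical names; bit positions follow the documented CPUID/XCR0 layout. */
#define CPUID1_ECX_OSXSAVE  (1u << 27)  /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX      (1u << 28)
#define CPUID7_EBX_AVX2     (1u << 5)
#define CPUID7_EBX_AVX512F  (1u << 16)
#define CPUID7_EBX_AVX512VL (1u << 31)
#define XCR0_SSE_AVX        0x06u       /* bits 1-2: XMM and YMM state */
#define XCR0_AVX512         0xE0u       /* bits 5-7: opmask, ZMM_Hi256, Hi16_ZMM */

/* AVX is usable only if the CPU reports it AND the OS saves YMM state. */
int avx_usable(uint32_t ecx1, uint64_t xcr0)
{
    if (!(ecx1 & CPUID1_ECX_OSXSAVE) || !(ecx1 & CPUID1_ECX_AVX))
        return 0;
    return (xcr0 & XCR0_SSE_AVX) == XCR0_SSE_AVX;
}

int avx2_usable(uint32_t ecx1, uint32_t ebx7, uint64_t xcr0)
{
    return avx_usable(ecx1, xcr0) && (ebx7 & CPUID7_EBX_AVX2) != 0;
}

/* AVX512VL additionally needs AVX512F and the three AVX-512 state bits. */
int avx512vl_usable(uint32_t ecx1, uint32_t ebx7, uint64_t xcr0)
{
    return avx_usable(ecx1, xcr0)
        && (xcr0 & XCR0_AVX512) == XCR0_AVX512
        && (ebx7 & CPUID7_EBX_AVX512F) != 0
        && (ebx7 & CPUID7_EBX_AVX512VL) != 0;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Read XCR0; only legal once OSXSAVE has been confirmed. */
static uint64_t read_xcr0(void)
{
    uint32_t lo, hi;
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return ((uint64_t)hi << 32) | lo;
}

int host_supports_avx512vl(void)
{
    unsigned int a, b, c, d;
    unsigned int a7 = 0, b7 = 0, c7 = 0, d7 = 0;

    if (!__get_cpuid(1, &a, &b, &c, &d) || !(c & CPUID1_ECX_OSXSAVE))
        return 0;
    if (!__get_cpuid_count(7, 0, &a7, &b7, &c7, &d7))
        return 0;
    return avx512vl_usable(c, b7, read_xcr0());
}
#else
/* Non-x86 or unsupported compiler: report "not available". */
int host_supports_avx512vl(void) { return 0; }
#endif
```

Keeping the decision logic in pure functions of the raw register values makes it testable without the hardware. For example, a CPU that advertises AVX512VL under an OS that has only enabled YMM state (XCR0 = 0x06) should be rejected.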
TiborGY 211120c508
Fix typo in UNKNOWN core name
Should be of no consequence, right?
2018-12-27 23:09:21 +01:00
Martin Kroeker 64ca44873b
Fix detection of Ryzen2 (missing CORE_ZEN) 2018-10-28 18:36:55 +01:00
Martin Kroeker 3f73e8b8cf
Add cpuid for AMD Ryzen 2
for #1664
2018-07-03 21:01:35 +02:00
Martin Kroeker 2d8cc7193a
Support upcoming Intel Cannon Lake CPUs as Skylake X (#1621)
* Support upcoming Cannon Lake as Skylake X
2018-06-17 23:38:14 +02:00
Martin Kroeker dc9fe05ab5
Update cpuid_x86.c 2018-06-04 17:10:19 +02:00
Martin Kroeker 5a92b311e0
Separate Skylake X from Skylake 2018-06-03 23:29:07 +02:00
Arjan van de Ven 99c7bba8e4 Initial support for SkylakeX / AVX512
This patch adds the basic infrastructure for adding the SkylakeX (Intel Skylake server)
target. The SkylakeX target will use the AVX512 (AVX512VL level) instruction set,
which brings 2 basic things:
1) 512 bit wide SIMD (2x width of AVX2)
2) 32 SIMD registers (2x the number on AVX2)

This initial patch only contains a trivial transformation of the Haswell SGEMM kernel
to AVX512VL; more will follow later but this patch aims to get the infrastructure
in place for this "later".

Full performance tuning has not been done yet; with more registers and wider SIMD
it's in theory possible to retune the kernels but even without that there's an
interesting enough performance increase (30-40% range) with just this change.
2018-06-03 07:58:52 +00:00
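As a rough illustration of the two points in the message above, the arithmetic can be written out in C. This is only a sketch: the struct and function names are hypothetical, and the AVX2 register count of 16 is taken from the message's "2x the number on AVX2".

```c
/* Hypothetical description of a SIMD instruction set's register file. */
typedef struct {
    unsigned bits_per_register;
    unsigned register_count;
} simd_isa;

static const simd_isa AVX2_ISA   = { 256, 16 };
static const simd_isa AVX512_ISA = { 512, 32 };

/* Single-precision lanes per register (one float is 32 bits). */
unsigned floats_per_register(simd_isa isa)
{
    return isa.bits_per_register / 32u;
}

/* Total floats that fit in the whole register file at once. */
unsigned floats_in_register_file(simd_isa isa)
{
    return floats_per_register(isa) * isa.register_count;
}
```

With these figures an AVX2 register holds 8 floats and an AVX512 register holds 16. The whole register file grows from 128 to 512 floats, which is the headroom the message refers to when it says the kernels could be retuned.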
Martin Kroeker aece65ea29 Fix coretype detection for Bay Trail Atom
My earlier PR #982 appears to have been incomplete in this regard - fixes #1285
2017-08-27 13:06:54 +02:00
Martin Kroeker 00774b1105 Add dummy implementation of cpuid_count for the CPUIDEMU case 2017-07-12 21:56:23 +02:00
Martin Kroeker 6497aae57c Use cpuid 4 with subleafs to query L1 cache size on Intel processors 2017-07-12 20:43:09 +02:00
Gian-Carlo Pascutto 9c884986ad Add an extra family/model combination used by AMD Steamroller (Godavari). 2017-04-19 19:15:47 +02:00
Johannes Buchner b4071d0d16 Autodetect AMD A8-6410 as BARCELONA 2017-04-03 17:07:27 +10:00
Denis Steckelmacher c9ff735da6 Add ZEN support (tested for auto-detected static backend) 2017-03-19 15:32:50 +01:00
Martin Kroeker 688267edf3 Fix core detection for Kaby Lake without AVX (G4560)
Should fix #1109
2017-03-02 17:36:16 +01:00
Elliot Saba 04b2b06665 CPUID mappings for Core i5-7600K (Kaby Lake) 2017-02-10 14:53:15 -08:00
Martin Koehler 76c6e33e54 Enable EXCAVATOR kernels for A12-9800 2017-02-07 21:38:28 +01:00
Martin Kroeker 60816c9259 Add files via upload 2016-11-06 23:26:04 +01:00
Martin Kroeker 3409bccb21 Update cpuid_x86.c
Add Bay Trail "Pentium N3520" atom cpu
2016-10-16 22:45:44 +02:00
Martin Kroeker 154729908e Update cpuid_x86.c 2016-07-14 17:29:34 +02:00
Martin Kroeker 97bd1e42c8 Update cpuid_x86.c 2016-07-14 12:25:17 +02:00
Martin Kroeker 935356c34f Update dynamic.c and cpuid_x86.c for Intel Avoton.
Second part of "support Intel Avoton via Nehalem kernel"
2016-02-02 13:42:55 -05:00
Martin Kroeker 4f05c23673 Update cpuid_x86.c
Add recognition of Intel Atom C27xx (Avoton, model code 4D)
2016-02-02 13:38:01 -05:00
Jerome Robert 76398c3233 Fix detection of AMD E2-3200 2015-12-28 19:45:47 +01:00
Zhang Xianyi 839395fc25 Detect AMD Trinity and Richland. 2015-10-29 02:53:29 +08:00
Zhang Xianyi 94b125255f Merge branch 'develop' into cmake
Conflicts:
	driver/others/memory.c
2015-10-13 04:46:08 +08:00
Zhang Xianyi cc7cab8a45 Detect other Intel Skylake cores.
http://users.atw.hu/instlatx64/
2015-09-09 10:47:17 -05:00
Yichao Yu 61ae47eb99 Ref #632. Support Intel Skylake by Haswell kernels. 2015-09-09 11:07:33 -04:00
Zhang Xianyi dcd5ba4443 Merge branch 'cmake' of https://github.com/hpanderson/OpenBLAS into hpanderson_cmake 2015-07-22 04:06:39 +08:00
Zhang Xianyi 51ff17d46e Add AMD Excavator target. 2015-05-13 16:16:30 -05:00
Zhang Xianyi 8977b3f235 Refs #529. Support Intel Broadwell by Haswell kernels. 2015-04-02 11:08:03 -05:00
Hank Anderson e19bf3a28b Removed MSVC cpuid func when using clang. 2015-02-25 14:44:49 -06:00
Hank Anderson 84d90d6ed8 Fixed some compiler errors/warnings for clang. 2015-02-25 11:52:25 -06:00
Hank Anderson 92cdac5f87 Added MSVC functions to cpuid_x86.c to replace gcc-specific ASM. 2015-01-01 21:02:48 -06:00
Werner Saar 4319769b79 added target processor STEAMROLLER 2014-12-28 20:16:46 +08:00
Zhang Xianyi 2987bc7b40 refs #464. Fixed the bug of detecting L2 associativity on x86. 2014-11-10 17:15:34 +08:00
Isaac Dunham db7e6366cd Workaround PIC limitations in cpuid.
cpuid uses register ebx, but ebx is reserved in PIC.
So save ebx, swap ebx & edi, and return edi.

Copied from Igor Pavlov's equivalent fix for 7zip (in CpuArch.c),
which is public domain and thus OK license-wise.
2014-08-28 13:05:07 -07:00
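The workaround described above can be sketched with GCC-style inline assembly. This is an illustration of the technique, not the code from the commit or from 7zip. The function name cpuid_pic_safe is hypothetical, and the sketch assumes GCC or Clang. On 32-bit PIC builds it saves ebx into edi, runs cpuid, then swaps the two so that ebx is restored and the result comes back in edi. Elsewhere it falls back to a plain cpuid, or reports failure on non-x86 targets.

```c
#include <stdint.h>

/* Returns 1 and fills regs[] = {eax, ebx, ecx, edx} on x86; returns 0 elsewhere. */
int cpuid_pic_safe(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
#if (defined(__GNUC__) || defined(__clang__)) && defined(__i386__) && defined(__PIC__)
    uint32_t a, b, c, d;
    /* ebx holds the GOT pointer in 32-bit PIC code, so it must not be clobbered. */
    __asm__ volatile(
        "movl %%ebx, %%edi\n\t"   /* save ebx in edi */
        "cpuid\n\t"
        "xchgl %%ebx, %%edi"      /* restore ebx; cpuid's ebx result lands in edi */
        : "=a"(a), "=&D"(b), "=c"(c), "=d"(d)
        : "0"(leaf), "2"(subleaf));
    regs[0] = a; regs[1] = b; regs[2] = c; regs[3] = d;
    return 1;
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__i386__) || defined(__x86_64__))
    uint32_t a, b, c, d;
    /* No PIC restriction on ebx here, so it can be used directly. */
    __asm__ volatile("cpuid"
        : "=a"(a), "=b"(b), "=c"(c), "=d"(d)
        : "0"(leaf), "2"(subleaf));
    regs[0] = a; regs[1] = b; regs[2] = c; regs[3] = d;
    return 1;
#else
    (void)leaf; (void)subleaf;
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
    return 0;
#endif
}
```

Calling cpuid_pic_safe(0, 0, regs) on an x86 host should leave the highest supported basic leaf in regs[0] and the vendor string bytes in regs[1], regs[3], regs[2], in that order.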
Zhang Xianyi c94762bb56 Refs #401. Added NO_AVX2 flag for old binutils (e.g. RHEL6) 2014-07-16 08:38:25 +08:00
Timothy Gu 6c2ead30f0 Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
wernsaar 88b6bf251a force fallback for x86 32bit 2014-06-22 17:27:11 +02:00
Zhang Xianyi 7b8604ea29 Refs #335. Added the fallback of L2 size detection for some virtual machines. 2014-01-08 11:16:21 +08:00
Zhang Xianyi ab69443bd4 Refs #332. Added additional Intel Ivy Bridge and Haswell CPU-id. 2014-01-05 23:44:29 +08:00
Zhang Xianyi 2638370844 Init code base for Intel Haswell. 2013-08-13 00:54:59 +08:00
Zhang Xianyi 23186d9f21 Fixed the FMA3 detection bug. 2013-07-27 22:37:57 +08:00
Zhang Xianyi 886cbaf4e4 Support AMD Piledriver by bulldozer kernels. 2013-07-06 12:06:43 -03:00
Dan Luu 88ef307cef Refs #241. Add Haswell support (using sandybridge optimizations) 2013-06-30 22:35:14 +08:00