Commit Graph

18 Commits

Author SHA1 Message Date
Arjan van de Ven 99c7bba8e4 Initial support for SkylakeX / AVX512
This patch adds the basic infrastructure for adding the SkylakeX (Intel Skylake server)
target. The SkylakeX target will use the AVX512 (AVX512VL level) instruction set,
which brings 2 basic things:
1) 512 bit wide SIMD (2x width of AVX2)
2) 32 SIMD registers (2x the number on AVX2)

This initial patch only contains a trivial transofrmation of the Haswell SGEMM kernel
to AVX512VL; more will follow later but this patch aims to get the infrastructure
in place for this "later".

Full performance tuning has not been done yet; with more registers and wider SIMD
it's in theory possible to retune the kernels but even without that there's an
interesting enough performance increase (30-40% range) with just this change.
2018-06-03 07:58:52 +00:00
Andrew 281a2b952f warning cleanup (#1380)
* dead increments in driver/level2

* dead increments in kernel/generic

* part dead increments in kernel/x86_64
2017-12-05 19:54:10 +01:00
Martin Kroeker 5f402b7759
Remove unused (loop?) variable j from the gemv_n_4 implementations 2017-11-14 23:29:42 +01:00
Denis Steckelmacher c9ff735da6 Add ZEN support (tested for auto-detected static backend) 2017-03-19 15:32:50 +01:00
Werner Saar 298b13bba4 updated some kernel files for EXCAVATOR 2016-04-25 10:36:23 +02:00
Zhang Xianyi 6e7be06e07 Refs JuliaLang/julia#5728. Fix gemv performance bug on Haswell Mac OSX.
On Mac OS X, it should use .align 4 (equal to .align 16 on Linux).
I didn't get the performance benefit from .align. Thus, I deleted it.
2016-02-19 17:56:07 -05:00
Werner Saar 30f52d53df optimized dgemv_n kernel for haswell 2015-04-30 12:11:39 +02:00
Werner Saar ddf983d643 added optimizations for steamroller 2014-12-30 20:14:45 +08:00
Werner Saar 4319769b79 added target processor STEAMROLLER 2014-12-28 20:16:46 +08:00
Werner Saar bc5fff7085 changed inline assembler labels to short form 2014-12-07 12:38:54 +01:00
wernsaar 75207b1148 optimized sgemv_n for very small size of m 2014-09-07 18:23:48 +02:00
wernsaar 274828fa50 optimizations for very small sizes 2014-09-07 13:45:03 +02:00
wernsaar d143f84dd2 added optimized sgemv_n kernel for haswell 2014-09-06 12:08:48 +02:00
wernsaar a64fe9bcc9 added optimized sgemv_n kernel for sandybridge 2014-09-06 08:41:53 +02:00
wernsaar 53de943690 bugfix for sgemv_n_4.c 2014-09-04 18:55:52 +02:00
wernsaar 7f910010a0 optimized sgemv_n kernel for small sizes 2014-09-04 13:09:27 +02:00
wernsaar 3a5d8dbff9 optimized sgemv_n_4.c 2014-09-03 15:34:30 +02:00
wernsaar 2a60c6d4b0 optimized sgemv_n for small sizes 2014-09-03 14:48:45 +02:00