This patch adds the basic infrastructure for the SkylakeX (Intel Skylake server) target.

The SkylakeX target will use the AVX512 (AVX512VL level) instruction set, which brings 2 basic things:
1) 512 bit wide SIMD (2x width of AVX2)
2) 32 SIMD registers (2x the number on AVX2)

This initial patch only contains a trivial transformation of the Haswell SGEMM kernel to AVX512VL. More will follow later; the aim of this patch is to get the infrastructure in place for that later work.

Full performance tuning has not been done yet. With more registers and wider SIMD it is in theory possible to retune the kernels, but even without that, this change alone gives an interesting performance increase (30-40% range).
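To make the two points above concrete, here is a small arithmetic sketch of how register count and register width combine. The AVX2 figures (16 registers, 256 bits) are derived from the "2x" statements in this message, and `regfile_bytes` is a hypothetical helper written only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: register-file capacity in bytes,
# i.e. number of registers * register width in bits / 8.
regfile_bytes() {
    echo $(( $1 * $2 / 8 ))
}

avx2=$(regfile_bytes 16 256)     # AVX2: 16 registers, 256 bits each
avx512=$(regfile_bytes 32 512)   # AVX512: 32 registers, 512 bits each

echo "AVX2 register file:   $avx2 bytes"
echo "AVX512 register file: $avx512 bytes"
# A single-precision float is 32 bits wide.
echo "single-precision lanes per register: $((256 / 32)) vs $((512 / 32))"
```

Twice the registers at twice the width gives four times the register space, which is why retuning the kernels may pay off beyond this first straight conversion.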
Force Target Examples:

make TARGET=NEHALEM
make TARGET=LOONGSON3A BINARY=64
make TARGET=ISTANBUL

Supported List:
1.X86/X86_64
a)Intel CPU:
P2
KATMAI
COPPERMINE
NORTHWOOD
PRESCOTT
BANIAS
YONAH
CORE2
PENRYN
DUNNINGTON
NEHALEM
SANDYBRIDGE
HASWELL
SKYLAKEX
ATOM

b)AMD CPU:
ATHLON
OPTERON
OPTERON_SSE3
BARCELONA
SHANGHAI
ISTANBUL
BOBCAT
BULLDOZER
PILEDRIVER
STEAMROLLER
EXCAVATOR
ZEN

c)VIA CPU:
SSE_GENERIC
VIAC3
NANO

2.Power CPU:
POWER4
POWER5
POWER6
POWER7
POWER8
PPCG4
PPC970
PPC970MP
PPC440
PPC440FP2
CELL

3.MIPS CPU:
P5600
1004K

4.MIPS64 CPU:
SICORTEX
LOONGSON3A
LOONGSON3B
I6400
P6600
I6500

5.IA64 CPU:
ITANIUM2

6.SPARC CPU:
SPARC
SPARCV7

7.ARM CPU:
CORTEXA15
CORTEXA9
ARMV7
ARMV6
ARMV5

8.ARM 64-bit CPU:
ARMV8
CORTEXA57
VULCAN
THUNDERX
THUNDERX2T99

9.System Z:
ZARCH_GENERIC
Z13
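
The force-target syntax shown at the top of this list applies to the new SKYLAKEX entry in the same way. The following is a minimal sketch, assuming a Linux host where `/proc/cpuinfo` exposes the CPU feature flags. `pick_target` is a hypothetical helper, not part of the build system, and mapping `avx2` to HASWELL is an assumption made for illustration. The script only prints the `make` command and does not run it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the build system): choose a TARGET
# value from a space-separated list of CPU feature flags.
pick_target() {
    case " $1 " in
        *" avx512vl "*) echo SKYLAKEX ;;  # AVX512VL present: the new target
        *" avx2 "*)     echo HASWELL ;;   # AVX2 only (assumed mapping)
        *)              echo "" ;;        # no choice made by this sketch
    esac
}

# On Linux, the first "flags" line of /proc/cpuinfo lists the CPU features.
# If the file is missing, flags stays empty and no target is picked.
flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2)
target=$(pick_target "$flags")

# Print the command instead of running it, so the sketch is safe to try.
if [ -n "$target" ]; then
    echo "make TARGET=$target"
else
    echo "no matching target found; pick one from the Supported List"
fi
```

Forcing `TARGET=SKYLAKEX` by hand needs no detection step; the helper only shows how the AVX512VL requirement named in the patch description relates to the choice of target.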