12 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Wangyang Guo | 225683218c | Small Matrix: use proper inline asm input constraint for AVX512 mask | 2022-02-28 03:22:31 +00:00 |
| Mosè Giordano | abbc947edb | Fix compilation of Skylake AVX512 kernels with GCC 6 | 2022-02-23 22:51:59 +00:00 |
| Martin Kroeker | 73ffabe6ba | Guard uses of _mm512_reduce_add_p? | 2022-02-23 20:06:14 +01:00 |
| Wangyang Guo | ca7682e3a3 | Small Matrix: skylakex: sgemm nn: fix n6 conflicts with n4 | 2021-08-02 07:06:54 +00:00 |
| Wangyang Guo | 9967e61abb | Small Matrix: skylakex: sgemm nn: fix error when beta not zero | 2021-08-02 07:06:54 +00:00 |
| Wangyang Guo | a87736346f | Small Matrix: skylakex: sgemm nn: add n6 to improve performance | 2021-08-02 07:06:54 +00:00 |
| Wangyang Guo | 4c9d9940fd | Small Matrix: skylakex: sgemm nn: reduce store 4 N at a time | 2021-08-02 07:06:54 +00:00 |
| Wangyang Guo | 13b32f69b7 | Small Matrix: skylakex: sgemm nn: reduce store 4 M at a time | 2021-08-02 07:06:54 +00:00 |
| Wangyang Guo | 3d8c6d9607 | Small Matrix: skylakex: sgemm nn: clean up unused code | 2021-08-02 07:06:54 +00:00 |
| Wangyang Guo | 49b61a3f30 | Small Matrix: skylakex: sgemm_nn: optimize for M <= 8 | 2021-08-02 07:06:54 +00:00 |
| Wangyang Guo | f88470323b | Optimize M < 16 using AVX512 mask | 2021-08-02 07:06:54 +00:00 |
| Wangyang Guo | 9186456a12 | small matrix: SkylakeX: add SGEMM NN kernel | 2021-08-02 07:06:54 +00:00 |