Commit Graph

61 Commits

Author SHA1 Message Date
wernsaar
8eaa04acbb added zgemm_kernel_2x2_bulldozer.S 2013-06-11 12:00:49 +02:00
wernsaar
d65bbec99b added new sgemm kernel for BULLDOZER 2013-06-09 15:57:42 +02:00
wernsaar
25491e42f9 New dgemm kernel for BULLDOZER: dgemm_kernel_8x2_bulldozer.S 2013-06-08 09:40:17 +02:00
wernsaar
69aa6c8fb1 bad performance with some data 2013-04-28 11:14:23 +02:00
wernsaar
60b263f3d2 removed trsm_kernel_RT_4x4_bulldozer.S. wrong results 2013-04-27 17:23:08 +02:00
wernsaar
7ac306e0da added trsm_kernel_RT_4x4_bulldozer.S 2013-04-27 16:48:48 +02:00
wernsaar
4cb454cdf2 added trsm_kernel_LT_4x4_bulldozer.S 2013-04-27 14:30:00 +02:00
wernsaar
7641f6e253 Merged some improvements into dgemm_kernel_4x4_bulldozer.S.
Changed the copy functions to generic to solve prefetch conflicts
2013-04-16 19:05:06 +02:00
Zhang Xianyi
0133580148 Used sgemm bulldozer kernel on 64 bit. 2013-02-09 16:29:14 +01:00
Zhang Xianyi
f19af5ecc0 Refs #54. Added AMD Bulldozer x86_64 dgemm kernel developed by Werner Saar <wernsaar at googlemail.com>
Based on the dgemm kernel for AMD Barcelona, he used AVX and FMA4 instructions.
Thank Werner Saar!
2012-12-07 01:05:11 +08:00
Zhang Xianyi
b7c0fa6bd2 Init AMD Bulldozer codebase. 2012-12-06 07:29:54 -05:00