Commit Graph

14 Commits

Author SHA1 Message Date
Martin Kroeker
5e3e91d0fc Split the microkernel workload into chunks of 32 floats for dsdot mode to limit loss of precision 2017-10-22 18:18:51 +02:00
Martin Kroeker
8ac87c1cb6 Implement DSDOT with unchanged sdot microkernels 2017-10-16 23:27:51 +02:00
Denis Steckelmacher
c9ff735da6 Add ZEN support (tested for auto-detected static backend) 2017-03-19 15:32:50 +01:00
Werner Saar
298b13bba4 updated some kernel files for EXCAVATOR 2016-04-25 10:36:23 +02:00
Zhang Xianyi
d4380c1fe4 Refs xianyi/OpenBLAS-CI#10 , Fix sdot for scipy test_iterative.test_convergence test failure on AMD bulldozer and piledriver. 2016-04-07 01:44:18 +08:00
Werner Saar
c22068c406 optimized sdot.c for increments != 1 2015-04-24 13:13:20 +02:00
Werner Saar
d7a17ad85d optimized sdot-kernel for pilediver 2015-04-13 13:19:21 +02:00
Werner Saar
166d76e864 added optimized sdot-kernel for steamroller 2015-04-11 08:48:18 +02:00
Werner Saar
3937e2a0a0 add optimized sdot-kernel for sandybridge 2015-04-05 19:47:05 +02:00
Werner Saar
701b9d7556 added optimized sdot- and ddot-kernel for HASWELL 2015-04-05 17:57:53 +02:00
Werner Saar
4319769b79 added target processor STEAMROLLER 2014-12-28 20:16:46 +08:00
wernsaar
ee74445155 added optimized caxpy kernel for bulldozer 2014-08-25 14:53:28 +02:00
wernsaar
5d97b0754c added optimized sdot kernel for nehalem 2014-08-22 17:00:26 +02:00
wernsaar
8a9e868919 added optimized sdot for bulldozer 2014-08-22 14:29:17 +02:00