33 Commits

Author SHA1 Message Date
Martin Kroeker
cae5d9a20b Add trivially optimized dsdot based on sdot 2017-11-24 20:04:29 +01:00
Isuru Fernando
2c51a990ac Fix extra whitespaces. CMake parser macro fails with it
TODO: Fix the parser macro to strip trailing whitespaces
2017-08-02 18:26:57 +05:30
Werner Saar
c99cc41cbd Added optimized zgemv_n kernel for bulldozer, piledriver and steamroller 2016-03-09 14:02:03 +01:00
Werner Saar
c8f2c5d636 added optimized trsm_kernels 2016-01-05 13:05:05 +01:00
Werner Saar
95b1faf667 added optimized cscal and zscal kernels for steamroller and piledriver 2015-05-18 10:50:57 +02:00
Werner Saar
e00cccc41e added optimized dscal kernel for piledriver 2015-05-13 13:05:35 +02:00
Werner Saar
34ba66606a add optimized daxpy-kernel for piledriver 2015-04-14 14:23:29 +02:00
Werner Saar
331c417637 optimized saxpy for piledriver 2015-04-14 08:34:11 +02:00
Werner Saar
9299d8cfd6 added optimized cdot- and zdot-kernels for bulldozer 2015-04-08 16:29:55 +02:00
Werner Saar
60c6dec6e6 updated some lines for bulldozer 2015-04-06 18:47:16 +02:00
wernsaar
9908b6031c bugfix in KERNEL.PILEDRIVER 2014-09-13 16:26:53 +02:00
wernsaar
8a39cdb1c1 added optimized zgemv_t kernel for haswell 2014-09-13 09:47:07 +02:00
wernsaar
80f7786875 enabled optimized sgemv kernels for piledriver 2014-09-07 21:13:57 +02:00
wernsaar
ca6c8d06ce enabled optimized sgemv kernels for windows 2014-08-06 14:24:36 +02:00
wernsaar
95a8caa2f3 added optimized sgemv_t kernel 2014-08-06 12:12:17 +02:00
wernsaar
2bab92961f enabled optimized sgemv_n kernels for windows 2014-08-05 14:52:54 +02:00
wernsaar
db6917303f added a better optimized sgemv_n kernel for bulldozer and piledriver 2014-08-04 14:29:01 +02:00
wernsaar
2cce125c79 added optimized sgemv_t for bulldozer and piledriver 2014-07-19 15:48:07 +02:00
wernsaar
b3938fe371 don't use this sgemv_n on Windows 2014-07-19 07:15:34 +02:00
wernsaar
3c5732615d added blocked sgemv_n and microkernel for bulldozer and piledriver 2014-07-17 23:15:07 +02:00
wernsaar
880597b301 segment violation in sgemv kernels 2014-07-13 10:46:14 +02:00
wernsaar
13348b2137 removed reference to daxpy_bulldozer kernel (Windows bug in lapack-test) 2014-07-06 16:39:32 +02:00
Zhang Xianyi
99efbbbad5 Fixed #395. Enable optimized cgemm for Sandybridge. Added optimized sdot kernel.
Fixed c/zgemm, zgemv computational error of haswell, piledriver, bullldozer, and
barcelona on Windows.

Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop

Conflicts:
	kernel/Makefile.L1
	kernel/x86_64/KERNEL
	param.h
2014-06-29 10:34:51 +08:00
wernsaar
a15f22a1f6 bugfix for piledriver cgemm-, zgemm- and zgemv-kernel 2014-06-28 11:46:58 +02:00
Timothy Gu
6c2ead30f0 Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
wernsaar
a13bcc1716 enabled optimized sgemv kernel for barcelona and piledriver 2014-06-25 13:50:57 +02:00
wernsaar
5118a7f4d1 small optimizations on dgemm_kernel for Piledriver 2013-10-31 11:53:26 +01:00
wernsaar
e172b70ea2 added cgemm_kernel for Piledriver 2013-10-31 08:38:17 +01:00
wernsaar
1cf4b974b2 added zgemm_kernel for Piledriver 2013-10-30 09:12:17 +01:00
wernsaar
7bccff1512 added sgemm_kernel for PILEDRIVER 2013-10-29 22:53:04 +01:00
wernsaar
2840d56aeb added dgemm_kernel for Piledriver 2013-10-19 09:47:15 +02:00
Zhang Xianyi
6c4a7d0828 Import AMD Piledriver DGEMM kernel generated by AUGEM.
So far, this kernel doesn't deal with edge.

AUGEM: Automatically Generate High Performance Dense Linear Algebra
Kernels on x86 CPUs.
Qian Wang, Xianyi Zhang, Yunquan Zhang, and Qing Yi. In the
International Conference for High Performance Computing, Networking,
Storage and Analysis (SC'13). Denver, CO. Nov, 2013.
2013-08-25 10:16:01 -03:00
Zhang Xianyi
886cbaf4e4 Support AMD Piledriver by bulldozer kernels. 2013-07-06 12:06:43 -03:00