Commit Graph

20 Commits

Author SHA1 Message Date
wernsaar
ca6c8d06ce enabled optimized sgemv kernels for windows 2014-08-06 14:24:36 +02:00
wernsaar
95a8caa2f3 added optimized sgemv_t kernel 2014-08-06 12:12:17 +02:00
wernsaar
2bab92961f enabled optimized sgemv_n kernels for windows 2014-08-05 14:52:54 +02:00
wernsaar
db6917303f added a better optimized sgemv_n kernel for bulldozer and piledriver 2014-08-04 14:29:01 +02:00
wernsaar
2cce125c79 added optimized sgemv_t for bulldozer and piledriver 2014-07-19 15:48:07 +02:00
wernsaar
b3938fe371 don't use this sgemv_n on Windows 2014-07-19 07:15:34 +02:00
wernsaar
3c5732615d added blocked sgemv_n and microkernel for bulldozer and piledriver 2014-07-17 23:15:07 +02:00
wernsaar
880597b301 segment violation in sgemv kernels 2014-07-13 10:46:14 +02:00
wernsaar
13348b2137 removed reference to daxpy_bulldozer kernel (Windows bug in lapack-test) 2014-07-06 16:39:32 +02:00
Zhang Xianyi
99efbbbad5 Fixed #395. Enable optimized cgemm for Sandybridge. Added optimized sdot kernel.
Fixed c/zgemm, zgemv computational error of haswell, piledriver, bulldozer, and
barcelona on Windows.

Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop

Conflicts:
	kernel/Makefile.L1
	kernel/x86_64/KERNEL
	param.h
2014-06-29 10:34:51 +08:00
wernsaar
a15f22a1f6 bugfix for piledriver cgemm-, zgemm- and zgemv-kernel 2014-06-28 11:46:58 +02:00
Timothy Gu
6c2ead30f0 Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
wernsaar
a13bcc1716 enabled optimized sgemv kernel for barcelona and piledriver 2014-06-25 13:50:57 +02:00
wernsaar
5118a7f4d1 small optimizations on dgemm_kernel for Piledriver 2013-10-31 11:53:26 +01:00
wernsaar
e172b70ea2 added cgemm_kernel for Piledriver 2013-10-31 08:38:17 +01:00
wernsaar
1cf4b974b2 added zgemm_kernel for Piledriver 2013-10-30 09:12:17 +01:00
wernsaar
7bccff1512 added sgemm_kernel for PILEDRIVER 2013-10-29 22:53:04 +01:00
wernsaar
2840d56aeb added dgemm_kernel for Piledriver 2013-10-19 09:47:15 +02:00
Zhang Xianyi
6c4a7d0828 Import AMD Piledriver DGEMM kernel generated by AUGEM.
So far, this kernel doesn't handle edge cases.

AUGEM: Automatically Generate High Performance Dense Linear Algebra
Kernels on x86 CPUs.
Qian Wang, Xianyi Zhang, Yunquan Zhang, and Qing Yi. In the
International Conference for High Performance Computing, Networking,
Storage and Analysis (SC'13). Denver, CO. Nov, 2013.
2013-08-25 10:16:01 -03:00
Zhang Xianyi
886cbaf4e4 Support AMD Piledriver by bulldozer kernels. 2013-07-06 12:06:43 -03:00