wernsaar
|
6093ee5363
|
bugfix in zgemv_n_microk_haswell-2.c
|
2014-08-12 10:02:25 +02:00 |
wernsaar
|
07c66b1960
|
modified algorithm for better numerical stability
|
2014-08-12 08:35:42 +02:00 |
wernsaar
|
58b075daef
|
added optimized zgemv_t kernel for haswell
|
2014-08-11 16:57:52 +02:00 |
wernsaar
|
09fcd3a341
|
add optimized zgemv_t kernel for bulldozer
|
2014-08-11 14:19:25 +02:00 |
wernsaar
|
726ad085cb
|
added optimized zgemv_t for haswell
|
2014-08-11 13:10:12 +02:00 |
wernsaar
|
6fe416976d
|
added optimized zgemv_t c-kernel
|
2014-08-11 09:13:18 +02:00 |
wernsaar
|
dbc2eff029
|
disabled optimized haswell zgemv_n kernel for windows ( bad rounding )
|
2014-08-10 11:57:24 +02:00 |
wernsaar
|
462b4885ff
|
added optimized zgemv_n kernel for haswell
|
2014-08-10 08:39:17 +02:00 |
wernsaar
|
aa54fe064c
|
added zgemv_n c-function
|
2014-08-07 22:30:20 +02:00 |
wernsaar
|
006ef3ea01
|
added optimized dgemv_t kernel for haswell
|
2014-08-07 10:08:54 +02:00 |
wernsaar
|
60f17628cc
|
added optimized dgemv_n kernel for haswell
|
2014-08-07 09:18:02 +02:00 |
wernsaar
|
c9bad1403a
|
added optimized sgemv_t kernel for sandybridge
|
2014-08-07 07:49:33 +02:00 |
wernsaar
|
2f8927376f
|
enabled optimized nehalem sgemv_t kernel for windows
|
2014-08-06 16:58:21 +02:00 |
wernsaar
|
d945a2b06d
|
added optimized sgemv_t kernel for nehalem
|
2014-08-06 16:21:48 +02:00 |
wernsaar
|
ca6c8d06ce
|
enabled optimized sgemv kernels for windows
|
2014-08-06 14:24:36 +02:00 |
wernsaar
|
7aa43c8928
|
enabled optimized sgemv kernels for windows
|
2014-08-06 14:06:30 +02:00 |
wernsaar
|
891b960854
|
added optimized sgemv_t kernel for haswell
|
2014-08-06 13:42:41 +02:00 |
wernsaar
|
95a8caa2f3
|
added optimized sgemv_t kernel
|
2014-08-06 12:12:17 +02:00 |
wernsaar
|
8c05b8105b
|
bugfix in sgemv_n.c
|
2014-08-05 20:14:29 +02:00 |
wernsaar
|
c80084a98f
|
changed default x86_64 sgemv_n kernel to sgemv_n.c
|
2014-08-05 19:42:56 +02:00 |
wernsaar
|
2bab92961f
|
enabled optimized sgemv_n kernels for windows
|
2014-08-05 14:52:54 +02:00 |
wernsaar
|
9175b8bd5f
|
changed long to blaslong for windows compatibility
|
2014-08-05 13:28:39 +02:00 |
wernsaar
|
793f2d43b0
|
added optimized sgemv_n kernel for nehalem
|
2014-08-05 10:50:08 +02:00 |
wernsaar
|
a4dde45f87
|
optimized sgemv_n kernel for sandybridge
|
2014-08-05 08:53:09 +02:00 |
wernsaar
|
7fa7ea3e1e
|
updated haswell optimized sgemv_n kernel
|
2014-08-05 08:04:47 +02:00 |
wernsaar
|
3fbc13eb65
|
modified sgemv_n for haswell
|
2014-08-04 16:22:11 +02:00 |
wernsaar
|
db6917303f
|
added a better optimized sgemv_n kernel for bulldozer and piledriver
|
2014-08-04 14:29:01 +02:00 |
wernsaar
|
793175be3a
|
added experimental support for big numa machines
|
2014-08-02 13:40:16 +02:00 |
wernsaar
|
271af406f3
|
bugfix for linux affinity code
|
2014-08-01 23:10:08 +02:00 |
wernsaar
|
f5f50b3563
|
added benchmarks for lapack potrf, potrs and potri functions
|
2014-08-01 21:08:37 +02:00 |
wernsaar
|
651dd22d7d
|
added benchmark program for lapack ?getri functions
|
2014-08-01 08:55:20 +02:00 |
wernsaar
|
7c611a2f95
|
bugfix for zgeev
|
2014-07-31 12:35:38 +02:00 |
wernsaar
|
296564e369
|
added lapack geev benchmark
|
2014-07-31 10:35:25 +02:00 |
wernsaar
|
799a0eabbd
|
bugfix in cholesky.c
|
2014-07-30 14:00:19 +02:00 |
wernsaar
|
ca63503e61
|
extended plot-filter.sh for linpack and cholesky benchmarks
|
2014-07-30 13:03:42 +02:00 |
wernsaar
|
5087096711
|
optimization of sandybridge cgemm-kernel
|
2014-07-29 19:07:21 +02:00 |
Zhang Xianyi
|
21f7768b26
|
Merge pull request #421 from wernsaar/develop
optimized sgemm- and cgemm-kernel for haswell
|
2014-07-29 15:50:00 +08:00 |
wernsaar
|
46bc4fd50c
|
optimized cgemm kernel for haswell
|
2014-07-29 08:53:09 +02:00 |
wernsaar
|
1cc02b4337
|
optimized sgemm kernel for haswell
|
2014-07-28 11:50:01 +02:00 |
Zhang Xianyi
|
6e223db7fc
|
Merge pull request #420 from wernsaar/develop
Optimizations for HASWELL
|
2014-07-27 23:30:14 +08:00 |
wernsaar
|
1d33547222
|
optimized zgemm kernel for haswell
|
2014-07-27 11:51:42 +02:00 |
wernsaar
|
3ea4dadd30
|
optimizations for trsm
|
2014-07-25 11:59:17 +02:00 |
wernsaar
|
1b10ff129a
|
optimizations for trmm
|
2014-07-25 10:00:23 +02:00 |
wernsaar
|
125610d23b
|
allow to set custom value for ?GEMM_DEFAULT_UNROLL_MN, optimizations for syrk
|
2014-07-24 18:43:31 +02:00 |
wernsaar
|
e213a42cde
|
added a sample plot-filter script and a header file for gnuplot
|
2014-07-21 14:50:24 +02:00 |
wernsaar
|
e4663be46a
|
added symv benchmark
|
2014-07-21 07:50:54 +02:00 |
wernsaar
|
11637b6926
|
add benchmark for ger
|
2014-07-21 06:25:42 +02:00 |
Zhang Xianyi
|
80bf3e6a35
|
Merge pull request #419 from wernsaar/develop
added optimized sgemv kernels for Sandy Bridge, Haswell, Bulldozer, and Piledriver.
|
2014-07-20 23:35:17 +08:00 |
wernsaar
|
6acbafe45b
|
added sgemv_n microkernel for haswell
|
2014-07-20 14:52:25 +02:00 |
wernsaar
|
5392d11b04
|
optimized sgemv_n_microk_sandy.c
|
2014-07-20 14:08:04 +02:00 |