wernsaar
aa54fe064c
added zgemv_n c-function
2014-08-07 22:30:20 +02:00
wernsaar
006ef3ea01
added optimized dgemv_t kernel for haswell
2014-08-07 10:08:54 +02:00
wernsaar
60f17628cc
added optimized dgemv_n kernel for haswell
2014-08-07 09:18:02 +02:00
wernsaar
c9bad1403a
added optimized sgemv_t kernel for sandybridge
2014-08-07 07:49:33 +02:00
wernsaar
2f8927376f
enabled optimized nehalem sgemv_t kernel for windows
2014-08-06 16:58:21 +02:00
wernsaar
d945a2b06d
added optimized sgemv_t kernel for nehalem
2014-08-06 16:21:48 +02:00
wernsaar
ca6c8d06ce
enabled optimized sgemv kernels for windows
2014-08-06 14:24:36 +02:00
wernsaar
7aa43c8928
enabled optimized sgemv kernels for windows
2014-08-06 14:06:30 +02:00
wernsaar
891b960854
added optimized sgemv_t kernel for haswell
2014-08-06 13:42:41 +02:00
wernsaar
95a8caa2f3
added optimized sgemv_t kernel
2014-08-06 12:12:17 +02:00
Zhang Xianyi
5c0d0ecbde
Merge pull request #430 from wernsaar/develop
...
added a better optimized sgemv_n kernel
2014-08-06 02:52:30 +08:00
wernsaar
8c05b8105b
bugfix in sgemv_n.c
2014-08-05 20:14:29 +02:00
wernsaar
c80084a98f
changed default x86_64 sgemv_n kernel to sgemv_n.c
2014-08-05 19:42:56 +02:00
wernsaar
2bab92961f
enabled optimized sgemv_n kernels for windows
2014-08-05 14:52:54 +02:00
wernsaar
9175b8bd5f
changed long to blaslong for windows compatibility
2014-08-05 13:28:39 +02:00
wernsaar
793f2d43b0
added optimized sgemv_n kernel for nehalem
2014-08-05 10:50:08 +02:00
wernsaar
a4dde45f87
optimized sgemv_n kernel for sandybridge
2014-08-05 08:53:09 +02:00
wernsaar
7fa7ea3e1e
updated haswell optimized sgemv_n kernel
2014-08-05 08:04:47 +02:00
wernsaar
3fbc13eb65
modified sgemv_n for haswell
2014-08-04 16:22:11 +02:00
wernsaar
db6917303f
added a better optimized sgemv_n kernel for bulldozer and piledriver
2014-08-04 14:29:01 +02:00
Zhang Xianyi
c2fdeb6c22
Merge pull request #429 from idunham/numprocs
...
Fix link error on Linux/musl.
2014-08-04 08:12:23 +08:00
Isaac Dunham
f7eb81a846
Fix link error on Linux/musl.
...
get_nprocs() is a GNU convenience function equivalent to POSIX2008
sysconf(_SC_NPROCESSORS_ONLN); the latter should be available in unistd.h
on any current *nix. (OS X supports this call since 10.5, and FreeBSD
currently supports it. But this commit does not change FreeBSD or OS X
versions.)
2014-08-03 15:06:30 -07:00
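The substitution described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper name `online_cpu_count` and the fallback to 1 are assumptions made for the example. It relies only on `sysconf(_SC_NPROCESSORS_ONLN)` from `unistd.h`, which the commit message names as the replacement for glibc's `get_nprocs()`.

```c
#include <unistd.h>

/* Hypothetical helper (not an OpenBLAS function): report the number of
 * processors currently online without using the GNU-only get_nprocs(). */
static int online_cpu_count(void)
{
    /* sysconf() returns -1 if the value cannot be determined. */
    long n = sysconf(_SC_NPROCESSORS_ONLN);

    /* Assumption for this sketch: fall back to a single processor
     * rather than propagate an error to the caller. */
    return (n < 1) ? 1 : (int)n;
}
```

A caller would use `online_cpu_count()` wherever `get_nprocs()` was used before. The result is always at least 1, so it can size a thread pool directly.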
Zhang Xianyi
edc329883c
Merge pull request #427 from wernsaar/develop
...
added experimental support for big numa machines
2014-08-03 00:57:44 +08:00
wernsaar
793175be3a
added experimental support for big numa machines
2014-08-02 13:40:16 +02:00
Zhang Xianyi
83c4ba8d32
Merge pull request #426 from wernsaar/develop
...
added benchmark program for lapack ?getri functions
2014-08-02 15:34:41 +08:00
wernsaar
271af406f3
bugfix for linux affinity code
2014-08-01 23:10:08 +02:00
wernsaar
f5f50b3563
added benchmarks for lapack potrf, potrs and potri functions
2014-08-01 21:08:37 +02:00
wernsaar
651dd22d7d
added benchmark program for lapack ?getri functions
2014-08-01 08:55:20 +02:00
Zhang Xianyi
f329f77bd0
Merge pull request #425 from wernsaar/develop
...
added benchmark for lapack ?geev routines
2014-08-01 08:04:16 +08:00
wernsaar
7c611a2f95
bugfix for zgeev
2014-07-31 12:35:38 +02:00
wernsaar
296564e369
added lapack geev benchmark
2014-07-31 10:35:25 +02:00
Zhang Xianyi
27af6e35d3
Merge pull request #424 from ihnorton/fix_arm_cpuid
...
cpuid_arm: fix detection when cpuinfo uses "Processor"
2014-07-31 13:54:07 +08:00
Isaiah Norton
a183ad1df4
cpuid_arm: fix detection when cpuinfo uses "Processor"
...
instead of "model name"
2014-07-31 05:13:31 +00:00
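The detection issue named in the commit above can be illustrated with a small parser that accepts either key. This is a sketch under stated assumptions, not the actual `cpuid_arm` code: the function name `cpu_model_from_cpuinfo` is hypothetical, and it works on cpuinfo-style text passed in as a string so that it can be tried without reading `/proc/cpuinfo`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the OpenBLAS implementation): find the first
 * line whose key is "model name" or "Processor" and copy its value into
 * out. Returns 1 on success, 0 if neither key is present. The comparison
 * is case-sensitive, so the per-core "processor : 0" lines do not match. */
static int cpu_model_from_cpuinfo(const char *text, char *out, size_t outlen)
{
    static const char *keys[] = { "model name", "Processor" };
    const char *line = text;

    if (outlen == 0)
        return 0;

    while (*line) {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);
        const char *end = line + len;

        for (size_t k = 0; k < sizeof keys / sizeof keys[0]; k++) {
            size_t klen = strlen(keys[k]);
            if (len <= klen || strncmp(line, keys[k], klen) != 0)
                continue;

            /* Only whitespace may sit between the key and the colon. */
            const char *p = line + klen;
            while (p < end && (*p == ' ' || *p == '\t'))
                p++;
            if (p == end || *p != ':')
                continue;

            /* Skip the colon and leading whitespace of the value. */
            p++;
            while (p < end && (*p == ' ' || *p == '\t'))
                p++;

            size_t vlen = (size_t)(end - p);
            if (vlen >= outlen)
                vlen = outlen - 1;   /* truncate to fit the buffer */
            memcpy(out, p, vlen);
            out[vlen] = '\0';
            return 1;
        }

        if (!eol)
            break;
        line = eol + 1;
    }
    return 0;
}
```

Checking both keys in one pass means the same code path handles kernels that report the CPU description under "model name" and those that report it under "Processor".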
wernsaar
799a0eabbd
bugfix in cholesky.c
2014-07-30 14:00:19 +02:00
wernsaar
ca63503e61
extended plot-filter.sh for linpack and cholesky benchmarks
2014-07-30 13:03:42 +02:00
Zhang Xianyi
4f83217df6
Merge pull request #422 from wernsaar/develop
...
optimization of sandybridge cgemm-kernel
2014-07-30 17:09:58 +08:00
wernsaar
5087096711
optimization of sandybridge cgemm-kernel
2014-07-29 19:07:21 +02:00
Zhang Xianyi
21f7768b26
Merge pull request #421 from wernsaar/develop
...
optimized sgemm- and cgemm-kernel for haswell
2014-07-29 15:50:00 +08:00
wernsaar
46bc4fd50c
optimized cgemm kernel for haswell
2014-07-29 08:53:09 +02:00
wernsaar
1cc02b4337
optimized sgemm kernel for haswell
2014-07-28 11:50:01 +02:00
Zhang Xianyi
6e223db7fc
Merge pull request #420 from wernsaar/develop
...
Optimizations for HASWELL
2014-07-27 23:30:14 +08:00
wernsaar
1d33547222
optimized zgemm kernel for haswell
2014-07-27 11:51:42 +02:00
wernsaar
3ea4dadd30
optimizations for trsm
2014-07-25 11:59:17 +02:00
wernsaar
1b10ff129a
optimizations for trmm
2014-07-25 10:00:23 +02:00
wernsaar
125610d23b
allow to set custom value for ?GEMM_DEFAULT_UNROLL_MN, optimizations for syrk
2014-07-24 18:43:31 +02:00
wernsaar
e213a42cde
added a sample plot-filter scripts and a header file for gnuplot
2014-07-21 14:50:24 +02:00
wernsaar
e4663be46a
added symv benchmark
2014-07-21 07:50:54 +02:00
wernsaar
11637b6926
add benchmark for ger
2014-07-21 06:25:42 +02:00
Zhang Xianyi
80bf3e6a35
Merge pull request #419 from wernsaar/develop
...
added optimized sgemv kernels for Sandy Bridge, Haswell, Bulldozer, and Piledriver.
2014-07-20 23:35:17 +08:00
wernsaar
6acbafe45b
added sgemv_n microkernel for haswell
2014-07-20 14:52:25 +02:00