Commit Graph

1641 Commits

Author SHA1 Message Date
wernsaar 8c05b8105b bugfix in sgemv_n.c 2014-08-05 20:14:29 +02:00
wernsaar c80084a98f changed default x86_64 sgemv_n kernel to sgemv_n.c 2014-08-05 19:42:56 +02:00
wernsaar 2bab92961f enabled optimized sgemv_n kernels for windows 2014-08-05 14:52:54 +02:00
wernsaar 9175b8bd5f changed long to blaslong for windows compatibility 2014-08-05 13:28:39 +02:00
wernsaar 793f2d43b0 added optimized sgemv_n kernel for nehalem 2014-08-05 10:50:08 +02:00
wernsaar a4dde45f87 optimized sgemv_n kernel for sandybridge 2014-08-05 08:53:09 +02:00
wernsaar 7fa7ea3e1e updated haswell optimized sgemv_n kernel 2014-08-05 08:04:47 +02:00
wernsaar 3fbc13eb65 modified sgemv_n for haswell 2014-08-04 16:22:11 +02:00
wernsaar db6917303f added a better optimized sgemv_n kernel for bulldozer and piledriver 2014-08-04 14:29:01 +02:00
Zhang Xianyi c2fdeb6c22 Merge pull request #429 from idunham/numprocs
Fix link error on Linux/musl.
2014-08-04 08:12:23 +08:00
Isaac Dunham f7eb81a846 Fix link error on Linux/musl.
get_nprocs() is a GNU convenience function equivalent to POSIX2008
sysconf(_SC_NPROCESSORS_ONLN); the latter should be available in unistd.h
on any current *nix. (OS X supports this call since 10.5, and FreeBSD
currently supports it. But this commit does not change FreeBSD or OS X
versions.)
2014-08-03 15:06:30 -07:00
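The substitution described in the commit message above can be sketched as follows. This is a minimal illustration rather than the actual patch: the helper name `online_cpu_count` is hypothetical, and the fallback to 1 on failure is an assumption made here for safety, not something the commit states.

```c
#include <unistd.h>

/* Hypothetical helper (the name is not taken from the commit).
 * Queries the number of processors currently online via sysconf()
 * instead of the GNU-specific get_nprocs(), so it also links against
 * C libraries such as musl that do not provide get_nprocs(). */
int online_cpu_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);

    /* sysconf() returns -1 on error; assume a single processor then. */
    return (n < 1) ? 1 : (int)n;
}
```

A caller would use `online_cpu_count()` wherever `get_nprocs()` was called before; the returned value is always at least 1.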
Zhang Xianyi edc329883c Merge pull request #427 from wernsaar/develop
added experimental support for big numa machines
2014-08-03 00:57:44 +08:00
wernsaar 793175be3a added experimental support for big numa machines 2014-08-02 13:40:16 +02:00
Zhang Xianyi 83c4ba8d32 Merge pull request #426 from wernsaar/develop
added benchmark program for lapack ?getri functions
2014-08-02 15:34:41 +08:00
wernsaar 271af406f3 bugfix for linux affinity code 2014-08-01 23:10:08 +02:00
wernsaar f5f50b3563 added benchmarks for lapack potrf, potrs and potri functions 2014-08-01 21:08:37 +02:00
wernsaar 651dd22d7d added benchmark program for lapack ?getri functions 2014-08-01 08:55:20 +02:00
Zhang Xianyi f329f77bd0 Merge pull request #425 from wernsaar/develop
added benchmark for lapack ?geev routines
2014-08-01 08:04:16 +08:00
wernsaar 7c611a2f95 bugfix for zgeev 2014-07-31 12:35:38 +02:00
wernsaar 296564e369 added lapack geev benchmark 2014-07-31 10:35:25 +02:00
Zhang Xianyi 27af6e35d3 Merge pull request #424 from ihnorton/fix_arm_cpuid
cpuid_arm: fix detection when cpuinfo uses "Processor"
2014-07-31 13:54:07 +08:00
Isaiah Norton a183ad1df4 cpuid_arm: fix detection when cpuinfo uses "Processor"
instead of "model name"
2014-07-31 05:13:31 +00:00
wernsaar 799a0eabbd bugfix in cholesky.c 2014-07-30 14:00:19 +02:00
wernsaar ca63503e61 extended plot-filter.sh for linpack and cholesky benchmarks 2014-07-30 13:03:42 +02:00
Zhang Xianyi 4f83217df6 Merge pull request #422 from wernsaar/develop
optimization of sandybridge cgemm-kernel
2014-07-30 17:09:58 +08:00
wernsaar 5087096711 optimization of sandybridge cgemm-kernel 2014-07-29 19:07:21 +02:00
Zhang Xianyi 21f7768b26 Merge pull request #421 from wernsaar/develop
optimized sgemm- and cgemm-kernel for haswell
2014-07-29 15:50:00 +08:00
wernsaar 46bc4fd50c optimized cgemm kernel for haswell 2014-07-29 08:53:09 +02:00
wernsaar 1cc02b4337 optimized sgemm kernel for haswell 2014-07-28 11:50:01 +02:00
Zhang Xianyi 6e223db7fc Merge pull request #420 from wernsaar/develop
Optimizations for HASWELL
2014-07-27 23:30:14 +08:00
wernsaar 1d33547222 optimized zgemm kernel for haswell 2014-07-27 11:51:42 +02:00
wernsaar 3ea4dadd30 optimizations for trsm 2014-07-25 11:59:17 +02:00
wernsaar 1b10ff129a optimizations for trmm 2014-07-25 10:00:23 +02:00
wernsaar 125610d23b allow to set custom value for ?GEMM_DEFAULT_UNROLL_MN, optimizations for syrk 2014-07-24 18:43:31 +02:00
wernsaar e213a42cde added a sample plot-filter scripts and a header file for gnuplot 2014-07-21 14:50:24 +02:00
wernsaar e4663be46a added symv benchmark 2014-07-21 07:50:54 +02:00
wernsaar 11637b6926 add benchmark for ger 2014-07-21 06:25:42 +02:00
Zhang Xianyi 80bf3e6a35 Merge pull request #419 from wernsaar/develop
added optimized sgemv kernels for Sandy Bridge, Haswell, Bulldozer, and Piledriver.
2014-07-20 23:35:17 +08:00
wernsaar 6acbafe45b added sgemv_n microkernel for haswell 2014-07-20 14:52:25 +02:00
wernsaar 5392d11b04 optimized sgemv_n_microk_sandy.c 2014-07-20 14:08:04 +02:00
wernsaar c0fe95fb72 added sgemv_n microkernel for sandybridge 2014-07-20 13:17:47 +02:00
wernsaar d9d4077c93 added sgemv_t microkernel for haswell 2014-07-20 11:30:32 +02:00
wernsaar 02eb72ac42 bugfix in sgemv_t_microk_sandy.c 2014-07-20 10:48:41 +02:00
wernsaar c06f9986d4 added sgemv_t microkernel for sandybridge 2014-07-20 10:21:08 +02:00
wernsaar 2cce125c79 added optimized sgemv_t for bulldozer and piledriver 2014-07-19 15:48:07 +02:00
wernsaar b3938fe371 don't use this sgemv_n on Windows 2014-07-19 07:15:34 +02:00
Zhang Xianyi e6668dd83b Merge pull request #414 from staticfloat/sf/symlinkfix
Don't create an absolute symlink when installing on Darwin
2014-07-18 23:13:18 +08:00
wernsaar c8a4a56177 performance optimizations for sgemv_n 2014-07-18 11:25:21 +02:00
wernsaar 3c5732615d added blocked sgemv_n and microkernel for bulldozer and piledriver 2014-07-17 23:15:07 +02:00
Zhang Xianyi f20c0f9819 Merge branch 'develop' 2014-07-17 15:15:57 +08:00