Commit Graph

1251 Commits

Author SHA1 Message Date
Zhang Xianyi 7e4e195e82 Merge branch 'develop' 2014-10-13 17:10:41 +08:00
Zhang Xianyi ac5a7e1c1b Update dot to 0.2.12 version. 2014-10-13 17:10:12 +08:00
wernsaar f1b9a4a1ca Ref #454: fixed bug in common_param.h 2014-09-23 11:34:29 +02:00
Zhang Xianyi ae6b7caf32 Merge pull request #453 from wernsaar/develop
Enabled GEMM3M functions
2014-09-22 16:47:54 +08:00
wernsaar f446d2368a updated cblas.h and cblas_noconst.h 2014-09-21 13:39:15 +02:00
wernsaar dab4edd069 added benchmark for gemm3m functions 2014-09-21 12:00:41 +02:00
wernsaar 9d7057366d bugfix for GEMM3M functions 2014-09-21 11:41:43 +02:00
wernsaar 7f234f8ed1 added GEMM3M tests 2014-09-21 10:55:08 +02:00
wernsaar 9e829ce98f enabled cblas gemm3m functions 2014-09-20 17:20:02 +02:00
wernsaar d49fd33885 disabled SYMM3M and HEMM3M functions because segment violations 2014-09-20 15:27:40 +02:00
wernsaar f0f9b25bb6 added test for CGEMM3M function 2014-09-20 14:53:30 +02:00
wernsaar 7aae4a62e7 enabled use of GEMM3M functions 2014-09-20 14:27:10 +02:00
wernsaar 7a911569b8 added test for GEMM3M functions 2014-09-20 14:21:42 +02:00
wernsaar 466bfb8b86 updated README.md 2014-09-17 16:01:07 +02:00
Zhang Xianyi 70d1ba09b2 Update the doc for target list. 2014-09-17 14:29:21 +08:00
Zhang Xianyi d293b78b64 Merge pull request #451 from eshelman/patch-1
Add HASWELL to TargetList.txt
2014-09-17 14:20:06 +08:00
Eliot Eshelman 9912dbbcf9 Add HASWELL to TargetList.txt
The Intel "Haswell" architecture is missing from the list of build targets.
2014-09-16 18:26:45 -04:00
Zhang Xianyi 01bc462e8e Merge pull request #449 from wernsaar/develop
optimized multithreading lower limits
2014-09-16 14:33:48 +08:00
wernsaar 3300f5ebff optimized multithreading lower limits 2014-09-15 11:38:25 +02:00
Zhang Xianyi 59e2c20557 Merge pull request #448 from wernsaar/develop
Optimized cgemv and zgemv kernels
2014-09-15 13:12:14 +08:00
wernsaar b7c9566eea removed obsolete gemv kernel files 2014-09-14 11:00:53 +02:00
wernsaar 6df1b0be81 optimized zgemv_n_microk_sandy-4.c 2014-09-14 10:21:22 +02:00
wernsaar 2ac1e076c1 added optimized zgemv_n kernel for sandybridge 2014-09-14 09:02:05 +02:00
wernsaar 9908b6031c bugfix in KERNEL.PILEDRIVER 2014-09-13 16:26:53 +02:00
wernsaar 8f100a14f2 optimized cgemv_t kernel for haswell 2014-09-13 16:13:27 +02:00
wernsaar 53b5726b04 added optimized cgemv_t kernel for haswell 2014-09-13 15:14:12 +02:00
wernsaar 1a352b24e6 updated KERNEL.HASWELL 2014-09-13 12:23:27 +02:00
wernsaar 5194818d4b updated zgemv_t_4.c 2014-09-13 09:48:34 +02:00
wernsaar 8a39cdb1c1 added optimized zgemv_t kernel for haswell 2014-09-13 09:47:07 +02:00
wernsaar fd2478c9e2 optimized interface/zgemv.c for multithreading 2014-09-12 19:18:23 +02:00
wernsaar 0a1390f2d8 enabled optimized zgemv_t kernel for bulldozer 2014-09-12 17:43:47 +02:00
wernsaar a8b0812feb optimized zgemv_t for bulldozer 2014-09-12 17:42:25 +02:00
wernsaar a0fb68ab42 added optimized zgemv_t kernel for bulldozer 2014-09-12 17:04:22 +02:00
wernsaar 44c11165d5 bugfix in cgemv_t_4.c 2014-09-12 14:12:24 +02:00
wernsaar 564be4eb72 added optimized cgemv_t kernel 2014-09-12 13:38:01 +02:00
wernsaar 107c3ea7d5 added optimized zgemv_t routine 2014-09-12 12:35:20 +02:00
wernsaar bb8d698335 optimized zgemv_n_microk_haswell-4.c for small size 2014-09-11 13:44:55 +02:00
wernsaar e0192a6914 bugfix in zgemv_n_4.c 2014-09-11 13:18:00 +02:00
wernsaar bced4594bb added optimized zgemv_n kernel 2014-09-11 12:34:57 +02:00
wernsaar cafba99b6b bufix in cgemv_n_microk_haswell-4.c 2014-09-11 11:12:44 +02:00
wernsaar ac8f232b2a more optimizations 2014-09-11 10:25:48 +02:00
wernsaar f98e1244c4 optimized cgemv_n_4.c 2014-09-10 19:26:14 +02:00
wernsaar be95700b30 added optimized cgemv_kernel for haswell 2014-09-10 14:11:24 +02:00
wernsaar 4aa534ae93 added cgemv_n kernel, optimized for small sizes 2014-09-10 13:45:13 +02:00
Zhang Xianyi 1cba8e7b11 Merge pull request #446 from grisuthedragon/cblas_matcopy
Add a CBLAS interface for the BLAS extension s/d/c/z*matcopy routines.
2014-09-10 16:31:31 +08:00
Zhang Xianyi d13e92f07e Merge pull request #445 from wernsaar/develop
A lot of optimizations for gemv kernels
2014-09-10 16:28:14 +08:00
wernsaar baa46e4fba added and tested optimized dgemv_n kernel for haswell 2014-09-09 16:17:45 +02:00
wernsaar faab7a181d added optimized dgemv_n kernel for haswell 2014-09-09 15:32:32 +02:00
wernsaar 8109d8232c optimized dgemv_t kernel for haswell 2014-09-09 14:38:08 +02:00
wernsaar debc6d1a05 bugfix in KERNEL.HASWELL 2014-09-09 14:04:44 +02:00