Author | Commit | Message | Date
Zhang Xianyi | 7e4e195e82 | Merge branch 'develop' | 2014-10-13 17:10:41 +08:00
Zhang Xianyi | ac5a7e1c1b | Update dot to 0.2.12 version. | 2014-10-13 17:10:12 +08:00
wernsaar | f1b9a4a1ca | Ref #454: fixed bug in common_param.h | 2014-09-23 11:34:29 +02:00
Zhang Xianyi | ae6b7caf32 | Merge pull request #453 from wernsaar/develop: Enabled GEMM3M functions | 2014-09-22 16:47:54 +08:00
wernsaar | f446d2368a | updated cblas.h and cblas_noconst.h | 2014-09-21 13:39:15 +02:00
wernsaar | dab4edd069 | added benchmark for gemm3m functions | 2014-09-21 12:00:41 +02:00
wernsaar | 9d7057366d | bugfix for GEMM3M functions | 2014-09-21 11:41:43 +02:00
wernsaar | 7f234f8ed1 | added GEMM3M tests | 2014-09-21 10:55:08 +02:00
wernsaar | 9e829ce98f | enabled cblas gemm3m functions | 2014-09-20 17:20:02 +02:00
wernsaar | d49fd33885 | disabled SYMM3M and HEMM3M functions because of segmentation violations | 2014-09-20 15:27:40 +02:00
wernsaar | f0f9b25bb6 | added test for CGEMM3M function | 2014-09-20 14:53:30 +02:00
wernsaar | 7aae4a62e7 | enabled use of GEMM3M functions | 2014-09-20 14:27:10 +02:00
wernsaar | 7a911569b8 | added test for GEMM3M functions | 2014-09-20 14:21:42 +02:00
wernsaar | 466bfb8b86 | updated README.md | 2014-09-17 16:01:07 +02:00
Zhang Xianyi | 70d1ba09b2 | Update the doc for target list. | 2014-09-17 14:29:21 +08:00
Zhang Xianyi | d293b78b64 | Merge pull request #451 from eshelman/patch-1: Add HASWELL to TargetList.txt | 2014-09-17 14:20:06 +08:00
Eliot Eshelman | 9912dbbcf9 | Add HASWELL to TargetList.txt. The Intel "Haswell" architecture is missing from the list of build targets. | 2014-09-16 18:26:45 -04:00
Zhang Xianyi | 01bc462e8e | Merge pull request #449 from wernsaar/develop: optimized multithreading lower limits | 2014-09-16 14:33:48 +08:00
wernsaar | 3300f5ebff | optimized multithreading lower limits | 2014-09-15 11:38:25 +02:00
Zhang Xianyi | 59e2c20557 | Merge pull request #448 from wernsaar/develop: Optimized cgemv and zgemv kernels | 2014-09-15 13:12:14 +08:00
wernsaar | b7c9566eea | removed obsolete gemv kernel files | 2014-09-14 11:00:53 +02:00
wernsaar | 6df1b0be81 | optimized zgemv_n_microk_sandy-4.c | 2014-09-14 10:21:22 +02:00
wernsaar | 2ac1e076c1 | added optimized zgemv_n kernel for sandybridge | 2014-09-14 09:02:05 +02:00
wernsaar | 9908b6031c | bugfix in KERNEL.PILEDRIVER | 2014-09-13 16:26:53 +02:00
wernsaar | 8f100a14f2 | optimized cgemv_t kernel for haswell | 2014-09-13 16:13:27 +02:00
wernsaar | 53b5726b04 | added optimized cgemv_t kernel for haswell | 2014-09-13 15:14:12 +02:00
wernsaar | 1a352b24e6 | updated KERNEL.HASWELL | 2014-09-13 12:23:27 +02:00
wernsaar | 5194818d4b | updated zgemv_t_4.c | 2014-09-13 09:48:34 +02:00
wernsaar | 8a39cdb1c1 | added optimized zgemv_t kernel for haswell | 2014-09-13 09:47:07 +02:00
wernsaar | fd2478c9e2 | optimized interface/zgemv.c for multithreading | 2014-09-12 19:18:23 +02:00
wernsaar | 0a1390f2d8 | enabled optimized zgemv_t kernel for bulldozer | 2014-09-12 17:43:47 +02:00
wernsaar | a8b0812feb | optimized zgemv_t for bulldozer | 2014-09-12 17:42:25 +02:00
wernsaar | a0fb68ab42 | added optimized zgemv_t kernel for bulldozer | 2014-09-12 17:04:22 +02:00
wernsaar | 44c11165d5 | bugfix in cgemv_t_4.c | 2014-09-12 14:12:24 +02:00
wernsaar | 564be4eb72 | added optimized cgemv_t kernel | 2014-09-12 13:38:01 +02:00
wernsaar | 107c3ea7d5 | added optimized zgemv_t routine | 2014-09-12 12:35:20 +02:00
wernsaar | bb8d698335 | optimized zgemv_n_microk_haswell-4.c for small size | 2014-09-11 13:44:55 +02:00
wernsaar | e0192a6914 | bugfix in zgemv_n_4.c | 2014-09-11 13:18:00 +02:00
wernsaar | bced4594bb | added optimized zgemv_n kernel | 2014-09-11 12:34:57 +02:00
wernsaar | cafba99b6b | bugfix in cgemv_n_microk_haswell-4.c | 2014-09-11 11:12:44 +02:00
wernsaar | ac8f232b2a | more optimizations | 2014-09-11 10:25:48 +02:00
wernsaar | f98e1244c4 | optimized cgemv_n_4.c | 2014-09-10 19:26:14 +02:00
wernsaar | be95700b30 | added optimized cgemv_kernel for haswell | 2014-09-10 14:11:24 +02:00
wernsaar | 4aa534ae93 | added cgemv_n kernel, optimized for small sizes | 2014-09-10 13:45:13 +02:00
Zhang Xianyi | 1cba8e7b11 | Merge pull request #446 from grisuthedragon/cblas_matcopy: Add a CBLAS interface for the BLAS extension s/d/c/z*matcopy routines. | 2014-09-10 16:31:31 +08:00
Zhang Xianyi | d13e92f07e | Merge pull request #445 from wernsaar/develop: A lot of optimizations for gemv kernels | 2014-09-10 16:28:14 +08:00
wernsaar | baa46e4fba | added and tested optimized dgemv_n kernel for haswell | 2014-09-09 16:17:45 +02:00
wernsaar | faab7a181d | added optimized dgemv_n kernel for haswell | 2014-09-09 15:32:32 +02:00
wernsaar | 8109d8232c | optimized dgemv_t kernel for haswell | 2014-09-09 14:38:08 +02:00
wernsaar | debc6d1a05 | bugfix in KERNEL.HASWELL | 2014-09-09 14:04:44 +02:00