Zhang Xianyi
|
515bc56ea9
|
Refs #946. Use nrm2 reference implementation for Power8.
|
2016-08-18 18:59:43 -07:00 |
Zhang Xianyi
|
ae70b916f4
|
Refs #929. Deal with zero and NaNs for scale.
|
2016-08-18 10:24:42 -07:00 |
Werner Saar
|
412bcd187a
|
optimized dtrsm_logic_LT_16x4_power8.S and dtrsm_macros_LT_16x4_power8.S
|
2016-05-23 11:20:41 +02:00 |
Werner Saar
|
8b140220c8
|
optimized dtrsm_kernel_LT for POWER8
|
2016-05-22 15:20:04 +02:00 |
Werner Saar
|
8fb5a1aaff
|
added optimized dtrsm_LT kernel for POWER8
|
2016-05-22 13:09:05 +02:00 |
Werner Saar
|
6a2bde7a2d
|
optimized dgemm and dgetrf for POWER8
|
2016-05-17 14:45:27 +02:00 |
Werner Saar
|
8310d4d3f7
|
optimized dgemm for 20 threads
|
2016-05-16 14:14:25 +02:00 |
Werner Saar
|
56948dbf0f
|
optimized dgemm for POWER8
|
2016-04-29 12:52:47 +02:00 |
Werner Saar
|
0d0c6f7d7d
|
optimized dgemm for POWER8
|
2016-04-27 14:01:08 +02:00 |
Werner Saar
|
a3da10662f
|
added sgemm_tcopy_8_power8.S
|
2016-04-23 10:04:41 +02:00 |
Werner Saar
|
d46f07bb4e
|
added cgemm_tcopy_8_power8.S
|
2016-04-23 07:37:18 +02:00 |
Werner Saar
|
879a51165f
|
Optimized zgemm and tested zgemm again
|
2016-04-22 13:07:12 +02:00 |
Werner Saar
|
9276c9012f
|
Optimized sgemm and dgemm and tested again.
|
2016-04-21 11:37:57 +02:00 |
Werner Saar
|
0001260f4b
|
optimized sgemm
|
2016-04-20 13:06:38 +02:00 |
Werner Saar
|
3c6294ca3d
|
added optimized sgemm_tcopy for power8
|
2016-04-19 16:08:54 +02:00 |
Werner Saar
|
e173c51c04
|
updated zgemm- and ztrmm-kernel for POWER8
|
2016-04-08 09:05:37 +02:00 |
Werner Saar
|
9c42f0374a
|
Updated cgemm- and sgemm-kernel for POWER8 SMP
|
2016-04-07 15:08:15 +02:00 |
Werner Saar
|
a51102e9b7
|
bugfixes for sgemm- and cgemm-kernel
|
2016-04-06 11:15:21 +02:00 |
Werner Saar
|
c5b1fbcb2e
|
updated optimized cgemm- and ctrmm-kernel for POWER8
|
2016-04-04 09:12:08 +02:00 |
Werner Saar
|
d4c0330967
|
updated cgemm- and ctrmm-kernel for POWER8
|
2016-04-03 14:30:49 +02:00 |
Werner Saar
|
6a9bbfc227
|
updated sgemm- and strmm-kernel for POWER8
|
2016-04-02 17:16:36 +02:00 |
Werner Saar
|
68a69c5b50
|
added optimized dgemv_n kernel for POWER8
|
2016-03-30 11:10:53 +02:00 |
Werner Saar
|
c2464a7c4a
|
added optimized casum kernel for POWER8
|
2016-03-28 14:12:08 +02:00 |
Werner Saar
|
294f933869
|
added optimized zasum kernel for POWER8
|
2016-03-28 13:37:32 +02:00 |
Werner Saar
|
f59c9bd6ef
|
added optimized sasum kernel for POWER8
|
2016-03-28 12:44:25 +02:00 |
Werner Saar
|
c53be46d78
|
added optimized dasum kernel for POWER8
|
2016-03-28 12:17:15 +02:00 |
Werner Saar
|
659ed16591
|
added optimized cswap and zswap kernels for POWER8
|
2016-03-27 18:31:37 +02:00 |
Werner Saar
|
35c98a3556
|
added optimized zscal kernel for POWER8
|
2016-03-27 16:31:50 +02:00 |
Werner Saar
|
f1a5dd06c5
|
added optimized sscal kernel for POWER8
|
2016-03-27 11:05:56 +02:00 |
Werner Saar
|
35f1f21a7f
|
added drot- and srot-kernel optimized for POWER8
|
2016-03-27 08:57:11 +02:00 |
Werner Saar
|
3d9a50e841
|
added optimized sswap kernel for POWER8
|
2016-03-25 17:34:55 +01:00 |
Werner Saar
|
828c849b44
|
added optimized ccopy kernel for POWER8
|
2016-03-25 16:54:25 +01:00 |
Werner Saar
|
ecc0bc9813
|
added optimized scopy kernel for POWER8
|
2016-03-25 16:06:56 +01:00 |
Werner Saar
|
12f209b7b0
|
added optimized zswap kernel for POWER8
|
2016-03-25 15:27:34 +01:00 |
Werner Saar
|
7316a87930
|
added optimized dswap kernel for POWER8
|
2016-03-25 14:35:43 +01:00 |
Werner Saar
|
0bff057a87
|
added optimized dcopy kernel for POWER8
|
2016-03-25 13:03:02 +01:00 |
Werner Saar
|
1e6cf9808c
|
added optimized dscal kernel for POWER8
|
2016-03-25 09:42:08 +01:00 |
Werner Saar
|
55eda3813b
|
added optimized zaxpy kernel for POWER8
|
2016-03-23 11:20:23 +01:00 |
Werner Saar
|
0664ba4c97
|
added optimized daxpy kernel for POWER8
|
2016-03-22 14:50:03 +01:00 |
Werner Saar
|
11c44dede1
|
added optimized sdot kernel for POWER8
|
2016-03-21 13:18:23 +01:00 |
Werner Saar
|
9e4584d069
|
added optimized zdot kernel for POWER8
|
2016-03-21 10:12:07 +01:00 |
Werner Saar
|
cd9fafc054
|
ddot for POWER8: updated licence information
|
2016-03-20 11:19:27 +01:00 |
Werner Saar
|
84b92e6373
|
added optimized ddot kernel for POWER8
|
2016-03-20 11:06:06 +01:00 |
Werner Saar
|
e1df5a6e23
|
fixed sgemm- and strmm-kernel
|
2016-03-18 12:12:03 +01:00 |
Werner Saar
|
5c658f8746
|
add optimized cgemm- and ctrmm-kernel for POWER8
|
2016-03-18 08:17:25 +01:00 |
Werner Saar
|
dcd15b546c
|
BUGFIX: KERNEL.POWER8
|
2016-03-14 14:36:59 +01:00 |
Werner Saar
|
96284ab295
|
added sgemm- and strmm-kernel for POWER8
|
2016-03-14 13:52:44 +01:00 |
Werner Saar
|
cd5241d0cf
|
modified KERNEL for power, to use the generic DSDOT-KERNEL
|
2016-03-06 09:07:24 +01:00 |
Werner Saar
|
085f215257
|
Modified assembly label names so that they are hidden.
Added license information.
|
2016-03-05 10:27:27 +01:00 |
Werner Saar
|
0afc76fd65
|
enabled gemm_beta assembly kernels
|
2016-03-04 15:01:15 +01:00 |