Werner Saar
|
ef30e52c8f
|
Merge pull request #846 from wernsaar/develop
Optimized sgemm and dgemm for POWER8
|
2016-04-21 13:52:24 +02:00 |
Werner Saar
|
dd2b897795
|
added bugfixes for some make files and smallscaling.c
|
2016-04-21 12:54:32 +02:00 |
Werner Saar
|
9276c9012f
|
Optimized sgemm and dgemm and tested again.
|
2016-04-21 11:37:57 +02:00 |
Werner Saar
|
391584af85
|
optimized Makefile.power for POWER8
|
2016-04-20 15:28:28 +02:00 |
wernsaar
|
6fbca2a4a1
|
Merge pull request #845 from wernsaar/develop
optimized sgemm for power8
|
2016-04-20 13:44:22 +02:00 |
Werner Saar
|
0001260f4b
|
optimized sgemm
|
2016-04-20 13:06:38 +02:00 |
Werner Saar
|
3c6294ca3d
|
added optimized sgemm_tcopy for power8
|
2016-04-19 16:08:54 +02:00 |
Zhang Xianyi
|
dd43661cfd
|
Init IBM z system (s390x) porting.
|
2016-04-15 18:02:24 -04:00 |
Zhang Xianyi
|
9253dadaa7
|
Bump to 0.2.19.dev.
|
2016-04-12 15:32:10 -04:00 |
Zhang Xianyi
|
1e03a62b67
|
Update doc for 0.2.18 version.
|
2016-04-12 15:28:31 -04:00 |
Zhang Xianyi
|
faa73690e4
|
Delete LOCAL_BUFFER_SIZE for other architectures.
|
2016-04-12 11:49:28 -04:00 |
Zhang Xianyi
|
f24d5307cf
|
Refs #834. Fix zgemv config bug on Steamroller.
|
2016-04-12 22:26:11 +08:00 |
Werner Saar
|
8037d78eed
|
bugfix for arm scal.c and zscal.c
|
2016-04-11 11:21:36 +02:00 |
Werner Saar
|
1ca750471a
|
added cholesky benchmarks to Makefile for ESSL
|
2016-04-10 11:28:20 +02:00 |
wernsaar
|
0a4276bc2f
|
Merge pull request #837 from wernsaar/develop
updated zgemm- and ztrmm-kernel for POWER8
|
2016-04-08 11:13:27 +02:00 |
Werner Saar
|
08bddde3f3
|
updated benchmark Makefile for ESSL
|
2016-04-08 10:37:59 +02:00 |
Werner Saar
|
e173c51c04
|
updated zgemm- and ztrmm-kernel for POWER8
|
2016-04-08 09:05:37 +02:00 |
Werner Saar
|
9c42f0374a
|
Updated cgemm- and sgemm-kernel for POWER8 SMP
|
2016-04-07 15:08:15 +02:00 |
Zhang Xianyi
|
d4380c1fe4
|
Refs xianyi/OpenBLAS-CI#10. Fix sdot for scipy test_iterative.test_convergence test failure on AMD Bulldozer and Piledriver.
|
2016-04-07 01:44:18 +08:00 |
Werner Saar
|
a51102e9b7
|
bugfixes for sgemm- and cgemm-kernel
|
2016-04-06 11:15:21 +02:00 |
wernsaar
|
7282419525
|
Merge pull request #833 from wernsaar/develop
updated optimized cgemm- and ctrmm-kernel for POWER8
|
2016-04-04 12:29:51 +02:00 |
Werner Saar
|
c5b1fbcb2e
|
updated optimized cgemm- and ctrmm-kernel for POWER8
|
2016-04-04 09:12:08 +02:00 |
wernsaar
|
e1cdd15b30
|
Merge pull request #832 from wernsaar/develop
updated cgemm- and ctrmm-kernel for POWER8
|
2016-04-03 15:05:25 +02:00 |
Werner Saar
|
d4c0330967
|
updated cgemm- and ctrmm-kernel for POWER8
|
2016-04-03 14:30:49 +02:00 |
Werner Saar
|
12540cedb5
|
added ESSL to Makefile for benchmarks
|
2016-04-03 07:21:48 +02:00 |
wernsaar
|
99adc8b062
|
Merge pull request #831 from wernsaar/develop
updated sgemm- and strmm-kernel for POWER8
|
2016-04-02 18:05:44 +02:00 |
Werner Saar
|
6a9bbfc227
|
updated sgemm- and strmm-kernel for POWER8
|
2016-04-02 17:16:36 +02:00 |
Zhang Xianyi
|
3349e9debd
|
Merge pull request #830 from eschnett/patch-1
Correct small typo in comment
|
2016-04-01 17:35:22 -04:00 |
Erik Schnetter
|
dd7612358d
|
Correct small typo in comment
|
2016-04-01 13:49:33 -04:00 |
Zhang Xianyi
|
e5a6ef3808
|
Merge pull request #829 from jeromerobert/bug828
Allow forcing make not to use the -j argument
|
2016-03-31 21:59:40 -04:00 |
Jerome Robert
|
7aac0aff8e
|
Allow forcing make not to use the -j argument
Close #828 (hopefully)
|
2016-03-31 23:03:52 +02:00 |
wernsaar
|
26d7f06206
|
Merge pull request #827 from wernsaar/develop
added optimized dgemv_n kernel for POWER8
|
2016-03-30 12:04:49 +02:00 |
Werner Saar
|
68a69c5b50
|
added optimized dgemv_n kernel for POWER8
|
2016-03-30 11:10:53 +02:00 |
wernsaar
|
a571359afd
|
Merge pull request #826 from wernsaar/develop
added optimized asum kernels for POWER8
|
2016-03-28 15:09:52 +02:00 |
Werner Saar
|
c2464a7c4a
|
added optimized casum kernel for POWER8
|
2016-03-28 14:12:08 +02:00 |
Werner Saar
|
294f933869
|
added optimized zasum kernel for POWER8
|
2016-03-28 13:37:32 +02:00 |
Werner Saar
|
f59c9bd6ef
|
added optimized sasum kernel for POWER8
|
2016-03-28 12:44:25 +02:00 |
Werner Saar
|
c53be46d78
|
added optimized dasum kernel for POWER8
|
2016-03-28 12:17:15 +02:00 |
wernsaar
|
bbb2d73d73
|
Merge pull request #825 from wernsaar/develop
added optimized cswap and zswap kernel for POWER8
|
2016-03-27 19:04:06 +02:00 |
Werner Saar
|
659ed16591
|
added optimized cswap and zswap kernels for POWER8
|
2016-03-27 18:31:37 +02:00 |
Werner Saar
|
35c98a3556
|
added optimized zscal kernel for POWER8
|
2016-03-27 16:31:50 +02:00 |
Werner Saar
|
f1a5dd06c5
|
added optimized sscal kernel for POWER8
|
2016-03-27 11:05:56 +02:00 |
wernsaar
|
e125a3dc33
|
Merge pull request #824 from wernsaar/develop
added optimized drot-kernel and srot-kernel for POWER8
|
2016-03-27 10:43:17 +02:00 |
Werner Saar
|
35f1f21a7f
|
added drot- and srot-kernel optimized for POWER8
|
2016-03-27 08:57:11 +02:00 |
Zhang Xianyi
|
7b4b7179ba
|
Merge pull request #819 from ashwinyes/develop_20160324_fixes_optimizations
Cortex-A57: Fixes and Optimizations
|
2016-03-27 00:04:20 -04:00 |
Werner Saar
|
7a92c1538e
|
added benchmark test for srot and drot
|
2016-03-26 07:14:13 +01:00 |
wernsaar
|
5727268141
|
Merge pull request #823 from wernsaar/develop
added optimized copy and swap kernels for POWER8
|
2016-03-25 18:08:48 +01:00 |
Werner Saar
|
3d9a50e841
|
added optimized sswap kernel for POWER8
|
2016-03-25 17:34:55 +01:00 |
Werner Saar
|
828c849b44
|
added optimized ccopy kernel for POWER8
|
2016-03-25 16:54:25 +01:00 |
Werner Saar
|
ecc0bc9813
|
added optimized scopy kernel for POWER8
|
2016-03-25 16:06:56 +01:00 |