Werner Saar
|
bd06b246cc
|
Merge pull request #890 from wernsaar/develop
optimized dtrsm_kernel_LT for POWER8
|
2016-05-22 16:01:35 +02:00 |
Werner Saar
|
8b140220c8
|
optimized dtrsm_kernel_LT for POWER8
|
2016-05-22 15:20:04 +02:00 |
Werner Saar
|
8fb5a1aaff
|
added optimized dtrsm_LT kernel for POWER8
|
2016-05-22 13:09:05 +02:00 |
Kaustubh Raste
|
ad9f317870
|
STRSM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
|
2016-05-20 10:59:03 +05:30 |
Shivraj Patil
|
c4ba40e308
|
SGEMM optimization for MIPS P5600 and I6400 using MSA. Unrolled k loop in DGEMM kernel function
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-05-19 11:04:42 +05:30 |
Zhang Xianyi
|
7a19065369
|
Merge pull request #878 from ksraste/develop
DTRSM bug fix for MIPS P5600 and I6400
|
2016-05-19 11:16:43 +08:00 |
Werner Saar
|
6a2bde7a2d
|
optimized dgemm and dgetrf for POWER8
|
2016-05-17 14:45:27 +02:00 |
Kaustubh Raste
|
d7cbc7ac13
|
DTRSM bug fix for MIPS P5600 and I6400
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
|
2016-05-17 15:48:02 +05:30 |
Werner Saar
|
88011f625d
|
Merge pull request #876 from wernsaar/develop
optimized dgemm on power8 for 20 threads
|
2016-05-16 14:52:40 +02:00 |
Werner Saar
|
8310d4d3f7
|
optimized dgemm for 20 threads
|
2016-05-16 14:14:25 +02:00 |
Kaustubh Raste
|
edb5980c13
|
DTRSM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
|
2016-05-09 15:15:26 +05:30 |
Shivraj Patil
|
085cf236c2
|
conflict resolved by syncing with 'xianyi:develop'
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-05-04 11:07:14 +05:30 |
Shivraj Patil
|
b7b3d8ec8e
|
DGEMM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-05-03 14:42:26 +05:30 |
Zhang Xianyi
|
cd7af5260a
|
Merge pull request #847 from sva-img/develop
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
|
2016-04-29 11:44:36 -04:00 |
Werner Saar
|
56948dbf0f
|
optimized dgemm for POWER8
|
2016-04-29 12:52:47 +02:00 |
Werner Saar
|
0d0c6f7d7d
|
optimized dgemm for POWER8
|
2016-04-27 14:01:08 +02:00 |
Werner Saar
|
298b13bba4
|
updated some kernel files for EXCAVATOR
|
2016-04-25 10:36:23 +02:00 |
Werner Saar
|
78b05f6476
|
bugfix for EXCAVATOR and DYNAMIC_ARCH
|
2016-04-25 10:13:30 +02:00 |
Werner Saar
|
a3da10662f
|
added sgemm_tcopy_8_power8.S
|
2016-04-23 10:04:41 +02:00 |
Werner Saar
|
d46f07bb4e
|
added cgemm_tcopy_8_power8.S
|
2016-04-23 07:37:18 +02:00 |
Werner Saar
|
879a51165f
|
Optimized zgemm and tested zgemm again
|
2016-04-22 13:07:12 +02:00 |
Shivraj Patil
|
2c3dfe2bf3
|
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
Separated mips and mips64 files.
Configurations support for mips 32 bit.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-04-22 14:03:18 +05:30 |
Werner Saar
|
9276c9012f
|
Optimized sgemm and dgemm and tested again.
|
2016-04-21 11:37:57 +02:00 |
wernsaar
|
6fbca2a4a1
|
Merge pull request #845 from wernsaar/develop
optimized sgemm for power8
|
2016-04-20 13:44:22 +02:00 |
Werner Saar
|
0001260f4b
|
optimized sgemm
|
2016-04-20 13:06:38 +02:00 |
Werner Saar
|
3c6294ca3d
|
added optimized sgemm_tcopy for power8
|
2016-04-19 16:08:54 +02:00 |
Zhang Xianyi
|
f24d5307cf
|
Refs #834. Fix zgemv config bug on Steamroller.
|
2016-04-12 22:26:11 +08:00 |
Werner Saar
|
8037d78eed
|
bugfix for arm scal.c and zscal.c
|
2016-04-11 11:21:36 +02:00 |
wernsaar
|
0a4276bc2f
|
Merge pull request #837 from wernsaar/develop
updated zgemm- and ztrmm-kernel for POWER8
|
2016-04-08 11:13:27 +02:00 |
Werner Saar
|
e173c51c04
|
updated zgemm- and ztrmm-kernel for POWER8
|
2016-04-08 09:05:37 +02:00 |
Werner Saar
|
9c42f0374a
|
Updated cgemm- and sgemm-kernel for POWER8 SMP
|
2016-04-07 15:08:15 +02:00 |
Zhang Xianyi
|
d4380c1fe4
|
Refs xianyi/OpenBLAS-CI#10, Fix sdot for scipy test_iterative.test_convergence test failure on AMD bulldozer and piledriver.
|
2016-04-07 01:44:18 +08:00 |
Werner Saar
|
a51102e9b7
|
bugfixes for sgemm- and cgemm-kernel
|
2016-04-06 11:15:21 +02:00 |
Werner Saar
|
c5b1fbcb2e
|
updated optimized cgemm- and ctrmm-kernel for POWER8
|
2016-04-04 09:12:08 +02:00 |
Werner Saar
|
d4c0330967
|
updated cgemm- and ctrmm-kernel for POWER8
|
2016-04-03 14:30:49 +02:00 |
Werner Saar
|
6a9bbfc227
|
updated sgemm- and strmm-kernel for POWER8
|
2016-04-02 17:16:36 +02:00 |
Werner Saar
|
68a69c5b50
|
added optimized dgemv_n kernel for POWER8
|
2016-03-30 11:10:53 +02:00 |
Werner Saar
|
c2464a7c4a
|
added optimized casum kernel for POWER8
|
2016-03-28 14:12:08 +02:00 |
Werner Saar
|
294f933869
|
added optimized zasum kernel for POWER8
|
2016-03-28 13:37:32 +02:00 |
Werner Saar
|
f59c9bd6ef
|
added optimized sasum kernel for POWER8
|
2016-03-28 12:44:25 +02:00 |
Werner Saar
|
c53be46d78
|
added optimized dasum kernel for POWER8
|
2016-03-28 12:17:15 +02:00 |
Werner Saar
|
659ed16591
|
added optimized cswap and zswap kernels for POWER8
|
2016-03-27 18:31:37 +02:00 |
Werner Saar
|
35c98a3556
|
added optimized zscal kernel for POWER8
|
2016-03-27 16:31:50 +02:00 |
Werner Saar
|
f1a5dd06c5
|
added optimized sscal kernel for POWER8
|
2016-03-27 11:05:56 +02:00 |
wernsaar
|
e125a3dc33
|
Merge pull request #824 from wernsaar/develop
added optimized drot-kernel and srot-kernel for POWER8
|
2016-03-27 10:43:17 +02:00 |
Werner Saar
|
35f1f21a7f
|
added drot- and srot-kernel optimized for POWER8
|
2016-03-27 08:57:11 +02:00 |
Zhang Xianyi
|
7b4b7179ba
|
Merge pull request #819 from ashwinyes/develop_20160324_fixes_optimizations
Cortex-A57: Fixes and Optimizations
|
2016-03-27 00:04:20 -04:00 |
Werner Saar
|
3d9a50e841
|
added optimized sswap kernel for POWER8
|
2016-03-25 17:34:55 +01:00 |
Werner Saar
|
828c849b44
|
added optimized ccopy kernel for POWER8
|
2016-03-25 16:54:25 +01:00 |
Werner Saar
|
ecc0bc9813
|
added optimized scopy kernel for POWER8
|
2016-03-25 16:06:56 +01:00 |