Gian-Carlo Pascutto
|
832a272784
|
Revert Zen param.h to Haswell values (instead of Excavator).
|
2017-04-18 12:40:25 +02:00 |
Denis Steckelmacher
|
c9ff735da6
|
Add ZEN support (tested for auto-detected static backend)
|
2017-03-19 15:32:50 +01:00 |
Martin Kroeker
|
cd135e2b59
|
Merge pull request #1130 from quickwritereader/develop
Blas 3 for single precision
|
2017-03-15 10:00:52 +01:00 |
Abdurrauf
|
08786c4b95
|
strmm and ctrmm
|
2017-03-13 01:23:16 +04:00 |
Abdurrauf
|
82e80fa82b
|
initial strmm(sgemm). not tuned yet
|
2017-03-06 04:27:40 +04:00 |
Martin Kroeker
|
ffc1d6c468
|
Merge pull request #1108 from ashwinyes/develop_20170203_thunderx2t99
Optimized Implementations for ThunderX2T99
|
2017-02-28 16:02:19 +01:00 |
Ashwin Sekhar T K
|
19ba133383
|
THUNDERX2T99: Add Optimized ZGEMM Implementation
|
2017-02-28 05:31:41 +00:00 |
Abdurrauf
|
0d96b0e2a7
|
Merge branch 'z13' into develop
|
2017-02-26 06:17:33 +04:00 |
Abdurrauf
|
848cb27b1e
|
ztrmm kernel.
|
2017-02-26 06:14:12 +04:00 |
Ashwin Sekhar T K
|
2757b49767
|
THUNDERX2T99: Add Optimized CGEMM Implementation
|
2017-01-30 17:44:26 +05:30 |
Ashwin Sekhar T K
|
f279ff4789
|
THUNDERX2T99: Add Optimized SGEMM Implementation
|
2017-01-16 21:44:33 +05:30 |
Ashwin Sekhar T K
|
4b55fae337
|
ARM64: Add Cavium THUNDERX2T99 Target
|
2017-01-11 11:18:40 +05:30 |
Andrew Pinski
|
fb200c7245
|
ARM64: Add Cavium THUNDERX Target
|
2017-01-10 15:01:37 +05:30 |
Ashwin Sekhar T K
|
4713e7c47f
|
ARM64: Add the VULCAN Target
|
2017-01-10 15:01:17 +05:30 |
Zhang Xianyi
|
b678471d65
|
Merge branch 'z13' into develop
Conflicts:
CONTRIBUTORS.md
|
2017-01-09 05:52:42 -05:00 |
Abdurrauf
|
6418667818
|
dtrmm and dgemm for z13
|
2017-01-04 19:32:33 +04:00 |
Shivraj Patil
|
9687437928
|
MIPS n32 ABI and build time mips simd support check
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-08-10 17:44:22 +05:30 |
Shivraj Patil
|
d1c6469283
|
MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-08-08 11:58:01 +05:30 |
Shivraj Patil
|
beb1d076a4
|
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-07-15 18:38:25 +05:30 |
Zhang Xianyi
|
8a592ee386
|
Merge pull request #924 from ashwinyes/develop_aarch64_improvements_20160714
Improvements to Aarch64 kernels
|
2016-07-14 15:47:55 -04:00 |
Ashwin Sekhar T K
|
0a5ff9f9f9
|
Improvements to TRMM and GEMM kernels
|
2016-07-14 13:56:04 +05:30 |
Shivraj Patil
|
57df7956ee
|
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM. Updated macros in SGEMM, DGEMM, STRMM.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-06-28 17:51:10 +05:30 |
Shivraj Patil
|
c4ba40e308
|
SGEMM optimization for MIPS P5600 and I6400 using MSA. Unrolled k loop in DGEMM kernel function
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-05-19 11:04:42 +05:30 |
Werner Saar
|
88011f625d
|
Merge pull request #876 from wernsaar/develop
optimized dgemm on power8 for 20 threads
|
2016-05-16 14:52:40 +02:00 |
Werner Saar
|
8310d4d3f7
|
optimized dgemm for 20 threads
|
2016-05-16 14:14:25 +02:00 |
Shivraj Patil
|
085cf236c2
|
conflict resolved by syncing with 'xianyi:develop'
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-05-04 11:07:14 +05:30 |
Shivraj Patil
|
b7b3d8ec8e
|
DGEMM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-05-03 14:42:26 +05:30 |
Zhang Xianyi
|
cd7af5260a
|
Merge pull request #847 from sva-img/develop
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
|
2016-04-29 11:44:36 -04:00 |
Werner Saar
|
782f75ba94
|
optimized param.h for POWER8
|
2016-04-27 15:48:09 +02:00 |
Werner Saar
|
0d0c6f7d7d
|
optimized dgemm for POWER8
|
2016-04-27 14:01:08 +02:00 |
Werner Saar
|
40ac64ae4f
|
updated param.h for EXCAVATOR
|
2016-04-25 10:40:04 +02:00 |
Werner Saar
|
089aad57f7
|
updated param.h for POWER8
|
2016-04-23 14:26:24 +02:00 |
Werner Saar
|
879a51165f
|
Optimized zgemm and tested zgemm again
|
2016-04-22 13:07:12 +02:00 |
Shivraj Patil
|
2c3dfe2bf3
|
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
Separated mips and mips64 files.
Configurations support for mips 32 bit.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-04-22 14:03:18 +05:30 |
Werner Saar
|
3c6294ca3d
|
added optimized sgemm_tcopy for power8
|
2016-04-19 16:08:54 +02:00 |
Zhang Xianyi
|
dd43661cfd
|
Init IBM z system (s390x) porting.
|
2016-04-15 18:02:24 -04:00 |
Werner Saar
|
e173c51c04
|
updated zgemm- and ztrmm-kernel for POWER8
|
2016-04-08 09:05:37 +02:00 |
Werner Saar
|
9c42f0374a
|
Updated cgemm- and sgemm-kernel for POWER8 SMP
|
2016-04-07 15:08:15 +02:00 |
Werner Saar
|
a51102e9b7
|
bugfixes for sgemm- and cgemm-kernel
|
2016-04-06 11:15:21 +02:00 |
Werner Saar
|
c5b1fbcb2e
|
updated optimized cgemm- and ctrmm-kernel for POWER8
|
2016-04-04 09:12:08 +02:00 |
Werner Saar
|
6a9bbfc227
|
updated sgemm- and strmm-kernel for POWER8
|
2016-04-02 17:16:36 +02:00 |
Werner Saar
|
e1df5a6e23
|
fixed sgemm- and strmm-kernel
|
2016-03-18 12:12:03 +01:00 |
Werner Saar
|
5c658f8746
|
add optimized cgemm- and ctrmm-kernel for POWER8
|
2016-03-18 08:17:25 +01:00 |
Werner Saar
|
96284ab295
|
added sgemm- and strmm-kernel for POWER8
|
2016-03-14 13:52:44 +01:00 |
Werner Saar
|
91e1c5080c
|
modified configuration, to use power6 sgemm kernel for power8
|
2016-03-04 13:38:57 +01:00 |
Werner Saar
|
b752858d6c
|
added dgemm-, dtrmm-, zgemm- and ztrmm-kernel for power8
|
2016-03-01 07:33:56 +01:00 |
Zhang Xianyi
|
3e8d6ea74f
|
Init POWER8 kernels by POWER6.
|
2015-11-03 12:34:23 +08:00 |
Werner Saar
|
b07d733a71
|
added updates for syrk and syr2k
|
2016-01-21 13:16:44 +01:00 |
Ashwin Sekhar T K
|
39937d15cd
|
Change BUFFER_SIZE for Cortex A57 to 20 MB
Change the GEMM_P, GEMM_Q, GEMM_R values for Cortex A57
|
2015-11-20 01:12:04 +05:30 |
Ashwin Sekhar T K
|
1397b47197
|
Optimized zgemm kernel for CORTEXA57
|
2015-11-09 14:15:53 +05:30 |