Ashwin Sekhar T K
|
2757b49767
|
THUNDERX2T99: Add Optimized CGEMM Implementation
|
2017-01-30 17:44:26 +05:30 |
Ashwin Sekhar T K
|
f279ff4789
|
THUNDERX2T99: Add Optimized SGEMM Implementation
|
2017-01-16 21:44:33 +05:30 |
Ashwin Sekhar T K
|
4b55fae337
|
ARM64: Add Cavium THUNDERX2T99 Target
|
2017-01-11 11:18:40 +05:30 |
Andrew Pinski
|
fb200c7245
|
ARM64: Add Cavium THUNDERX Target
|
2017-01-10 15:01:37 +05:30 |
Ashwin Sekhar T K
|
4713e7c47f
|
ARM64: Add the VULCAN Target
|
2017-01-10 15:01:17 +05:30 |
Zhang Xianyi
|
b678471d65
|
Merge branch 'z13' into develop
Conflicts:
CONTRIBUTORS.md
|
2017-01-09 05:52:42 -05:00 |
Abdurrauf
|
6418667818
|
dtrmm and dgemm for z13
|
2017-01-04 19:32:33 +04:00 |
Shivraj Patil
|
9687437928
|
MIPS n32 ABI and build time mips simd support check
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-08-10 17:44:22 +05:30 |
Shivraj Patil
|
d1c6469283
|
MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-08-08 11:58:01 +05:30 |
Shivraj Patil
|
beb1d076a4
|
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-07-15 18:38:25 +05:30 |
Zhang Xianyi
|
8a592ee386
|
Merge pull request #924 from ashwinyes/develop_aarch64_improvements_20160714
Improvements to Aarch64 kernels
|
2016-07-14 15:47:55 -04:00 |
Ashwin Sekhar T K
|
0a5ff9f9f9
|
Improvements to TRMM and GEMM kernels
|
2016-07-14 13:56:04 +05:30 |
Shivraj Patil
|
57df7956ee
|
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM. Updated macros in SGEMM, DGEMM, STRMM.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-06-28 17:51:10 +05:30 |
Shivraj Patil
|
c4ba40e308
|
SGEMM optimization for MIPS P5600 and I6400 using MSA. Unrolled k loop in DGEMM kernel function
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-05-19 11:04:42 +05:30 |
Werner Saar
|
88011f625d
|
Merge pull request #876 from wernsaar/develop
optimized dgemm on power8 for 20 threads
|
2016-05-16 14:52:40 +02:00 |
Werner Saar
|
8310d4d3f7
|
optimized dgemm for 20 threads
|
2016-05-16 14:14:25 +02:00 |
Shivraj Patil
|
085cf236c2
|
conflict resolved by syncing with 'xianyi:develop'
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-05-04 11:07:14 +05:30 |
Shivraj Patil
|
b7b3d8ec8e
|
DGEMM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-05-03 14:42:26 +05:30 |
Zhang Xianyi
|
cd7af5260a
|
Merge pull request #847 from sva-img/develop
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
|
2016-04-29 11:44:36 -04:00 |
Werner Saar
|
782f75ba94
|
optimized param.h for POWER8
|
2016-04-27 15:48:09 +02:00 |
Werner Saar
|
0d0c6f7d7d
|
optimized dgemm for POWER8
|
2016-04-27 14:01:08 +02:00 |
Werner Saar
|
40ac64ae4f
|
updated param.h for EXCAVATOR
|
2016-04-25 10:40:04 +02:00 |
Werner Saar
|
089aad57f7
|
updated param.h for POWER8
|
2016-04-23 14:26:24 +02:00 |
Werner Saar
|
879a51165f
|
Optimized zgemm and tested zgemm again
|
2016-04-22 13:07:12 +02:00 |
Shivraj Patil
|
2c3dfe2bf3
|
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
Separated mips and mips64 files.
Configurations support for mips 32 bit.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-04-22 14:03:18 +05:30 |
Werner Saar
|
3c6294ca3d
|
added optimized sgemm_tcopy for power8
|
2016-04-19 16:08:54 +02:00 |
Zhang Xianyi
|
dd43661cfd
|
Init IBM z system (s390x) porting.
|
2016-04-15 18:02:24 -04:00 |
Werner Saar
|
e173c51c04
|
updated zgemm- and ztrmm-kernel for POWER8
|
2016-04-08 09:05:37 +02:00 |
Werner Saar
|
9c42f0374a
|
Updated cgemm- and sgemm-kernel for POWER8 SMP
|
2016-04-07 15:08:15 +02:00 |
Werner Saar
|
a51102e9b7
|
bugfixes for sgemm- and cgemm-kernel
|
2016-04-06 11:15:21 +02:00 |
Werner Saar
|
c5b1fbcb2e
|
updated optimized cgemm- and ctrmm-kernel for POWER8
|
2016-04-04 09:12:08 +02:00 |
Werner Saar
|
6a9bbfc227
|
updated sgemm- and strmm-kernel for POWER8
|
2016-04-02 17:16:36 +02:00 |
Werner Saar
|
e1df5a6e23
|
fixed sgemm- and strmm-kernel
|
2016-03-18 12:12:03 +01:00 |
Werner Saar
|
5c658f8746
|
add optimized cgemm- and ctrmm-kernel for POWER8
|
2016-03-18 08:17:25 +01:00 |
Werner Saar
|
96284ab295
|
added sgemm- and strmm-kernel for POWER8
|
2016-03-14 13:52:44 +01:00 |
Werner Saar
|
91e1c5080c
|
modified configuration, to use power6 sgemm kernel for power8
|
2016-03-04 13:38:57 +01:00 |
Werner Saar
|
b752858d6c
|
added dgemm-, dtrmm-, zgemm- and ztrmm-kernel for power8
|
2016-03-01 07:33:56 +01:00 |
Zhang Xianyi
|
3e8d6ea74f
|
Init POWER8 kernels by POWER6.
|
2015-11-03 12:34:23 +08:00 |
Werner Saar
|
b07d733a71
|
added updates for syrk and syr2k
|
2016-01-21 13:16:44 +01:00 |
Ashwin Sekhar T K
|
39937d15cd
|
Change BUFFER_SIZE for Cortex A57 to 20 MB
Change the GEMM_P, GEMM_Q, GEMM_R values for Cortex A57
|
2015-11-20 01:12:04 +05:30 |
Ashwin Sekhar T K
|
1397b47197
|
Optimized zgemm kernel for CORTEXA57
|
2015-11-09 14:15:53 +05:30 |
Ashwin Sekhar T K
|
45f78963ac
|
Optimized cgemm kernel for CORTEXA57
Also, add a generic ztrmm 4x4 kernel
|
2015-11-09 14:15:53 +05:30 |
Ashwin Sekhar T K
|
402443bf9c
|
Optimized dgemm kernel for CORTEXA57
|
2015-11-09 14:15:53 +05:30 |
Ashwin Sekhar T K
|
f2f8a0fe8b
|
Adding arm64 target CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
|
2015-11-09 14:15:50 +05:30 |
Werner Saar
|
9bd962f655
|
modified haswell parameter dgemm_unroll_n
|
2015-06-13 10:28:27 +02:00 |
Zhang Xianyi
|
51ff17d46e
|
Add AMD Excavator target.
|
2015-05-13 16:16:30 -05:00 |
Zhang Xianyi
|
229ce2ccd1
|
Add cortex-a9 and cortex-a15 targets.
|
2015-01-12 08:55:29 +00:00 |
Werner Saar
|
ddf983d643
|
added optimizations for steamroller
|
2014-12-30 20:14:45 +08:00 |
Werner Saar
|
4319769b79
|
added target processor STEAMROLLER
|
2014-12-28 20:16:46 +08:00 |
Werner Saar
|
587e16fba3
|
Ref #458: Backport, sandybridge uses nehalem zgemm kernel
|
2014-12-22 17:01:18 +01:00 |