Commit Graph

108 Commits

Author SHA1 Message Date
Ashwin Sekhar T K 2757b49767 THUNDERX2T99: Add Optimized CGEMM Implementation 2017-01-30 17:44:26 +05:30
Ashwin Sekhar T K f279ff4789 THUNDERX2T99: Add Optimized SGEMM Implementation 2017-01-16 21:44:33 +05:30
Ashwin Sekhar T K 4b55fae337 ARM64: Add Cavium THUNDERX2T99 Target 2017-01-11 11:18:40 +05:30
Andrew Pinski fb200c7245 ARM64: Add Cavium THUNDERX Target 2017-01-10 15:01:37 +05:30
Ashwin Sekhar T K 4713e7c47f ARM64: Add the VULCAN Target 2017-01-10 15:01:17 +05:30
Zhang Xianyi b678471d65 Merge branch 'z13' into develop
Conflicts:
	CONTRIBUTORS.md
2017-01-09 05:52:42 -05:00
Abdurrauf 6418667818 dtrmm and dgemm for z13 2017-01-04 19:32:33 +04:00
Shivraj Patil 9687437928 MIPS n32 ABI and build time mips simd support check
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-08-10 17:44:22 +05:30
Shivraj Patil d1c6469283 MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-08-08 11:58:01 +05:30
Shivraj Patil beb1d076a4 Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-07-15 18:38:25 +05:30
Zhang Xianyi 8a592ee386 Merge pull request #924 from ashwinyes/develop_aarch64_improvements_20160714
Improvements to Aarch64 kernels
2016-07-14 15:47:55 -04:00
Ashwin Sekhar T K 0a5ff9f9f9 Improvements to TRMM and GEMM kernels 2016-07-14 13:56:04 +05:30
Shivraj Patil 57df7956ee Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM. Updated macros in SGEMM, DGEMM, STRMM.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-06-28 17:51:10 +05:30
Shivraj Patil c4ba40e308 SGEMM optimization for MIPS P5600 and I6400 using MSA. Unrolled k loop in DGEMM kernel function
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-05-19 11:04:42 +05:30
Werner Saar 88011f625d Merge pull request #876 from wernsaar/develop
optimized dgemm on power8 for 20 threads
2016-05-16 14:52:40 +02:00
Werner Saar 8310d4d3f7 optimized dgemm for 20 threads 2016-05-16 14:14:25 +02:00
Shivraj Patil 085cf236c2 conflict resolved by syncing with 'xianyi:develop'
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-05-04 11:07:14 +05:30
Shivraj Patil b7b3d8ec8e DGEMM optimization for MIPS P5600 and I6400 using MSA
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-05-03 14:42:26 +05:30
Zhang Xianyi cd7af5260a Merge pull request #847 from sva-img/develop
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
2016-04-29 11:44:36 -04:00
Werner Saar 782f75ba94 optimized param.h for POWER8 2016-04-27 15:48:09 +02:00
Werner Saar 0d0c6f7d7d optimized dgemm for POWER8 2016-04-27 14:01:08 +02:00
Werner Saar 40ac64ae4f updated param.h for EXCAVATOR 2016-04-25 10:40:04 +02:00
Werner Saar 089aad57f7 updated param.h for POWER8 2016-04-23 14:26:24 +02:00
Werner Saar 879a51165f Optimized zgemm and tested zgemm again 2016-04-22 13:07:12 +02:00
Shivraj Patil 2c3dfe2bf3 MIPS P5600(32 bit) and I6400(64 bit) cores support added.
Seperated mips and mips64 files.
Configurations support for mips 32 bit.

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-04-22 14:03:18 +05:30
Werner Saar 3c6294ca3d added optimized sgemm_tcopy for power8 2016-04-19 16:08:54 +02:00
Zhang Xianyi dd43661cfd Init IBM z system (s390x) porting. 2016-04-15 18:02:24 -04:00
Werner Saar e173c51c04 updated zgemm- and ztrmm-kernel for POWER8 2016-04-08 09:05:37 +02:00
Werner Saar 9c42f0374a Updated cgemm- and sgemm-kernel for POWER8 SMP 2016-04-07 15:08:15 +02:00
Werner Saar a51102e9b7 bugfixes for sgemm- and cgemm-kernel 2016-04-06 11:15:21 +02:00
Werner Saar c5b1fbcb2e updated optimized cgemm- and ctrmm-kernel for POWER8 2016-04-04 09:12:08 +02:00
Werner Saar 6a9bbfc227 updated sgemm- and strmm-kernel for POWER8 2016-04-02 17:16:36 +02:00
Werner Saar e1df5a6e23 fixed sgemm- and strmm-kernel 2016-03-18 12:12:03 +01:00
Werner Saar 5c658f8746 add optimized cgemm- and ctrmm-kernel for POWER8 2016-03-18 08:17:25 +01:00
Werner Saar 96284ab295 added sgemm- and strmm-kernel for POWER8 2016-03-14 13:52:44 +01:00
Werner Saar 91e1c5080c modified configuration, to use power6 sgemm kernel for power8 2016-03-04 13:38:57 +01:00
Werner Saar b752858d6c added dgemm-, dtrmm-, zgemm- and ztrmm-kernel for power8 2016-03-01 07:33:56 +01:00
Zhang Xianyi 3e8d6ea74f Init POWER8 kernels by POWER6. 2015-11-03 12:34:23 +08:00
Werner Saar b07d733a71 added updates for syrk and syr2k 2016-01-21 13:16:44 +01:00
Ashwin Sekhar T K 39937d15cd Change BUFFER_SIZE for Cortex A57 to 20 MB
Change the GEMM_P, GEMM_Q, GEMM_R values for Cortex A57
2015-11-20 01:12:04 +05:30
Ashwin Sekhar T K 1397b47197 Optimized zgemm kernel for CORTEXA57 2015-11-09 14:15:53 +05:30
Ashwin Sekhar T K 45f78963ac Optimized cgemm kernel for CORTEXA57
Also, add a generic ztrmm 4x4 kernel
2015-11-09 14:15:53 +05:30
Ashwin Sekhar T K 402443bf9c Optimized dgemm kernel for CORTEXA57 2015-11-09 14:15:53 +05:30
Ashwin Sekhar T K f2f8a0fe8b Adding arm64 target CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:50 +05:30
Werner Saar 9bd962f655 modified haswell parameter dgemm_unroll_n 2015-06-13 10:28:27 +02:00
Zhang Xianyi 51ff17d46e Add AMD Excavator target. 2015-05-13 16:16:30 -05:00
Zhang Xianyi 229ce2ccd1 Add cortex-a9 and cortex-a15 targets. 2015-01-12 08:55:29 +00:00
Werner Saar ddf983d643 added optimizations for steamroller 2014-12-30 20:14:45 +08:00
Werner Saar 4319769b79 added target processor STEAMROLLER 2014-12-28 20:16:46 +08:00
Werner Saar 587e16fba3 Ref #458: Backport, sandybrigde uses nehalem zgemm kernel 2014-12-22 17:01:18 +01:00