Commit Graph

69 Commits

Author SHA1 Message Date
Ashwin Sekhar T K
4899d67f7d THUNDERX2T99: Fix clang compilation 2017-08-02 11:28:45 -07:00
Ashwin Sekhar T K
67473d09dd THUNDERX2T99: Bug Fixes in D/Z NRM2 and ZGEMM 2017-02-28 01:11:38 -08:00
Ashwin Sekhar T K
19ba133383 THUNDERX2T99: Add Optimized ZGEMM Implementation 2017-02-28 05:31:41 +00:00
Ashwin Sekhar T K
a3935f0dfb THUNDERX2T99: Add Optimized D/Z NRM2 Implementation 2017-02-23 10:02:15 -08:00
Ashwin Sekhar T K
738628e9a8 ARM64: Remove unused code 2017-02-21 21:42:32 -08:00
Ashwin Sekhar T K
ab3ffab96a THUNDERX2T99: Add Optimized C/Z DOT Implementation 2017-02-21 03:40:59 -08:00
Ashwin Sekhar T K
f036be9ce2 THUNDERX2T99: Add Optimized SDOT Implementation 2017-02-21 03:24:32 -08:00
Ashwin Sekhar T K
faba876fda THUNDERX2T99: Bug fix in C/Z IAMAX 2017-02-19 23:11:50 -08:00
Ashwin Sekhar T K
172a62d73e THUNDERX2T99: Add Optimized C/Z IAMAX Implementation 2017-02-17 03:06:32 -08:00
Ashwin Sekhar T K
228c75a69c THUNDERX2T99: Add parallel SCNRM2 Implementation 2017-02-14 04:10:06 -08:00
Ashwin Sekhar T K
8e89668f62 THUNDERX2T99: Fix bug in SNRM2 2017-02-07 02:14:33 -08:00
Ashwin Sekhar T K
f63deae9de THUNDERX2T99: Add Optimized S/D IAMAX Implementation 2017-02-07 01:35:55 -08:00
Ashwin Sekhar T K
071a830e8b THUNDERX2T99: Add optimized S/D/C/Z SWAP Implementations 2017-02-03 03:55:06 -08:00
Ashwin Sekhar T K
d09f88192c THUNDERX2T99: Add optimized S/D/C/Z COPY Implementations 2017-02-02 15:26:38 +05:30
Ashwin Sekhar T K
e58233460a THUNDERX2T99: Add optimized D/C/Z ASUM Implementations 2017-02-02 15:26:22 +05:30
Ashwin Sekhar T K
99bd2892bf THUNDERX2T99: Add optimized CASUM Implementation 2017-01-30 17:44:32 +05:30
Ashwin Sekhar T K
ff6f572f2e THUNDERX2T99: Rename labels for DDOT and SNRM2 2017-01-30 17:44:32 +05:30
Ashwin Sekhar T K
e0dc5f58c5 THUNDERX2T99: Remove Duplicate Code 2017-01-30 17:44:32 +05:30
Ashwin Sekhar T K
2757b49767 THUNDERX2T99: Add Optimized CGEMM Implementation 2017-01-30 17:44:26 +05:30
Ashwin Sekhar T K
907e286eb6 THUNDERX2T99: Add threaded SNRM2 Implementation 2017-01-24 21:39:29 +05:30
Ashwin Sekhar T K
cde3aee08b ARM64: Rename kernel files to have consistent naming 2017-01-24 14:53:34 +05:30
Ashwin Sekhar T K
ee6ea7e988 THUNDERX2T99: Add Optimized CNRM2 Implementation 2017-01-24 10:23:32 +05:30
Ashwin Sekhar T K
ca0b36b012 THUNDERX2T99: Add Optimized SNRM2 Implementation 2017-01-24 10:23:21 +05:30
Ashwin Sekhar T K
d0a79ca6e0 THUNDERX2T99: Add threaded DDOT Implementation 2017-01-19 11:11:42 +05:30
Ashwin Sekhar T K
0c07003ccf THUNDERX2T99: Add Optimized DDOT Implementation 2017-01-19 11:11:07 +05:30
Ashwin Sekhar T K
f33fcedb30 THUNDERX2T99: Improve SGEMM 2017-01-19 11:11:07 +05:30
Ashwin Sekhar T K
0f1d6e8b39 THUNDERX2T99: Improve DGEMM 2017-01-19 11:11:07 +05:30
Ashwin Sekhar T K
981064acc6 THUNDERX2T99: Add Optimized DAXPY Implementation 2017-01-19 11:10:57 +05:30
Ashwin Sekhar T K
f279ff4789 THUNDERX2T99: Add Optimized SGEMM Implementation 2017-01-16 21:44:33 +05:30
Ashwin Sekhar T K
759f37feba ARM64: Let target VULCAN inherit THUNDERX2T99 properties 2017-01-16 21:44:19 +05:30
Ashwin Sekhar T K
4b55fae337 ARM64: Add Cavium THUNDERX2T99 Target 2017-01-11 11:18:40 +05:30
Andrew Pinski
95649dee28 THUNDERX: Add optimized version of daxpy
This is better for single core but does not change anything for multiple cores
2017-01-11 11:18:36 +05:30
Andrew Pinski
8fdb0655e9 THUNDERX: Add an optimized version of ddot 2017-01-10 15:01:37 +05:30
Andrew Pinski
fb200c7245 ARM64: Add Cavium THUNDERX Target 2017-01-10 15:01:37 +05:30
Ashwin Sekhar T K
0b8e876d89 VULCAN: Add optimized DGEMM implementation 2017-01-10 15:01:37 +05:30
Ashwin Sekhar T K
4713e7c47f ARM64: Add the VULCAN Target 2017-01-10 15:01:17 +05:30
Ashwin Sekhar T K
6085386b10 CORTEXA57: Add assembly kernels for copy routines 2017-01-10 15:01:05 +05:30
Ashwin Sekhar T K
c54a29bb48 Cortex A57: Improvements to DGEMM 8x4 kernel 2016-07-26 10:58:21 +05:30
Ashwin Sekhar T K
0a5ff9f9f9 Improvements to TRMM and GEMM kernels 2016-07-14 13:56:04 +05:30
Ashwin Sekhar T K
8a40f1355e Improvements to GEMV kernels 2016-07-14 13:50:38 +05:30
Ashwin Sekhar T K
78782485b6 Improvements to COPY and IAMAX kernels 2016-07-14 13:49:34 +05:30
Ashwin Sekhar T K
278511ad2d Cortex-A57: Fix clang compilation errors 2016-03-24 10:42:04 +05:30
Ashwin Sekhar T K
3b5ffb49d3 Cortex-A57: Improve DGEMM 8x4 Implementation 2016-03-24 10:25:18 +05:30
Ashwin Sekhar T K
5ac02f6dc7 Optimize Dgemm 4x4 for Cortex A57 2016-03-14 19:35:23 +05:30
Ashwin Sekhar T K
7aa1ad4923 Functional Assembly Kernels for CortexA57
Adding functional (non-optimized) kernels for Cortex-A57
with the following layouts.
SGEMM - 16x4, 8x8
CGEMM - 8x4
DGEMM - 8x4, 4x8
2016-03-14 19:33:21 +05:30
Zhang Xianyi
74b0672223 Fix c/zaxpyc kernel bug on Cortex-A57. 2016-02-23 22:47:53 +00:00
Ashwin Sekhar T K
318f0949c3 lapack-test fixes in nrm2 kernels for Cortex A57 2015-11-23 13:43:36 +05:30
Ashwin Sekhar T K
98965da2e8 lapack-test fixes for Cortex A57 2015-11-20 01:15:04 +05:30
Ashwin Sekhar T K
c99c43d51e Optimized trmm kernels for CORTEXA57 2015-11-09 14:15:54 +05:30
Ashwin Sekhar T K
1397b47197 Optimized zgemm kernel for CORTEXA57 2015-11-09 14:15:53 +05:30