Ashwin Sekhar T K
8a40f1355e
Improvements to GEMV kernels
2016-07-14 13:50:38 +05:30
Ashwin Sekhar T K
78782485b6
Improvements to COPY and IAMAX kernels
2016-07-14 13:49:34 +05:30
Ashwin Sekhar T K
278511ad2d
Cortex-A57: Fix clang compilation errors
2016-03-24 10:42:04 +05:30
Ashwin Sekhar T K
3b5ffb49d3
Cortex-A57: Improve DGEMM 8x4 Implementation
2016-03-24 10:25:18 +05:30
Ashwin Sekhar T K
5ac02f6dc7
Optimize Dgemm 4x4 for Cortex A57
2016-03-14 19:35:23 +05:30
Ashwin Sekhar T K
7aa1ad4923
Functional Assembly Kernels for Cortex-A57
Adding functional (non-optimized) kernels for Cortex-A57
with the following layouts.
SGEMM - 16x4, 8x8
CGEMM - 8x4
DGEMM - 8x4, 4x8
2016-03-14 19:33:21 +05:30
Zhang Xianyi
74b0672223
Fix c/zaxpyc kernel bug on Cortex-A57.
2016-02-23 22:47:53 +00:00
Ashwin Sekhar T K
318f0949c3
lapack-test fixes in nrm2 kernels for Cortex A57
2015-11-23 13:43:36 +05:30
Ashwin Sekhar T K
98965da2e8
lapack-test fixes for Cortex A57
2015-11-20 01:15:04 +05:30
Ashwin Sekhar T K
c99c43d51e
Optimized trmm kernels for CORTEXA57
2015-11-09 14:15:54 +05:30
Ashwin Sekhar T K
1397b47197
Optimized zgemm kernel for CORTEXA57
2015-11-09 14:15:53 +05:30
Ashwin Sekhar T K
45f78963ac
Optimized cgemm kernel for CORTEXA57
Also, add a generic ztrmm 4x4 kernel
2015-11-09 14:15:53 +05:30
Ashwin Sekhar T K
402443bf9c
Optimized dgemm kernel for CORTEXA57
2015-11-09 14:15:53 +05:30
Ashwin Sekhar T K
19fdbee291
Improve the sgemm kernel for CORTEXA57
2015-11-09 14:15:53 +05:30
Ashwin Sekhar T K
3b0cdfab1e
Optimized gemv kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:52 +05:30
Ashwin Sekhar T K
46efa6a1da
Optimized swap kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:52 +05:30
Ashwin Sekhar T K
ea1465cdf8
Optimized scal kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:52 +05:30
Ashwin Sekhar T K
fb4be3b3eb
Optimized rot kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:52 +05:30
Ashwin Sekhar T K
6c2f4ddbcd
Optimized nrm2 kernels for CORTEXA57
2015-11-09 14:15:51 +05:30
Ashwin Sekhar T K
870c4d49c0
Optimized dot kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:51 +05:30
Ashwin Sekhar T K
cd7684097c
Optimized copy kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:51 +05:30
Ashwin Sekhar T K
2690b71b1f
Optimized axpy kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:51 +05:30
Ashwin Sekhar T K
3e4acedf0e
Optimized asum kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:51 +05:30
Ashwin Sekhar T K
2610752dbb
Optimized iamax kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:50 +05:30
Ashwin Sekhar T K
dbb213655e
Optimized amax kernels for CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:50 +05:30
Ashwin Sekhar T K
f2f8a0fe8b
Adding arm64 target CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:50 +05:30
Zhang Xianyi
e5b96e55a7
Fix build bug for ARM64.
2015-03-24 15:27:17 -05:00
Benedikt Huber
58c90d5937
# The first commit's message is:
Optimizations for APM's xgene-1 (aarch64).
1) General system updates to better support ARMv8. A plain "make all" did not work; one needed to supply TARGET=ARMV8.
2) sgemm 4x4 kernel in assembler using SIMD, and configuration changes to use it.
3) strmm 4x4 kernel in C. Since the sgemm kernel does 4x4, the trmm kernel must also do 4xN.
Added Dave Nuechterlein to the contributors list.
2014-11-11 22:19:23 +08:00
Timothy Gu
6c2ead30f0
Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
wernsaar
fe5f46c330
added experimental support for ARMV8
2013-11-24 15:47:00 +01:00