Martin Kroeker
0f27a03607
Add workaround for NVIDIA HPC mishandling of the asm DOT kernels
2021-01-12 16:39:35 +01:00
Qiyu8
60e6c68e38
Adapt ARM architect
2020-09-29 16:36:14 +08:00
Ashwin Sekhar T K
d5aeff636f
ARM64: Enable DYNAMIC_ARCH
...
Enable the DYNAMIC_ARCH feature on ARM64. This patch uses the cpuid
feature of the Linux kernel to detect the core type at runtime
(https://www.kernel.org/doc/Documentation/arm64/cpu-feature-registers.txt).
If the kernel lacks this feature, the user should set the
OPENBLAS_CORETYPE environment variable to select the desired core type.
2018-10-22 01:49:35 -07:00
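
The runtime selection described in the ARM64 DYNAMIC_ARCH commit above can be illustrated with a minimal sketch. It is not the code from that patch: the helper names `coretype_from_midr`, `pick_coretype` and `pick_coretype_from_env`, the `"ARMV8"` fallback string, and the choice to let the environment variable take precedence over detection are all assumptions made for illustration. What is taken from the Arm architecture is the MIDR_EL1 layout (implementer in bits [31:24], part number in bits [15:4]) and the part numbers 0xD07 (Cortex-A57) and 0xD08 (Cortex-A72). Reading the register itself is left out so the sketch stays portable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* MIDR_EL1 fields: implementer in bits [31:24], part number in bits [15:4]. */
#define MIDR_IMPLEMENTER(m) (((m) >> 24) & 0xffu)
#define MIDR_PARTNUM(m)     (((m) >> 4) & 0xfffu)

/* Hypothetical helper: map a MIDR value to a core-type name, or NULL if
 * the core is not one this sketch knows about. */
static const char *coretype_from_midr(uint32_t midr)
{
    if (MIDR_IMPLEMENTER(midr) == 0x41u) {      /* 0x41 is Arm Ltd. */
        switch (MIDR_PARTNUM(midr)) {
        case 0xd07u: return "CORTEXA57";
        case 0xd08u: return "CORTEXA72";
        default:     break;
        }
    }
    return NULL;
}

/* Hypothetical selection logic. Assumed order: a non-empty override wins,
 * then the detected core (if a MIDR value was available), then the
 * caller-supplied fallback. */
static const char *pick_coretype(const char *override, int have_midr,
                                 uint32_t midr, const char *fallback)
{
    assert(fallback != NULL);
    if (override != NULL && strlen(override) > 0)
        return override;
    if (have_midr) {
        const char *name = coretype_from_midr(midr);
        if (name != NULL)
            return name;
    }
    return fallback;
}

/* Convenience wrapper that takes the override from OPENBLAS_CORETYPE.
 * The "ARMV8" fallback name is an assumption of this sketch. */
static const char *pick_coretype_from_env(int have_midr, uint32_t midr)
{
    return pick_coretype(getenv("OPENBLAS_CORETYPE"), have_midr, midr, "ARMV8");
}
```

A caller on a kernel without cpuid support would pass `have_midr = 0`, in which case only the environment variable or the fallback can decide the result, which matches the advice in the commit message.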
Ashwin Sekhar T K
162e312832
ARM64: Remove dependency of CORTEXA57 Makefile on ARMV8 Makefile
2018-10-17 08:01:45 -07:00
Martin Kroeker
c9d408064a
Use dot.S also for DSDOT on CORTEXA57
2018-02-25 19:48:09 +01:00
Ashwin Sekhar T K
6085386b10
CORTEXA57: Add assembly kernels for copy routines
2017-01-10 15:01:05 +05:30
Ashwin Sekhar T K
7aa1ad4923
Functional Assembly Kernels for CortexA57
...
Add functional (non-optimized) kernels for Cortex-A57
with the following layouts:
SGEMM - 16x4, 8x8
CGEMM - 8x4
DGEMM - 8x4, 4x8
2016-03-14 19:33:21 +05:30
Ashwin Sekhar T K
318f0949c3
lapack-test fixes in nrm2 kernels for Cortex A57
2015-11-23 13:43:36 +05:30
Ashwin Sekhar T K
98965da2e8
lapack-test fixes for Cortex A57
2015-11-20 01:15:04 +05:30
Ashwin Sekhar T K
c99c43d51e
Optimized trmm kernels for CORTEXA57
2015-11-09 14:15:54 +05:30
Ashwin Sekhar T K
1397b47197
Optimized zgemm kernel for CORTEXA57
2015-11-09 14:15:53 +05:30
Ashwin Sekhar T K
45f78963ac
Optimized cgemm kernel for CORTEXA57
...
Also, add a generic ztrmm 4x4 kernel
2015-11-09 14:15:53 +05:30
Ashwin Sekhar T K
402443bf9c
Optimized dgemm kernel for CORTEXA57
2015-11-09 14:15:53 +05:30
Ashwin Sekhar T K
19fdbee291
Improve the sgemm kernel for CORTEXA57
2015-11-09 14:15:53 +05:30
Ashwin Sekhar T K
3b0cdfab1e
Optimized gemv kernels for CORTEXA57
...
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:52 +05:30
Ashwin Sekhar T K
46efa6a1da
Optimized swap kernels for CORTEXA57
...
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:52 +05:30
Ashwin Sekhar T K
ea1465cdf8
Optimized scal kernels for CORTEXA57
...
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:52 +05:30
Ashwin Sekhar T K
fb4be3b3eb
Optimized rot kernels for CORTEXA57
...
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:52 +05:30
Ashwin Sekhar T K
6c2f4ddbcd
Optimized nrm2 kernels for CORTEXA57
2015-11-09 14:15:51 +05:30
Ashwin Sekhar T K
870c4d49c0
Optimized dot kernels for CORTEXA57
...
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:51 +05:30
Ashwin Sekhar T K
cd7684097c
Optimized copy kernels for CORTEXA57
...
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:51 +05:30
Ashwin Sekhar T K
2690b71b1f
Optimized axpy kernels for CORTEXA57
...
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:51 +05:30
Ashwin Sekhar T K
3e4acedf0e
Optimized asum kernels for CORTEXA57
...
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:51 +05:30
Ashwin Sekhar T K
2610752dbb
Optimized iamax kernels for CORTEXA57
...
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:50 +05:30
Ashwin Sekhar T K
dbb213655e
Optimized amax kernels for CORTEXA57
...
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:50 +05:30
Ashwin Sekhar T K
f2f8a0fe8b
Adding arm64 target CORTEXA57
...
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:50 +05:30