OpenBLAS/kernel
Rajalakshmi Srinivasaraghavan ad745c0bae Optimize scopy/ccopy for POWER10
This patch uses the new POWER10 vector pair instructions for
loads and stores. It also reorganizes all variants of the copy
functions to use the same kernel.
2020-10-21 09:53:45 -05:00
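The idea of routing every copy variant through one kernel can be sketched in portable C. This is a minimal illustration, not the code from the commit: the names copy_kernel_sketch, scopy_sketch and ccopy_sketch are hypothetical, the 32-byte block only mirrors the width of a POWER10 vector pair (memcpy stands in for the actual vector pair loads and stores), and non-negative strides are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical shared kernel: copies n contiguous floats.
 * Each block is 8 floats (32 bytes), the width of one vector pair.
 * memcpy is a portable stand-in for real vector pair loads/stores. */
static void copy_kernel_sketch(size_t n, const float *src, float *dst)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8)
        memcpy(dst + i, src + i, 8 * sizeof(float));
    for (; i < n; i++)          /* scalar tail for the remainder */
        dst[i] = src[i];
}

/* Real single-precision copy; strides are assumed non-negative. */
void scopy_sketch(size_t n, const float *x, size_t incx,
                  float *y, size_t incy)
{
    assert(n == 0 || (x != NULL && y != NULL));
    if (incx == 1 && incy == 1) {
        copy_kernel_sketch(n, x, y);    /* contiguous fast path */
        return;
    }
    for (size_t i = 0; i < n; i++)      /* strided fallback */
        y[i * incy] = x[i * incx];
}

/* Complex single-precision copy. Each element is a (re, im) pair of
 * floats and strides count complex elements, so a contiguous copy of
 * n complex values is a contiguous copy of 2*n floats and can reuse
 * the same kernel. */
void ccopy_sketch(size_t n, const float *x, size_t incx,
                  float *y, size_t incy)
{
    assert(n == 0 || (x != NULL && y != NULL));
    if (incx == 1 && incy == 1) {
        copy_kernel_sketch(2 * n, x, y);
        return;
    }
    for (size_t i = 0; i < n; i++) {
        y[2 * i * incy]     = x[2 * i * incx];
        y[2 * i * incy + 1] = x[2 * i * incx + 1];
    }
}
```

With unit strides, both wrappers reduce to a single call into the shared kernel, so only that one routine would need a platform-specific implementation.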
alpha Add implementations of ssum/dsum and csum/zsum 2019-03-30 22:05:11 +01:00
arm Add double precision universal intrinsics for X86/ARM 2020-10-15 10:29:42 +08:00
arm64 Merge pull request #2867 from Qiyu8/usimd-floatdot 2020-10-10 12:10:25 +02:00
generic Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:56:17 +02:00
ia64 Add ia64 implementation of ?sum 2019-03-30 22:18:03 +01:00
mips Delete KERNEL.1004K 2020-04-19 15:44:30 +02:00
mips64 Fix compilation problem on loongson platform 2020-04-09 19:28:15 +08:00
power Optimize scopy/ccopy for POWER10 2020-10-21 09:53:45 -05:00
simd Revert "add double precision SSE" 2020-10-15 08:37:02 +02:00
sparc Add SPARC implementation of ?sum 2019-03-30 22:25:06 +01:00
x86 Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
x86_64 Fix build with -Werror=return-type 2020-10-21 08:43:39 +02:00
zarch s390x: fix cscal and zscal implementations 2020-09-21 13:10:05 +02:00
CMakeLists.txt Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:00:55 +02:00
Makefile fix core list for sse/sse2 2020-10-16 09:55:48 +02:00
Makefile.L1 Fix build issues with bfloat16 2020-10-13 11:00:22 -05:00
Makefile.L2 Allow compiling only a subset of kernels for specific variable types 2020-10-11 14:52:09 +02:00
Makefile.L3 Fix build issues with bfloat16 2020-10-13 11:00:22 -05:00
Makefile.LA Support NO_LAPACK=1 to build the lib without LAPACK functions. 2011-03-04 11:51:32 +08:00
setparam-ref.c Use ifdef instead of if 2020-10-15 19:05:37 +02:00