OpenBLAS/kernel
Martin Kroeker 9e2f316ede Power8 inline assembly fixes
Quoting patch author amodra from #1078
Lots of issues here.
- The vsx regs weren't listed as clobbered.
- Poor choice of vsx regs, which along with the lack of clobbers led to
  trashing v0..v21 and fr14..fr23.  Ideally you'd let gcc choose all
  temp vsx regs, but asms currently have a limit of 30 i/o parms.
- Other regs were clobbered unnecessarily, seemingly in an attempt to
  clobber inputs, with gcc-7 complaining about the clobber of r2.
  (Changed inputs should be also listed as outputs or as an i/o.)
- "r" constraint used instead of "b" for gprs used in insns where the
  r0 encoding means zero rather than r0.
- There were unused asm inputs too.
- All memory was clobbered rather than hooking up memory outputs with
  proper memory constraints, and that and the lack of proper memory
  input constraints meant the asms needed to be volatile and their
  containing function noinline.
- Some parameters were being passed unnecessarily via memory.
- When a copy of a
2017-02-13 23:38:50 +01:00
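As a rough illustration of several points in the commit message above (explicit vsx clobbers, "b" rather than "r" where an r0 encoding would mean zero, and memory constraints in place of a blanket "memory" clobber), here is a minimal sketch in C. The function add_pair is hypothetical and does not come from the OpenBLAS sources; the choice of vs32/vs33 as temporaries is likewise an assumption made for the example. The Power branch is only compiled on a VSX-capable powerpc64 target, and a plain C fallback is used everywhere else.

```c
#include <stddef.h>

/* Hypothetical helper (not part of OpenBLAS): stores x[0] + x[1] to *out. */
static void add_pair(const double *x, double *out)
{
#if defined(__powerpc64__) && defined(__VSX__)
    /*
     * Operands: %0 = *out (memory output), %1 = x[0..1] (memory input),
     *           %2 = x, %3 = byte offset of x[1], %4 = out.
     * - "=m"/"m" tell the compiler exactly which memory is written and read,
     *   so no "memory" clobber and no volatile are needed.
     * - %3 is used in the RA slot of lxsdx, where encoding r0 means the
     *   value zero, so it uses "b" (any gpr except r0) instead of "r".
     * - vs32 and vs33 are fixed temporaries and are listed as clobbered.
     */
    __asm__ (
        "lxsdx   32, 0, %2   \n\t"  /* vs32 = x[0]          */
        "lxsdx   33, %3, %2  \n\t"  /* vs33 = x[1]          */
        "xsadddp 32, 32, 33  \n\t"  /* vs32 = vs32 + vs33   */
        "stxsdx  32, 0, %4   \n\t"  /* *out = vs32          */
        : "=m" (*out)
        : "m" (*(const double (*)[2]) x),
          "r" (x),
          "b" ((long) sizeof(double)),
          "r" (out)
        : "vs32", "vs33"
    );
#else
    /* Portable fallback so the sketch builds on any target. */
    *out = x[0] + x[1];
#endif
}
```

Because the asm declares its real inputs, outputs and clobbers, the compiler is free to inline the containing function and to schedule around the asm statement, which is the motivation given in the commit message for dropping volatile and noinline.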
..
alpha Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
arm Update zdot.c 2016-10-05 18:57:14 +02:00
arm64 THUNDERX2T99: Add optimized S/D/C/Z COPY Implementations 2017-02-02 15:26:38 +05:30
generic remove dead code 2016-10-31 12:46:56 +01:00
ia64 Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
mips Added rot functions. 2017-01-17 12:15:07 +05:30
mips64 Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions 2016-07-15 18:38:25 +05:30
power Power8 inline assembly fixes 2017-02-13 23:38:50 +01:00
sparc Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
x86 Fix cmake bug on MSVC 32-bit. 2015-10-26 14:52:13 -05:00
x86_64 Change file comments to work around clang 3.9 assembler bug 2016-10-13 16:51:08 +02:00
zarch dtrmm and dgemm for z13 2017-01-04 19:32:33 +04:00
CMakeLists.txt Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR 2016-05-25 09:13:28 +02:00
Makefile MIPS n32 ABI and build time mips simd support check 2016-08-10 17:44:22 +05:30
Makefile.L1 Remove duplicate -D args in kernel/Makefile.L1 2015-11-09 14:15:48 +05:30
Makefile.L2 Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
Makefile.L3 Merge branch 'z13' into develop 2017-01-09 05:52:42 -05:00
Makefile.LA Support NO_LAPACK=1 to build the lib without LAPACK functions. 2011-03-04 11:51:32 +08:00
setparam-ref.c prepared kernel/setparam-ref.c for UNROLL values, that are not a power of two 2017-01-11 11:56:50 +01:00