3390 Commits

Author SHA1 Message Date
Matt Brown 32c7fe6bff Optimise sasum for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 17:02:10 +10:00
Matt Brown 19bdf9d52b Optimise casum for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 17:00:07 +10:00
Matt Brown 4f09030fdc Optimise cswap for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 16:59:53 +10:00
Matt Brown 6f4eca5ea4 Optimise sswap for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 16:59:13 +10:00
Matt Brown be55f96cbd Optimise scopy for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 16:59:13 +10:00
Matt Brown 96dd0ef4f7 Optimise ccopy for POWER9
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 16:58:59 +10:00
Martin Kroeker 8f0d6c06a9 Fix installation of header files with cmake (#1186)
* Fix installation of header files with cmake 

Install only the required header files, with openblas_config.h preprocessed like in Makefile.install
Fixes #1184

* Update CMakeLists.txt

Escape remaining semicolons in awk argument list (to get it working on Windows as well)

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

* Add files via upload

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

see if it is the single quotes that cause the problem on windows

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

* Use C utility instead of awk for header generation in cmake builds

* Update CMakeLists.txt

* Fix generation and installation of header files

Generate openblas_config.h and f77blas.h with same contents as in plain Makefile builds and install only the public header files
2017-06-01 16:36:26 +02:00
Martin Kroeker 410a07cbec Merge pull request #1190 from oviradoi/utest_make_complex
Update test to use openblas_make_complex_float and openblas_make_comp…
2017-06-01 16:35:52 +02:00
Ovidiu Radoi 72f95a0acc Update test to use openblas_make_complex_float and openblas_make_complex_double functions 2017-05-30 12:12:49 +03:00
Martin Kroeker e545b81e76 Merge pull request #1189 from pawosm-arm/flang
build: Flang has the same interface as PGI
2017-05-28 11:07:57 +02:00
Paul Osmialowski d7afdf9137 build: Flang has the same interface as PGI
Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com>
2017-05-27 06:26:48 +01:00
Martin Kroeker 4f4daaa42a Merge pull request #1188 from pawosm-arm/flang
build: Flang compiler support
2017-05-26 23:02:47 +02:00
Paul Osmialowski 42bbe74791 build: LLVM: Add Flang compiler support and enable OpenMP for Clang
Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com>
2017-05-25 17:03:20 +01:00
Zhang Xianyi c8322c65e4 Merge pull request #1187 from mine260309/develop
build: fix libxlmass errors building on Power CPU
2017-05-24 15:54:58 +08:00
Lei YU 87dde1fde6 build: fix libxlmass errors building on Power CPU
IBM MASS library is upgraded to 8.1.5 and 8.1.3 is not available.
Update README.md and Makefile.power to use version 8.1.5 of libxlmass.
2017-05-24 14:51:52 +08:00
Martin Kroeker 42466e54fa Merge pull request #1182 from martin-frbg/martin-frbg-patch-1
Build shared library on Android without SONAME versioning
2017-05-10 19:39:09 +02:00
Martin Kroeker 3b0624d50f Build shared library on Android without SONAME versioning
Android does not support versioned SONAME entries, ref. #1173
2017-05-10 13:08:13 +02:00
Martin Kroeker fd4e68128e Merge pull request #1178 from jcowgill/mips-fixes
MIPS threading fixes
2017-05-06 17:20:10 +02:00
Martin Kroeker 6464d1723a Merge pull request #1179 from jcowgill/memory-fixes
Fixes to driver/others/memory.c
2017-05-06 13:08:46 +02:00
James Cowgill 59c97cfee4 memory: Fix buffer overflow when position == NUM_BUFFERS 2017-05-05 17:47:03 +01:00
James Cowgill de7875ca5d mips: remove incorrect blas_lock implementations
MIPS 32-bit currently has an empty blas_lock implementation, which is
worse than nothing at all. MIPS 64-bit does have a blas_lock
implementation, but it is broken. Remove them and fall back to the generic
version in common.h, which should do the right thing on MIPS.
2017-05-05 17:28:03 +01:00
James Cowgill 67836c2ab4 mips: implement MB and WMB
The MIPS architecture has weak memory ordering and therefore requires
suitable memory barriers when doing lock-free programming with multiple
threads (just like ARM does). This commit implements those barriers for
MIPS and MIPS64 using GCC builtins, which is probably the easiest way.
2017-05-05 17:14:03 +01:00
James Cowgill 5fecfe0f42 memory: switch loop condition around in blas_memory_free
Before this commit, the "position < NUM_BUFFERS" loop condition from
blas_memory_free will be completely optimized away by GCC. This is
because the condition can only be false after undefined behavior has
already been invoked (reading past the end of an array). As a
consequence of this bug, GCC also removes the subsequent if statement
and all the code after the error label because all of it is dead.

This commit switches the loop condition around so it works as intended.
2017-05-05 16:01:58 +01:00
Martin Kroeker bba6676803 Merge pull request #1175 from martin-frbg/lapack_143
Fix workspace computation in LAPACKE ?tpmqrt
2017-05-05 12:00:04 +02:00
Martin Kroeker 5649b2c53a Merge pull request #1176 from staticfloat/sf/dynamic_arch
Fix DYNAMIC_ARCH=1 breaking builds on non-x86 platforms
2017-05-05 11:59:41 +02:00
Elliot Saba 6e972994b2 Force `DYNAMIC_ARCH` to empty when `DYNAMIC_CORE` is not set 2017-05-04 12:55:31 -07:00
Elliot Saba 5b04cf7ab4 Add Makefile debugging trick so that we can inspect runtime Makefile variables 2017-05-04 11:51:29 -07:00
Martin Kroeker d5ea8fd823 Fix workspace computation for side=L
From netlib PR#144
2017-05-04 20:01:41 +02:00
Martin Kroeker 4beffaaa4b Fix workspace computation for side=L
From netlib PR#144
2017-05-04 19:59:02 +02:00
Martin Kroeker fb28e4adc9 Fix workspace computation for side=L
From netlib PR#144
2017-05-04 19:55:02 +02:00
Martin Kroeker 26faa3ca47 Fix workspace allocation in lapacke_ctp for side=L
from netlib PR #144
2017-05-04 19:49:51 +02:00
Martin Kroeker 4f75989634 Merge pull request #1169 from martin-frbg/cblas_xerbla
Add trivial implementation of cblas_xerbla
2017-05-04 19:32:50 +02:00
Martin Kroeker 1e06b49854 Update xerbla.c 2017-04-26 20:29:30 +02:00
Martin Kroeker 7f546f54fa Add cblas_xerbla 2017-04-26 20:01:34 +02:00
Martin Kroeker a809431e34 Add cblas_xerbla() 2017-04-26 19:58:59 +02:00
Martin Kroeker 5ee1cf0223 Merge pull request #1165 from rcoscali/patch-1
README.md update
2017-04-21 15:14:16 +02:00
Rémi Cohen-Scali 9aea7a0d9a Update README.md 2017-04-21 14:18:57 +02:00
Martin Kroeker da0987507c Merge pull request #1164 from sharkcz/s390x
detect CPU on zArch
2017-04-21 10:53:49 +02:00
Dan Horák 81fed55782 detect CPU on zArch 2017-04-20 21:13:41 +02:00
Martin Kroeker 35387edb8d Merge pull request #1160 from gcp/extra-streamroller-cpuid
Add an extra family/model combination used by AMD Steamroller.
2017-04-19 20:03:23 +02:00
Gian-Carlo Pascutto 9c884986ad Add an extra family/model combination used by AMD Steamroller (Godavari). 2017-04-19 19:15:47 +02:00
Martin Kroeker f2f0e98bb5 Merge pull request #1158 from martin-frbg/force-zen
Make FORCE_ZEN option in getarch.c actually set target names to ZEN
2017-04-19 15:04:41 +02:00
Martin Kroeker 166d64eb7c Fix FORCE_ZEN option in getarch.c 2017-04-19 14:20:42 +02:00
Martin Kroeker e078339e8d Merge pull request #1157 from gcp/revert-zen-param
Revert Zen param.h to Haswell values (instead of Excavator).
2017-04-18 13:32:16 +02:00
Gian-Carlo Pascutto 832a272784 Revert Zen param.h to Haswell values (instead of Excavator). 2017-04-18 12:40:25 +02:00
Martin Kroeker 356606314c Merge pull request #1156 from SoapGentoo/cmake-fixes
Use GNUInstallDirs to allow changing target directories
2017-04-18 09:00:24 +02:00
David Seifert ed79a29d87 Use GNUInstallDirs to allow changing target directories
* Multi-lib distributions need to change the libdir
  which is only portably possible with `GNUInstallDirs`.
* Multi-arch distributions such as Debian and Exherbo
  need to be able to change the bindir.
2017-04-16 00:43:47 +02:00
Martin Kroeker 77d16ffc69 Merge pull request #1154 from sharkcz/s390x
add lapack laswp directory for zarch
2017-04-13 16:37:29 +02:00
Dan Horák 56762d5e4c add lapack laswp for zarch 2017-04-13 15:38:59 +02:00
Zhang Xianyi 90dd190a6d Build shared library for Android. 2017-04-11 12:01:18 +08:00