Matt Brown
32c7fe6bff
Optimise sasum for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 17:02:10 +10:00
Matt Brown
19bdf9d52b
Optimise casum for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 17:00:07 +10:00
Matt Brown
4f09030fdc
Optimise cswap for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 16:59:53 +10:00
Matt Brown
6f4eca5ea4
Optimise sswap for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 16:59:13 +10:00
Matt Brown
be55f96cbd
Optimise scopy for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 16:59:13 +10:00
Matt Brown
96dd0ef4f7
Optimise ccopy for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 16:58:59 +10:00
Martin Kroeker
8f0d6c06a9
Fix installation of header files with cmake ( #1186 )
...
* Fix installation of header files with cmake
Install only the required header files, with openblas_config.h preprocessed like in Makefile.install
Fixes #1184
* Update CMakeLists.txt
Escape remaining semicolons in awk argument list (to get it working on Windows as well)
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Add files via upload
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
See if it is the single quotes that cause the problem on Windows
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Use C utility instead of awk for header generation in cmake builds
* Update CMakeLists.txt
* Fix generation and installation of header files
Generate openblas_config.h and f77blas.h with same contents as in plain Makefile builds and install only the public header files
2017-06-01 16:36:26 +02:00
Martin Kroeker
410a07cbec
Merge pull request #1190 from oviradoi/utest_make_complex
...
Update test to use openblas_make_complex_float and openblas_make_comp…
2017-06-01 16:35:52 +02:00
Ovidiu Radoi
72f95a0acc
Update test to use openblas_make_complex_float and openblas_make_complex_double functions
2017-05-30 12:12:49 +03:00
Martin Kroeker
e545b81e76
Merge pull request #1189 from pawosm-arm/flang
...
build: Flang has the same interface as PGI
2017-05-28 11:07:57 +02:00
Paul Osmialowski
d7afdf9137
build: Flang has the same interface as PGI
...
Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com>
2017-05-27 06:26:48 +01:00
Martin Kroeker
4f4daaa42a
Merge pull request #1188 from pawosm-arm/flang
...
build: Flang compiler support
2017-05-26 23:02:47 +02:00
Paul Osmialowski
42bbe74791
build: LLVM: Add Flang compiler support and enable OpenMP for Clang
...
Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com>
2017-05-25 17:03:20 +01:00
Zhang Xianyi
c8322c65e4
Merge pull request #1187 from mine260309/develop
...
build: fix libxlmass errors building on Power CPU
2017-05-24 15:54:58 +08:00
Lei YU
87dde1fde6
build: fix libxlmass errors building on Power CPU
...
The IBM MASS library has been upgraded to 8.1.5, and 8.1.3 is no longer available.
Update README.md and Makefile.power to use version 8.1.5 of libxlmass.
2017-05-24 14:51:52 +08:00
Martin Kroeker
42466e54fa
Merge pull request #1182 from martin-frbg/martin-frbg-patch-1
...
Build shared library on Android without SONAME versioning
2017-05-10 19:39:09 +02:00
Martin Kroeker
3b0624d50f
Build shared library on Android without SONAME versioning
...
Android does not support versioned SONAME entries, ref. #1173
2017-05-10 13:08:13 +02:00
Martin Kroeker
fd4e68128e
Merge pull request #1178 from jcowgill/mips-fixes
...
MIPS threading fixes
2017-05-06 17:20:10 +02:00
Martin Kroeker
6464d1723a
Merge pull request #1179 from jcowgill/memory-fixes
...
Fixes to driver/others/memory.c
2017-05-06 13:08:46 +02:00
James Cowgill
59c97cfee4
memory: Fix buffer overflow when position == NUM_BUFFERS
2017-05-05 17:47:03 +01:00
James Cowgill
de7875ca5d
mips: remove incorrect blas_lock implementations
...
MIPS 32-bit currently has an empty blas_lock implementation, which is
worse than nothing at all. MIPS 64-bit does have a blas_lock
implementation, but it is broken. Remove them and fall back to the generic
version in common.h, which should do the right thing on MIPS.
2017-05-05 17:28:03 +01:00
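For readers wondering what a working lock needs that an empty one lacks, here is a minimal sketch of a compare-and-swap spinlock built on GCC's `__sync` builtins. It is an illustration only and not a copy of the generic version in common.h; the names `sketch_lock_t`, `sketch_lock`, `sketch_trylock` and `sketch_unlock` are hypothetical.

```c
/* Hypothetical names throughout; this is an illustration of the idea,
 * not the code that lives in common.h. */
typedef volatile unsigned long sketch_lock_t;

/* Try once to move the lock word from 0 (free) to 1 (held).
 * Returns 1 on success, 0 if the lock was already held. */
static int sketch_trylock(sketch_lock_t *address) {
    return __sync_bool_compare_and_swap(address, 0UL, 1UL);
}

/* Spin until the lock is acquired. */
static void sketch_lock(sketch_lock_t *address) {
    do {
        /* Read-only spin while another thread holds the lock. */
        while (*address) { }
    } while (!sketch_trylock(address));
}

/* Release the lock: the builtin writes 0 with release semantics. */
static void sketch_unlock(sketch_lock_t *address) {
    __sync_lock_release(address);
}
```

An empty lock function returns immediately without ever taking ownership, so two threads can enter the protected region at once; the atomic compare-and-swap above is what prevents that.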
James Cowgill
67836c2ab4
mips: implement MB and WMB
...
The MIPS architecture has weak memory ordering and therefore requires
suitable memory barriers when doing lock-free programming with multiple
threads (just like ARM does). This commit implements those barriers for
MIPS and MIPS64 using GCC builtins, which is probably the easiest way.
2017-05-05 17:14:03 +01:00
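A minimal sketch of how such barriers are typically used, assuming `MB` and `WMB` are both mapped to GCC's full-barrier builtin `__sync_synchronize()`. That mapping is an assumption made for illustration, and the `publish`/`try_consume` helpers and their variables are hypothetical.

```c
/* Assumption: both barriers expand to GCC's full memory barrier. */
#define MB  __sync_synchronize()
#define WMB __sync_synchronize()

static int payload;          /* data handed from producer to consumer */
static volatile int ready;   /* flag announcing that payload is valid */

/* Producer side: write the data, then publish the flag. */
static void publish(int value) {
    payload = value;
    WMB;        /* the payload store must be visible before the flag store */
    ready = 1;
}

/* Consumer side: returns 1 and fills *out if data was published, else 0. */
static int try_consume(int *out) {
    if (!ready)
        return 0;
    MB;         /* the flag load must be ordered before the payload load */
    *out = payload;
    return 1;
}
```

Without the barriers, a weakly ordered CPU may make the flag visible before the payload, so a consumer on another core could read stale data.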
James Cowgill
5fecfe0f42
memory: switch loop condition around in blas_memory_free
...
Before this commit, the "position < NUM_BUFFERS" loop condition in
blas_memory_free was completely optimized away by GCC. This is
because the condition can only be false after undefined behavior has
already been invoked (reading past the end of an array). As a
consequence of this bug, GCC also removes the subsequent if statement
and all the code after the error label because all of it is dead.
This commit switches the loop condition around so it works as intended.
2017-05-05 16:01:58 +01:00
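The ordering issue described in this commit can be sketched as follows. This is a simplified stand-in rather than the code in driver/others/memory.c: the `buffers` table, its size and the `find_slot` helper are hypothetical.

```c
#define NUM_BUFFERS 4            /* illustrative size only */

struct slot { void *addr; };
static struct slot buffers[NUM_BUFFERS];   /* hypothetical table */

/* Returns the index of the slot holding `area`, or -1 if it is absent. */
static int find_slot(void *area) {
    int position = 0;

    /* Bounds check FIRST: && short-circuits, so buffers[position] is
     * never read once position reaches NUM_BUFFERS. With the operands
     * swapped, the array read happens before the check, and the
     * compiler may assume the check can never fail. */
    while (position < NUM_BUFFERS && buffers[position].addr != area)
        position++;

    /* Test the index itself before touching the array again. */
    if (position == NUM_BUFFERS)
        return -1;
    return position;
}
```

Because the out-of-range read would be undefined behavior, a compiler is entitled to treat the "not found" path as unreachable when the array access comes first; checking the index first keeps that path alive.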
Martin Kroeker
bba6676803
Merge pull request #1175 from martin-frbg/lapack_143
...
Fix workspace computation in LAPACKE ?tpmqrt
2017-05-05 12:00:04 +02:00
Martin Kroeker
5649b2c53a
Merge pull request #1176 from staticfloat/sf/dynamic_arch
...
Fix DYNAMIC_ARCH=1 breaking builds on non-x86 platforms
2017-05-05 11:59:41 +02:00
Elliot Saba
6e972994b2
Force `DYNAMIC_ARCH` to empty when `DYNAMIC_CORE` is not set
2017-05-04 12:55:31 -07:00
Elliot Saba
5b04cf7ab4
Add Makefile debugging trick so that we can inspect runtime Makefile variables
2017-05-04 11:51:29 -07:00
Martin Kroeker
d5ea8fd823
Fix workspace computation for side=L
...
From netlib PR#144
2017-05-04 20:01:41 +02:00
Martin Kroeker
4beffaaa4b
Fix workspace computation for side=L
...
From netlib PR#144
2017-05-04 19:59:02 +02:00
Martin Kroeker
fb28e4adc9
Fix workspace computation for side=L
...
From netlib PR#144
2017-05-04 19:55:02 +02:00
Martin Kroeker
26faa3ca47
Fix workspace allocation in lapacke_ctp for side=L
...
from netlib PR #144
2017-05-04 19:49:51 +02:00
Martin Kroeker
4f75989634
Merge pull request #1169 from martin-frbg/cblas_xerbla
...
Add trivial implementation of cblas_xerbla
2017-05-04 19:32:50 +02:00
Martin Kroeker
1e06b49854
Update xerbla.c
2017-04-26 20:29:30 +02:00
Martin Kroeker
7f546f54fa
Add cblas_xerbla
2017-04-26 20:01:34 +02:00
Martin Kroeker
a809431e34
Add cblas_xerbla()
2017-04-26 19:58:59 +02:00
Martin Kroeker
5ee1cf0223
Merge pull request #1165 from rcoscali/patch-1
...
README.md update
2017-04-21 15:14:16 +02:00
Rémi Cohen-Scali
9aea7a0d9a
Update README.md
2017-04-21 14:18:57 +02:00
Martin Kroeker
da0987507c
Merge pull request #1164 from sharkcz/s390x
...
detect CPU on zArch
2017-04-21 10:53:49 +02:00
Dan Horák
81fed55782
detect CPU on zArch
2017-04-20 21:13:41 +02:00
Martin Kroeker
35387edb8d
Merge pull request #1160 from gcp/extra-streamroller-cpuid
...
Add an extra family/model combination used by AMD Steamroller.
2017-04-19 20:03:23 +02:00
Gian-Carlo Pascutto
9c884986ad
Add an extra family/model combination used by AMD Steamroller (Godavari).
2017-04-19 19:15:47 +02:00
Martin Kroeker
f2f0e98bb5
Merge pull request #1158 from martin-frbg/force-zen
...
Make FORCE_ZEN option in getarch.c actually set target names to ZEN
2017-04-19 15:04:41 +02:00
Martin Kroeker
166d64eb7c
Fix FORCE_ZEN option in getarch.c
2017-04-19 14:20:42 +02:00
Martin Kroeker
e078339e8d
Merge pull request #1157 from gcp/revert-zen-param
...
Revert Zen param.h to Haswell values (instead of Excavator).
2017-04-18 13:32:16 +02:00
Gian-Carlo Pascutto
832a272784
Revert Zen param.h to Haswell values (instead of Excavator).
2017-04-18 12:40:25 +02:00
Martin Kroeker
356606314c
Merge pull request #1156 from SoapGentoo/cmake-fixes
...
Use GNUInstallDirs to allow changing target directories
2017-04-18 09:00:24 +02:00
David Seifert
ed79a29d87
Use GNUInstallDirs to allow changing target directories
...
* Multi-lib distributions need to change the libdir
which is only portably possible with `GNUInstallDirs`.
* Multi-arch distributions such as Debian and Exherbo
need to be able to change the bindir.
2017-04-16 00:43:47 +02:00
Martin Kroeker
77d16ffc69
Merge pull request #1154 from sharkcz/s390x
...
add lapack laswp directory for zarch
2017-04-13 16:37:29 +02:00
Dan Horák
56762d5e4c
add lapack laswp for zarch
2017-04-13 15:38:59 +02:00
Zhang Xianyi
90dd190a6d
Build shared library for Android.
2017-04-11 12:01:18 +08:00