Martin Kroeker
49e62c0e77
fixed syrk_thread.c taken from wernsaar
...
Stride calculation fix copied from https://github.com/wernsaar/OpenBLAS/commit/88900e1
2017-07-06 17:30:12 +02:00
Martin Kroeker
3381f23709
Handle different object extensions in Makefile
...
The optimized LAPACK functions from interface use OS-dependent suffixes .o/.obj for the object files, while netlib LAPACK uses .o throughout. ReLAPACK object names have to match in order for function replacement in the growing library file to work.
2017-07-06 10:12:00 +02:00
Zhang Xianyi
fa6a920caa
Link -lm or -lm_hard for Android ARMv7.
2017-07-05 17:05:06 +08:00
Zhang Xianyi
a6515bb858
Merge pull request #1218 from m-brow/power9
...
Optimise loads on Power9 LE
2017-07-03 13:48:29 +08:00
Zhang Xianyi
c66b842d66
Merge pull request #1212 from neilsh-msft/develop
...
Add Microsoft Windows 10 UWP build support
2017-07-03 13:43:48 +08:00
Martin Kroeker
df2dfe65d6
Update Makefile
2017-07-02 01:46:23 +02:00
Martin Kroeker
2c8d634619
Add files via upload
2017-07-02 00:50:14 +02:00
Ashwin Sekhar T K
37efb5bc1d
arm: Remove unnecessary files/code
...
Since softfp code has been added to all required vfp kernels,
the code for auto-detection of the ABI is no longer required.
The option to force the softfp ABI on the make command line by giving
ARM_SOFTFP_ABI=1 is retained, but there is no need to give this option
anymore.
Also, the newly added C versions of the 4x4/4x2 gemm/trmm kernels are removed.
These are no longer required. Moreover, these kernels have bugs.
2017-07-02 03:06:36 +05:30
Ashwin Sekhar T K
97d671eb61
arm: add softfp support in zgemm/ztrmm vfp kernels
2017-07-02 02:54:32 +05:30
Ashwin Sekhar T K
305cd2e8b4
arm: add softfp support in cgemm/ctrmm vfp kernels
2017-07-02 02:42:32 +05:30
Ashwin Sekhar T K
09bc6ebe5b
arm: add softfp support in dgemm/dtrmm vfp kernels
2017-07-02 02:24:38 +05:30
Ashwin Sekhar T K
872a11a2bf
arm: add softfp support in sgemm/strmm vfp kernels
2017-07-02 02:23:48 +05:30
Ashwin Sekhar T K
eda9e8632a
generic: Bug fixes in generic 4x2 and 4x4 gemm kernels
2017-07-02 02:00:48 +05:30
Ashwin Sekhar T K
8f83d3f961
arm: add softfp support in vfp gemv kernels
2017-07-02 01:03:31 +05:30
Martin Kroeker
e5e47cfdb5
Merge pull request #1220 from ashwinyes/develop_aarch64_20170701_t99_options
...
arm64: Change mtune/mcpu options for THUNDERX2T99 target
2017-07-01 20:43:23 +02:00
Ashwin Sekhar T K
ebf9e9dabe
arm64: Change mtune/mcpu options for THUNDERX2T99 target
2017-07-01 11:17:10 -07:00
Ashwin Sekhar T K
83bd547517
arm: add softfp support in kernel/arm/swap_vfp.S
2017-07-01 20:37:40 +05:30
Ashwin Sekhar T K
e25f4c01d6
arm: add softfp support in kernel/arm/nrm2_vfp*.S
2017-07-01 19:57:28 +05:30
Ashwin Sekhar T K
54915ce343
arm: add softfp support in kernel/arm/*dot_vfp.S
2017-06-30 23:46:02 +05:30
Ashwin Sekhar T K
0150fabdb6
arm: add softfp support in kernel/arm/rot_vfp.S
2017-06-30 21:52:32 +05:30
Ashwin Sekhar T K
4f0773f07d
arm: add softfp support in kernel/arm/axpy_vfp.S
2017-06-30 20:25:59 +05:30
Ashwin Sekhar T K
aa5edebc80
arm: add softfp support in kernel/arm/asum_vfp.S
2017-06-30 18:21:05 +05:30
Ashwin Sekhar T K
89924b3d5b
arm: Use assembly implementations based on the ARM abi
...
With the softfp ABI, assembly implementations are used only for those
APIs that do not have a floating-point argument or return value.
With the hard ABI, all assembly implementations are used.
2017-06-30 18:21:05 +05:30
Ashwin Sekhar T K
da7f0ff425
generic: add some generic gemm and trmm kernels
...
Added generic 4x4 and 4x2 gemm kernels
Added generic 4x2 trmm kernel
2017-06-30 18:21:05 +05:30
Ashwin Sekhar T K
0d5c8e5386
arm: Determine the abi from compiler if not specified on command line
...
If the ARM ABI is not explicitly specified on the command line, set the
ARM ABI to softfp or hard according to the compiler environment.
This assumes that the compiler sets the defines __ARM_PCS and __ARM_PCS_VFP
accordingly.
2017-06-30 18:20:59 +05:30
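The ABI detection described in the commit above can be illustrated with a minimal sketch. It assumes, as the commit message does, that the compiler predefines __ARM_PCS_VFP for the hard-float calling convention and __ARM_PCS for the base (softfp) convention. The enum and the function name arm_float_abi are hypothetical and are not taken from the OpenBLAS sources.

```c
/* Hypothetical illustration; these names do not come from OpenBLAS. */
enum arm_abi {
    ARM_ABI_UNKNOWN = -1, /* neither macro defined, e.g. a non-ARM compiler */
    ARM_ABI_SOFTFP  = 0,  /* floating-point arguments in integer registers */
    ARM_ABI_HARD    = 1   /* floating-point arguments in VFP registers */
};

/* Report the floating-point calling convention chosen at compile time. */
static enum arm_abi arm_float_abi(void) {
#if defined(__ARM_PCS_VFP)
    return ARM_ABI_HARD;
#elif defined(__ARM_PCS)
    return ARM_ABI_SOFTFP;
#else
    return ARM_ABI_UNKNOWN;
#endif
}
```

The check runs entirely in the preprocessor, so the result is fixed per translation unit. On a compiler that defines neither macro, the function returns ARM_ABI_UNKNOWN rather than guessing.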
Martin Kroeker
912410f214
Add ReLAPACK to Makefiles
2017-06-28 18:15:21 +02:00
Martin Kroeker
b122413fb0
Restore ReLAPACK test folder
2017-06-28 18:13:14 +02:00
Martin Kroeker
9b7b5f7fdc
Add Elmar Peise's ReLAPACK
2017-06-28 17:38:41 +02:00
Neil Shipp
34513be726
Add Microsoft Windows 10 UWP build support
2017-06-23 13:07:34 -07:00
Zhang Xianyi
482015f8d6
Merge branch 'arm_soft_fp_abi' into develop
2017-06-23 11:35:25 +08:00
Zhang Xianyi
639000e34f
Merge pull request #1211 from neilsh-msft/develop
...
Add 64bit support for Microsoft Visual Studio
2017-06-23 11:33:09 +08:00
Neil Shipp
5de7727cc7
Reorder dependencies to allow in-place build to succeed the first time.
2017-06-22 18:05:19 -07:00
Neil Shipp
96df4b9b17
Avoid truncating cblas.h when compiling gencblas target
2017-06-22 17:08:09 -07:00
Neil Shipp
29dc8e0c61
Revert changes to sed and awk
2017-06-21 17:49:57 -07:00
Neil Shipp
65e56cb29d
Add 64bit support for Microsoft Visual Studio
2017-06-21 13:38:22 -07:00
Matt Brown
bd831a03a8
Optimise sscal for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 17:02:46 +10:00
Matt Brown
edc97918f8
Optimise srot for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 17:02:35 +10:00
Matt Brown
e0034de22d
Optimise sdot for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 17:02:19 +10:00
Matt Brown
32c7fe6bff
Optimise sasum for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 17:02:10 +10:00
Matt Brown
19bdf9d52b
Optimise casum for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 17:00:07 +10:00
Matt Brown
4f09030fdc
Optimise cswap for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 16:59:53 +10:00
Matt Brown
6f4eca5ea4
Optimise sswap for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 16:59:13 +10:00
Matt Brown
be55f96cbd
Optimise scopy for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 16:59:13 +10:00
Matt Brown
96dd0ef4f7
Optimise ccopy for POWER9
...
Use lxvd2x instruction instead of lxvw4x.
lxvd2x performs far better on the new POWER architecture than lxvw4x.
2017-06-14 16:58:59 +10:00
Martin Kroeker
8f0d6c06a9
Fix installation of header files with cmake (#1186)
...
* Fix installation of header files with cmake
Install only the required header files, with openblas_config.h preprocessed like in Makefile.install
Fixes #1184
* Update CMakeLists.txt
Escape remaining semicolons in awk argument list (to get it working on Windows as well)
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Add files via upload
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
See if it is the single quotes that cause the problem on Windows
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Use C utility instead of awk for header generation in cmake builds
* Update CMakeLists.txt
* Fix generation and installation of header files
Generate openblas_config.h and f77blas.h with same contents as in plain Makefile builds and install only the public header files
2017-06-01 16:36:26 +02:00
Martin Kroeker
410a07cbec
Merge pull request #1190 from oviradoi/utest_make_complex
...
Update test to use openblas_make_complex_float and openblas_make_comp…
2017-06-01 16:35:52 +02:00
Ovidiu Radoi
72f95a0acc
Update test to use openblas_make_complex_float and openblas_make_complex_double functions
2017-05-30 12:12:49 +03:00
Martin Kroeker
e545b81e76
Merge pull request #1189 from pawosm-arm/flang
...
build: Flang has the same interface as PGI
2017-05-28 11:07:57 +02:00
Paul Osmialowski
d7afdf9137
build: Flang has the same interface as PGI
...
Signed-off-by: Paul Osmialowski <pawel.osmialowski@arm.com>
2017-05-27 06:26:48 +01:00
Martin Kroeker
4f4daaa42a
Merge pull request #1188 from pawosm-arm/flang
...
build: Flang compiler support
2017-05-26 23:02:47 +02:00