3078 Commits

Author SHA1 Message Date
Pauli Virtanen 845e6d750f Add trivial smoketest for xpotrf 2017-09-30 19:07:54 +02:00
Tim Moon a89d6711c6 Increasing flexibility of GEMM benchmark.
m, n, and k can be set to arbitrary constants. A and B matrices can be transposed independently.
2017-09-28 12:56:29 -07:00
Martin Kroeker 9c017a2218 Save and restore VSX registers 2017-09-28 12:17:09 +02:00
Tim Moon 0e6b11b708 Merge https://github.com/timmoon10/OpenBLAS into develop 2017-09-27 19:26:38 -07:00
Tim Moon 6aaa107865 Reducing threads for multi-threaded GEMMs on small matrices. 2017-09-27 19:25:33 -07:00
Martin Kroeker 00c42dc815 Merge pull request #1314 from martin-frbg/nofortran-fix-2
Rewrite NOFORTRAN conditionals
2017-09-26 10:34:18 +02:00
Martin Kroeker 79e754e548 Rewrite NOFORTRAN conditionals
... so that they do not trigger accidentally when NOFORTRAN is empty/unset
2017-09-25 23:45:14 +02:00
Martin Kroeker 2ccd7f6e0c Merge pull request #1310 from sva-img/develop
Added mips I6500 core
2017-09-22 09:34:54 +02:00
Shivraj Patil e3d844b062 Added mips I6500 core
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2017-09-22 11:57:43 +05:30
Martin Kroeker def146efed Merge pull request #1308 from sebastien-villemot/develop
Add support for TARGET=ZARCH_GENERIC and TARGET=Z13
2017-09-19 14:04:37 +02:00
Sébastien Villemot 7543e578a4 Add support for TARGET=ZARCH_GENERIC and TARGET=Z13 2017-09-19 12:16:42 +02:00
Martin Kroeker 601c71fe54 Merge pull request #1304 from martin-frbg/aix-build-fixes
(Plain make) build system fixes for AIX
2017-09-18 10:16:40 +02:00
Martin Kroeker 3810a6fd99 (Plain make) build system fixes for AIX
- retry fortran compiler test with aix-specific option if generic -m32/-m64 fails
- pass any custom ARFLAGS to lapack
- no addition of -m32/-m64 to the CFLAGS and FFLAGS on AIX
2017-09-18 01:29:21 +02:00
Martin Kroeker 742f54c235 Merge pull request #1303 from martin-frbg/imatcopy-rowscols
Fix cols/rows mixup in omatcopy 2nd step for BlasTrans cases
2017-09-14 21:46:26 +02:00
Martin Kroeker d674fbb4c7 Fix cols/rows mixup in omatcopy 2nd step for BlasTrans cases
Equivalent of #1244 (issue #899) for the non-complex cases. Fixes #1289
2017-09-14 19:59:05 +02:00
Martin Kroeker 2922c15f36 Merge pull request #1302 from martin-frbg/nofortran-fix
Remove default FEXTRALIBS in NOFORTRAN case
2017-09-14 11:54:20 +02:00
Martin Kroeker 3a245a376f Remove default FEXTRALIBS in NOFORTRAN case 2017-09-14 09:21:04 +02:00
Martin Kroeker 46c9357c72 Merge pull request #1288 from quickwritereader/develop
Optimized standard Blas Level-1,2 (excluding nrm2 functions) for z13 (double precision). Issue 884
2017-09-09 23:47:17 +02:00
Martin Kroeker 1c3e2d3dd5 Merge pull request #1293 from embray/cygwin/install
More canonical installation on Cygwin
2017-09-09 23:46:27 +02:00
Martin Kroeker f66d908282 Merge pull request #1299 from martin-frbg/race_fixes
Fix thread data races uncovered by gcc thread sanitizer
2017-09-09 23:41:53 +02:00
Martin Kroeker ba1f91f17b Convert another caller of "allocation" to LOCK_COMMAND
... as the "allocation" code jumped to now does UNLOCK_COMMAND instead of blas_unlock
2017-09-09 20:30:33 +02:00
Martin Kroeker f460776f0f Fix thread data races 2017-09-09 19:07:06 +02:00
Martin Kroeker e882f3d6f3 Fix thread data race in memory.c 2017-09-09 18:58:38 +02:00
Erik M. Bray dddedbab5d More canonical installation on Cygwin:
* The DLL is named cygopenblas.dll, not libopenblas.dll
* The import lib (still called libopenblas.dll.a) is installed
2017-09-07 14:18:56 +02:00
Abdurrauf 1cfdb2295d Optimized standard Blas Level-1,2 (excluding nrm2 functions) for z13 (double precision) 2017-09-06 16:41:08 +04:00
Martin Kroeker 00740c0e34 Merge pull request #1290 from martin-frbg/imatcopy
Use in-place transform shortcut only if matrix is square
2017-09-03 13:02:10 +02:00
Martin Kroeker 254db9bd7c Use in-place transform shortcut only if matrix is square 2017-09-03 09:52:55 +02:00
Martin Kroeker f2074f9ac1 Merge pull request #1286 from martin-frbg/baytrail
Fix coretype detection for Bay Trail Atom
2017-08-27 13:23:57 +02:00
Martin Kroeker aece65ea29 Fix coretype detection for Bay Trail Atom
My earlier PR #982 appears to have been incomplete in this regard - fixes #1285
2017-08-27 13:06:54 +02:00
Sacha ef64991506 Clean up config file writing. 2017-08-23 12:47:38 +10:00
Sacha 7a867082d8 Fix open_blas.config which was never working out-of-source. Remove need for gen_config_h.exe. If OpenMP is requested, do not silently ignore when it isn't available. 2017-08-23 11:16:24 +10:00
Sacha Refshauge a1b87eac6b Do not require Perl for MSVC if CMake >= 3.4 2017-08-23 07:19:02 +10:00
Sacha Refshauge 47ebce4d1a Clean up, fix old typos. Simplify arch usages. Move system arch check to earlier position. 2017-08-21 00:37:29 +10:00
Sacha Refshauge 69b560751c Improvements to previous commit (cross-compile).
Fix typos and bad if statements discovered in 0.2.20.
2017-08-20 22:50:31 +10:00
Sacha Refshauge 0a7a527a92 Add support for cross compiling.
Add support for not having host compiler as CMake cannot detect such a compiler.
Add support for not using getarch.
Successfully builds Android ARMV8. Any target can be added by supplying the TARGET_CORE config in prebuild.cmake.
2017-08-20 20:08:53 +10:00
Martin Kroeker 50715e8945 Merge pull request #1281 from sharkcz/armv8
fix detection of generic ARMv8 CPUs
2017-08-19 20:37:19 +02:00
Sacha Refshauge 11911fd941 Add kernel/Makefile.LA to CMake 2017-08-20 00:59:14 +10:00
Sacha Refshauge 408b4fe83f Add a CMake GCC and Clang target to Travis CI 2017-08-20 00:59:00 +10:00
Sacha Refshauge 4474465438 Remove _static usages for tests 2017-08-20 00:13:46 +10:00
Sacha Refshauge b9ec72546c Only run utest without NOFORTRAN, same as Makefile. Linux now compiles. 2017-08-20 00:13:24 +10:00
Sacha Refshauge 37858d1146 Fix threading usage in CMake: s/SMP/USE_THREAD/ 2017-08-19 15:07:42 +10:00
Dan Horák 1763e01567 fix detection of generic ARMv8 CPUs 2017-08-18 14:53:29 +02:00
Sacha Refshauge 6aac06587d Fix typos and use CMake OpenMP support. 2017-08-17 17:27:01 +10:00
7c1acc07f0 Fix bug that required fortran. Fix bug that needed CXX var. Remove redundant set vars. Fix threading detection. Do not attempt to run code if cross compiling. 2017-08-17 03:32:04 +10:00
38d273ea03 Drop some redundant vars and improve arch detection in CMake. 2017-08-17 02:04:36 +10:00
7242cdc4ec Allow CMake to determine if it is building static or shared. 2017-08-17 00:51:04 +10:00
90a4dab501 Let CMake deal with build type. 2017-08-17 00:35:54 +10:00
Martin Kroeker a8a342ccc4 Merge pull request #1277 from cconrads-scicomp/fix-installation-instructions
Make: fix installation instructions
2017-08-10 23:42:23 +02:00
Martin Kroeker 9e9a9553db Merge pull request #1276 from cconrads-scicomp/android_-lm_fix
ARM: do not add linker flag `-lm` unconditionally
2017-08-10 21:35:32 +02:00
Martin Kroeker be7c1b6324 Merge pull request #1275 from cconrads-scicomp/recognize-gfortran-on-arm
ARM: recognize gfortran pre-releases
2017-08-10 21:32:09 +02:00