3414 Commits

Author SHA1 Message Date
Martin Kroeker ae93532fd3 Merge pull request #1249 from martin-frbg/cgroup
Honor cgroup/cpuset limits when enumerating cpus
2017-07-25 23:31:57 +02:00
Martin Kroeker c4af196a2d Honor cgroup/cpuset limits when enumerating cpus 2017-07-25 22:47:34 +02:00
Martin Kroeker 480e697681 Revert "Honor cgroup/cpuset limits when enumerating cpus" (#1246) 2017-07-24 16:17:50 +02:00
Zhang Xianyi 3c4c472584 Merge pull request #1236 from martin-frbg/l1cache
Use cpuid 4 with subleafs to query L1 cache size on Intel processors
2017-07-24 12:07:00 +08:00
Zhang Xianyi a797666fbe Bump develop version for 0.3.0. 2017-07-24 12:06:29 +08:00
Zhang Xianyi 5dde4e65d3 Merge branch 'develop'
0.2.20 version
2017-07-24 12:03:35 +08:00
Zhang Xianyi 27a9df6477 Update doc for 0.2.20 version. 2017-07-24 11:55:10 +08:00
Zhang Xianyi 7224022473 Merge pull request #1239 from martin-frbg/cgroups
Honor cgroup/cpuset limits when enumerating cpus
2017-07-24 11:46:52 +08:00
Zhang Xianyi 468ac3df9e Merge pull request #1244 from martin-frbg/micmuc_cimatcopy
Fix complex imatcopy for Trans cases with non-square matrix
2017-07-24 11:45:27 +08:00
Martin Kroeker 376048156b Use in-place transform shortcut only if matrix is square 2017-07-21 11:20:15 +02:00
Martin Kroeker d1c5b8f913 Add files via upload 2017-07-20 20:51:06 +02:00
Martin Kroeker 91bde7d315 Exchange rows and cols in final omatcopy with BlasTrans
This is MicMuc's patch from #899
2017-07-15 22:02:53 +02:00
Martin Kroeker 80373ea039 More fixes for silly misedits 2017-07-15 12:48:42 +02:00
Martin Kroeker d12b75a6c4 Fixup braces lost in previous edit 2017-07-15 11:53:28 +02:00
Martin Kroeker 7294fb1d9d Merge branch 'develop' into cgroups 2017-07-15 10:40:42 +02:00
Martin Kroeker 31e086d6a6 Disable ReLAPACK by default (#1238)
* Disable ReLAPACK by default; mention it in final build message if included

* Add files via upload

* Add files via upload

* Add files via upload
2017-07-13 22:01:47 +02:00
Zhang Xianyi cbb47736af Merge pull request #1214 from martin-frbg/relapack
Initial import of ReLAPACK
2017-07-13 20:31:08 +08:00
Zhang Xianyi 2a7c6930ac Merge pull request #1234 from brada4/develop
Fix write past fixed size buffer
2017-07-13 20:27:37 +08:00
Martin Kroeker 00774b1105 Add dummy implementation of cpuid_count for the CPUIDEMU case 2017-07-12 21:56:23 +02:00
Martin Kroeker 6497aae57c Use cpuid 4 with subleafs to query L1 cache size on Intel processors 2017-07-12 20:43:09 +02:00
Martin Kroeker c4ec882020 Merge pull request #1235 from xianyi/revert-1233-cpuid-fix
Revert "Fix unintentional fall-through cases in get_cacheinfo"
2017-07-12 09:37:55 +02:00
Martin Kroeker d33fc32cf3 Revert "Fix unintentional fall-through cases in get_cacheinfo" 2017-07-12 09:35:11 +02:00
Andrew 529bfc36ec Fix write past fixed size buffer 2017-07-12 00:59:30 +02:00
Martin Kroeker 88249ca5f7 Add files via upload 2017-07-11 18:48:13 +02:00
Martin Kroeker 731c518cff Add files via upload 2017-07-11 18:42:39 +02:00
Martin Kroeker 29fc429d9a Honor cgroup/cpuset constraints when enumerating cpus 2017-07-11 18:27:33 +02:00
Martin Kroeker e2d3b1561a Merge pull request #1233 from martin-frbg/cpuid-fix
Fix unintentional fall-through cases in get_cacheinfo
2017-07-11 17:10:55 +02:00
Martin Kroeker 4a012c3d20 Fix unintentional fall-through cases in get_cacheinfo
These appear to be unintended side effects of PR #1091, probably causing #1232
2017-07-11 15:39:15 +02:00
Zhang Xianyi d5ef0dee9a Merge pull request #1226 from ashwinyes/develop_arm_clang_ual_fix
arm: Fix clang compilation for ARMv7
2017-07-10 20:04:42 +08:00
Zhang Xianyi a590e6135c Merge pull request #1221 from ashwinyes/develop_arm_softfp
arm: add support for softfp in arm vfp assembly files
2017-07-10 20:03:57 +08:00
Zhang Xianyi 4239dd65ce Merge branch 'develop' into develop_arm_softfp 2017-07-10 20:02:36 +08:00
Martin Kroeker 3db2adf872 Merge pull request #1230 from martin-frbg/rhel5
Add sched_getcpu implementation for pre-2.6 glibc
2017-07-09 13:16:16 +02:00
Martin Kroeker ad2462811a Do not add -lpthread on Android builds (#1229)
* Do not add -lpthread on Android builds

* Do not add -lpthread on Android cmake builds
2017-07-09 13:15:24 +02:00
Martin Kroeker c1cf62d2c0 Add sched_getcpu implementation for pre-2.6 glibc
Fixes #1210, compilation on RHEL5 with affinity enabled
2017-07-09 09:45:38 +02:00
Zhang Xianyi bfe1656b8b Merge pull request #1225 from martin-frbg/stolen_from_wernsaar_fork
fixed syrk_thread.c taken from wernsaar
2017-07-07 15:43:33 +08:00
Ashwin Sekhar T K f02d535fde arm: Fix clang compilation for ARMv7
clang is not recognizing some pre-UAL VFP mnemonics like fnmacs, fnmacd,
fnmuls and fnmuld. Replaced them with equivalent UAL mnemonics which are
vmls.f32, vmls.f64, vnmul.f32 and vnmul.f64 respectively.
2017-07-07 12:35:58 +05:30
Martin Kroeker 49e62c0e77 fixed syrk_thread.c taken from wernsaar
Stride calculation fix copied from https://github.com/wernsaar/OpenBLAS/commit/88900e1
2017-07-06 17:30:12 +02:00
Martin Kroeker 3381f23709 Handle different object extensions in Makefile
The optimized LAPACK functions from interface use OS-dependent suffixes .o/.obj for the object files, while netlib LAPACK uses .o throughout. ReLAPACK object names have to match in order for function replacement in the growing library file to work.
2017-07-06 10:12:00 +02:00
Zhang Xianyi fa6a920caa Link -lm or -lm_hard for Android ARMv7. 2017-07-05 17:05:06 +08:00
Zhang Xianyi a6515bb858 Merge pull request #1218 from m-brow/power9
Optimise loads on Power9 LE
2017-07-03 13:48:29 +08:00
Zhang Xianyi c66b842d66 Merge pull request #1212 from neilsh-msft/develop
Add Microsoft Windows 10 UWP build support
2017-07-03 13:43:48 +08:00
Martin Kroeker df2dfe65d6 Update Makefile 2017-07-02 01:46:23 +02:00
Martin Kroeker 2c8d634619 Add files via upload 2017-07-02 00:50:14 +02:00
Ashwin Sekhar T K 37efb5bc1d arm: Remove unnecessary files/code
Since softfp code has been added to all required vfp kernels,
the code for auto-detection of the ABI is no longer required.

The option to force softfp ABI on make command line by giving
ARM_SOFTFP_ABI=1 is retained. But there is no need to give this option
anymore.

Also the newly added C versions of 4x4/4x2 gemm/trmm kernels are removed.
These are no longer required. Moreover, these kernels have bugs.
2017-07-02 03:06:36 +05:30
Ashwin Sekhar T K 97d671eb61 arm: add softfp support in zgemm/ztrmm vfp kernels 2017-07-02 02:54:32 +05:30
Ashwin Sekhar T K 305cd2e8b4 arm: add softfp support in cgemm/ctrmm vfp kernels 2017-07-02 02:42:32 +05:30
Ashwin Sekhar T K 09bc6ebe5b arm: add softfp support in dgemm/dtrmm vfp kernels 2017-07-02 02:24:38 +05:30
Ashwin Sekhar T K 872a11a2bf arm: add softfp support in sgemm/strmm vfp kernels 2017-07-02 02:23:48 +05:30
Ashwin Sekhar T K eda9e8632a generic: Bug fixes in generic 4x2 and 4x4 gemm kernels 2017-07-02 02:00:48 +05:30
Ashwin Sekhar T K 8f83d3f961 arm: add softfp support in vfp gemv kernels 2017-07-02 01:03:31 +05:30