Commit Graph

2473 Commits

Author SHA1 Message Date
Martin Kroeker 254db9bd7c Use in-place transform shortcut only if matrix is square 2017-09-03 09:52:55 +02:00
Martin Kroeker 1e9247c276 Merge pull request #1260 from xianyi/revert-1254-xbmv_range
Revert "Fix calculated range limit exceeding actual data size for last thread"
2017-08-01 20:07:32 +02:00
Martin Kroeker a6f533b248 Revert "Fix calculated range limit exceeding actual data size for last thread" 2017-08-01 19:28:08 +02:00
Martin Kroeker e70a6b92bf Merge pull request #1257 from martin-frbg/cgroups-prereq
Rework __GLIBC_PREREQ checks to avoid breaking non-glibc builds
2017-08-01 11:23:03 +02:00
Martin Kroeker 63cfa32691 Rework __GLIBC_PREREQ checks to avoid breaking non-glibc builds 2017-07-31 21:02:43 +02:00
Martin Kroeker d537e0de8c Merge pull request #1254 from martin-frbg/xbmv_range
Fix calculated range limit exceeding actual data size for last thread
2017-07-31 17:46:40 +02:00
Martin Kroeker 585c0010a5 Fix range limit exceeding actual data size in last step 2017-07-28 00:27:02 +02:00
Martin Kroeker 857f61bc5d Fix range limit exceeding data size in last step 2017-07-28 00:21:53 +02:00
Martin Kroeker 9332042d5f Fix range exceeding actual data size in quick_divide 2017-07-28 00:13:24 +02:00
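The three fixes above all deal with a computed chunk boundary running past the end of the data in the final step. The following is a minimal sketch of the general clamping pattern, not the actual OpenBLAS code; `split_range` and its parameters are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from OpenBLAS: splits [0, n) into
 * nthreads chunks and writes the nthreads + 1 boundaries into range[].
 * The chunk width is rounded up, so the running upper bound is clamped
 * to n to keep the last chunk(s) inside the data. */
static void split_range(size_t n, size_t nthreads, size_t *range)
{
    assert(nthreads > 0);
    size_t width = (n + nthreads - 1) / nthreads; /* ceiling division */

    range[0] = 0;
    for (size_t i = 0; i < nthreads; i++) {
        size_t end = range[i] + width;
        if (end > n)
            end = n; /* clamp: never exceed the actual data size */
        range[i + 1] = end;
    }
}
```

Without the clamp, `n = 10` and `nthreads = 4` would give a final boundary of 12, two elements past the end of the data.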
Martin Kroeker ae93532fd3 Merge pull request #1249 from martin-frbg/cgroup
Honor cgroup/cpuset limits when enumerating cpus
2017-07-25 23:31:57 +02:00
Martin Kroeker c4af196a2d Honor cgroup/cpuset limits when enumerating cpus 2017-07-25 22:47:34 +02:00
Martin Kroeker 480e697681 Revert "Honor cgroup/cpuset limits when enumerating cpus" (#1246) 2017-07-24 16:17:50 +02:00
Zhang Xianyi 3c4c472584 Merge pull request #1236 from martin-frbg/l1cache
Use cpuid 4 with subleafs to query L1 cache size on Intel processors
2017-07-24 12:07:00 +08:00
Zhang Xianyi a797666fbe Bump develop version for 0.3.0. 2017-07-24 12:06:29 +08:00
Zhang Xianyi 5dde4e65d3 Merge branch 'develop'
0.2.20 version
2017-07-24 12:03:35 +08:00
Zhang Xianyi 27a9df6477 Update doc for 0.2.20 version. 2017-07-24 11:55:10 +08:00
Zhang Xianyi 7224022473 Merge pull request #1239 from martin-frbg/cgroups
Honor cgroup/cpuset limits when enumerating cpus
2017-07-24 11:46:52 +08:00
Zhang Xianyi 468ac3df9e Merge pull request #1244 from martin-frbg/micmuc_cimatcopy
Fix complex imatcopy for Trans cases with non-square matrix
2017-07-24 11:45:27 +08:00
Martin Kroeker 376048156b Use in-place transform shortcut only if matrix is square 2017-07-21 11:20:15 +02:00
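The commit title above restricts an in-place shortcut to square matrices. As a rough illustration of why, here is a sketch under stated assumptions: a row-major layout, and a hypothetical `transpose_inplace` helper that is not the OpenBLAS imatcopy implementation. Swapping across the diagonal only works when rows equal cols; otherwise each element moves to a different linear offset, so this sketch falls back to a temporary copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: transposes a rows x cols row-major matrix held
 * in a[], leaving a cols x rows row-major matrix in the same buffer.
 * Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
static int transpose_inplace(double *a, size_t rows, size_t cols)
{
    assert(a != NULL);

    if (rows == cols) {
        /* Square case: swap elements across the main diagonal. */
        for (size_t i = 0; i < rows; i++) {
            for (size_t j = i + 1; j < cols; j++) {
                double t = a[i * cols + j];
                a[i * cols + j] = a[j * cols + i];
                a[j * cols + i] = t;
            }
        }
        return 0;
    }

    /* Non-square case: pairwise swaps would scramble the data,
     * so copy out first and write back in transposed order. */
    size_t count = rows * cols;
    if (count == 0)
        return 0;
    double *tmp = malloc(count * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    memcpy(tmp, a, count * sizeof *tmp);
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            a[j * rows + i] = tmp[i * cols + j];
    free(tmp);
    return 0;
}
```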
Martin Kroeker d1c5b8f913 Add files via upload 2017-07-20 20:51:06 +02:00
Martin Kroeker 91bde7d315 Exchange rows and cols in final omatcopy with BlasTrans
This is MicMuc's patch from #899
2017-07-15 22:02:53 +02:00
Martin Kroeker 80373ea039 More fixes for silly misedits 2017-07-15 12:48:42 +02:00
Martin Kroeker d12b75a6c4 Fixup braces lost in previous edit 2017-07-15 11:53:28 +02:00
Martin Kroeker 7294fb1d9d Merge branch 'develop' into cgroups 2017-07-15 10:40:42 +02:00
Martin Kroeker 31e086d6a6 Disable ReLAPACK by default (#1238)
* Disable ReLAPACK by default; mention it in final build message if included

* Add files via upload

* Add files via upload

* Add files via upload
2017-07-13 22:01:47 +02:00
Zhang Xianyi cbb47736af Merge pull request #1214 from martin-frbg/relapack
Initial import of ReLAPACK
2017-07-13 20:31:08 +08:00
Zhang Xianyi 2a7c6930ac Merge pull request #1234 from brada4/develop
Fix write past fixed size buffer
2017-07-13 20:27:37 +08:00
Martin Kroeker 00774b1105 Add dummy implementation of cpuid_count for the CPUIDEMU case 2017-07-12 21:56:23 +02:00
Martin Kroeker 6497aae57c Use cpuid 4 with subleafs to query L1 cache size on Intel processors 2017-07-12 20:43:09 +02:00
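For context on the commit above: CPUID leaf 4 reports one cache per subleaf, with the geometry packed into EBX and ECX. The sketch below only decodes register values that the caller has already obtained. The function names are hypothetical and this is not the OpenBLAS get_cacheinfo code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoder for one CPUID leaf 4 subleaf.
 * EBX[11:0]  = line size - 1
 * EBX[21:12] = physical line partitions - 1
 * EBX[31:22] = ways of associativity - 1
 * ECX        = number of sets - 1
 * Cache size in bytes is the product of the four decoded values. */
static uint32_t cpuid4_cache_bytes(uint32_t ebx, uint32_t ecx)
{
    uint32_t line_size  = (ebx & 0xfffu) + 1;
    uint32_t partitions = ((ebx >> 12) & 0x3ffu) + 1;
    uint32_t ways       = ((ebx >> 22) & 0x3ffu) + 1;
    uint32_t sets       = ecx + 1;
    return ways * partitions * line_size * sets;
}

/* EAX[4:0] is the cache type (1 = data, 2 = instruction, 3 = unified,
 * 0 = no more caches) and EAX[7:5] is the cache level. */
static int cpuid4_is_l1_data(uint32_t eax)
{
    uint32_t type  = eax & 0x1fu;
    uint32_t level = (eax >> 5) & 0x7u;
    return type == 1 && level == 1;
}
```

On x86 with GCC or Clang, the register values could be read with the `__cpuid_count` macro from `<cpuid.h>`, iterating the subleaf index until the cache type field reads 0.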
Martin Kroeker c4ec882020 Merge pull request #1235 from xianyi/revert-1233-cpuid-fix
Revert "Fix unintentional fall-through cases in get_cacheinfo"
2017-07-12 09:37:55 +02:00
Martin Kroeker d33fc32cf3 Revert "Fix unintentional fall-through cases in get_cacheinfo" 2017-07-12 09:35:11 +02:00
Andrew 529bfc36ec Fix write past fixed size buffer 2017-07-12 00:59:30 +02:00
Martin Kroeker 88249ca5f7 Add files via upload 2017-07-11 18:48:13 +02:00
Martin Kroeker 731c518cff Add files via upload 2017-07-11 18:42:39 +02:00
Martin Kroeker 29fc429d9a Honor cgroup/cpuset constraints when enumerating cpus 2017-07-11 18:27:33 +02:00
Martin Kroeker e2d3b1561a Merge pull request #1233 from martin-frbg/cpuid-fix
Fix unintentional fall-through cases in get_cacheinfo
2017-07-11 17:10:55 +02:00
Martin Kroeker 4a012c3d20 Fix unintentional fall-through cases in get_cacheinfo
These appear to be unintended side effects of PR #1091, probably causing #1232
2017-07-11 15:39:15 +02:00
Zhang Xianyi d5ef0dee9a Merge pull request #1226 from ashwinyes/develop_arm_clang_ual_fix
arm: Fix clang compilation for ARMv7
2017-07-10 20:04:42 +08:00
Zhang Xianyi a590e6135c Merge pull request #1221 from ashwinyes/develop_arm_softfp
arm: add support for softfp in arm vfp assembly files
2017-07-10 20:03:57 +08:00
Zhang Xianyi 4239dd65ce Merge branch 'develop' into develop_arm_softfp 2017-07-10 20:02:36 +08:00
Martin Kroeker 3db2adf872 Merge pull request #1230 from martin-frbg/rhel5
Add sched_getcpu implementation for pre-2.6 glibc
2017-07-09 13:16:16 +02:00
Martin Kroeker ad2462811a Do not add -lpthread on Android builds (#1229)
* Do not add -lpthread on Android builds

* Do not add -lpthread on Android cmake builds
2017-07-09 13:15:24 +02:00
Martin Kroeker c1cf62d2c0 Add sched_getcpu implementation for pre-2.6 glibc
Fixes #1210, compilation on RHEL5 with affinity enabled
2017-07-09 09:45:38 +02:00
Zhang Xianyi bfe1656b8b Merge pull request #1225 from martin-frbg/stolen_from_wernsaar_fork
fixed syrk_thread.c taken from wernsaar
2017-07-07 15:43:33 +08:00
Ashwin Sekhar T K f02d535fde arm: Fix clang compilation for ARMv7
clang is not recognizing some pre-UAL VFP mnemonics like fnmacs, fnmacd,
fnmuls and fnmuld. Replaced them with equivalent UAL mnemonics which are
vmls.f32, vmls.f64, vnmul.f32 and vnmul.f64 respectively.
2017-07-07 12:35:58 +05:30
Martin Kroeker 49e62c0e77 fixed syrk_thread.c taken from wernsaar
Stride calculation fix copied from https://github.com/wernsaar/OpenBLAS/commit/88900e1
2017-07-06 17:30:12 +02:00
Martin Kroeker 3381f23709 Handle different object extensions in Makefile
The optimized LAPACK functions from interface use OS-dependent suffixes .o/.obj for the object files, while netlib LAPACK uses .o throughout. ReLAPACK object names have to match in order for function replacement in the growing library file to work.
2017-07-06 10:12:00 +02:00
Zhang Xianyi fa6a920caa Link -lm or -lm_hard for Android ARMv7. 2017-07-05 17:05:06 +08:00
Zhang Xianyi a6515bb858 Merge pull request #1218 from m-brow/power9
Optimise loads on Power9 LE
2017-07-03 13:48:29 +08:00
Zhang Xianyi c66b842d66 Merge pull request #1212 from neilsh-msft/develop
Add Microsoft Windows 10 UWP build support
2017-07-03 13:43:48 +08:00