Commit Graph

7452 Commits

Author SHA1 Message Date
Martin Kroeker 80373ea039 More fixes for silly misedits 2017-07-15 12:48:42 +02:00
Martin Kroeker d12b75a6c4 Fixup braces lost in previous edit 2017-07-15 11:53:28 +02:00
Martin Kroeker 7294fb1d9d Merge branch 'develop' into cgroups 2017-07-15 10:40:42 +02:00
Martin Kroeker 31e086d6a6 Disable ReLAPACK by default (#1238)
* Disable ReLAPACK by default; mention it in final build message if included

* Add files via upload

* Add files via upload

* Add files via upload
2017-07-13 22:01:47 +02:00
Zhang Xianyi cbb47736af Merge pull request #1214 from martin-frbg/relapack
Initial import of ReLAPACK
2017-07-13 20:31:08 +08:00
Zhang Xianyi 2a7c6930ac Merge pull request #1234 from brada4/develop
Fix write past fixed size buffer
2017-07-13 20:27:37 +08:00
Martin Kroeker 00774b1105 Add dummy implementation of cpuid_count for the CPUIDEMU case 2017-07-12 21:56:23 +02:00
Martin Kroeker 6497aae57c Use cpuid 4 with subleafs to query L1 cache size on Intel processors 2017-07-12 20:43:09 +02:00
Martin Kroeker c4ec882020 Merge pull request #1235 from xianyi/revert-1233-cpuid-fix
Revert "Fix unintentional fall-through cases in get_cacheinfo"
2017-07-12 09:37:55 +02:00
Martin Kroeker d33fc32cf3 Revert "Fix unintentional fall-through cases in get_cacheinfo" 2017-07-12 09:35:11 +02:00
Andrew 529bfc36ec Fix write past fixed size buffer 2017-07-12 00:59:30 +02:00
Martin Kroeker 88249ca5f7 Add files via upload 2017-07-11 18:48:13 +02:00
Martin Kroeker 731c518cff Add files via upload 2017-07-11 18:42:39 +02:00
Martin Kroeker 29fc429d9a Honor cgroup/cpuset constraints when enumerating cpus 2017-07-11 18:27:33 +02:00
Martin Kroeker e2d3b1561a Merge pull request #1233 from martin-frbg/cpuid-fix
Fix unintentional fall-through cases in get_cacheinfo
2017-07-11 17:10:55 +02:00
Martin Kroeker 4a012c3d20 Fix unintentional fall-through cases in get_cacheinfo
These appear to be unintended side effects of PR #1091, probably causing #1232
2017-07-11 15:39:15 +02:00
Zhang Xianyi d5ef0dee9a Merge pull request #1226 from ashwinyes/develop_arm_clang_ual_fix
arm: Fix clang compilation for ARMv7
2017-07-10 20:04:42 +08:00
Zhang Xianyi a590e6135c Merge pull request #1221 from ashwinyes/develop_arm_softfp
arm: add support for softfp in arm vfp assembly files
2017-07-10 20:03:57 +08:00
Zhang Xianyi 4239dd65ce Merge branch 'develop' into develop_arm_softfp 2017-07-10 20:02:36 +08:00
Martin Kroeker 3db2adf872 Merge pull request #1230 from martin-frbg/rhel5
Add sched_getcpu implementation for pre-2.6 glibc
2017-07-09 13:16:16 +02:00
Martin Kroeker ad2462811a Do not add -lpthread on Android builds (#1229)
* Do not add -lpthread on Android builds

* Do not add -lpthread on Android cmake builds
2017-07-09 13:15:24 +02:00
Martin Kroeker c1cf62d2c0 Add sched_getcpu implementation for pre-2.6 glibc
Fixes #1210, compilation on RHEL5 with affinity enabled
2017-07-09 09:45:38 +02:00
Zhang Xianyi bfe1656b8b Merge pull request #1225 from martin-frbg/stolen_from_wernsaar_fork
fixed syrk_thread.c taken from wernsaar
2017-07-07 15:43:33 +08:00
Ashwin Sekhar T K f02d535fde arm: Fix clang compilation for ARMv7
clang is not recognizing some pre-UAL VFP mnemonics like fnmacs, fnmacd,
fnmuls and fnmuld. Replaced them with equivalent UAL mnemonics which are
vmls.f32, vmls.f64, vnmul.f32 and vnmul.f64 respectively.
2017-07-07 12:35:58 +05:30
Martin Kroeker 49e62c0e77 fixed syrk_thread.c taken from wernsaar
Stride calculation fix copied from https://github.com/wernsaar/OpenBLAS/commit/88900e1
2017-07-06 17:30:12 +02:00
Martin Kroeker 3381f23709 Handle different object extensions in Makefile
The optimized LAPACK functions from interface use OS-dependent suffixes .o/.obj for the object files, while netlib LAPACK uses .o throughout. ReLAPACK object names have to match in order for function replacement in the growing library file to work.
2017-07-06 10:12:00 +02:00
Zhang Xianyi fa6a920caa Link -lm or -lm_hard for Android ARMv7. 2017-07-05 17:05:06 +08:00
Zhang Xianyi a6515bb858 Merge pull request #1218 from m-brow/power9
Optimise loads on Power9 LE
2017-07-03 13:48:29 +08:00
Zhang Xianyi c66b842d66 Merge pull request #1212 from neilsh-msft/develop
Add Microsoft Windows 10 UWP build support
2017-07-03 13:43:48 +08:00
Martin Kroeker df2dfe65d6 Update Makefile 2017-07-02 01:46:23 +02:00
Martin Kroeker 2c8d634619 Add files via upload 2017-07-02 00:50:14 +02:00
Ashwin Sekhar T K 37efb5bc1d arm: Remove unnecessary files/code
Since softfp code has been added to all required vfp kernels,
the code for auto detection of abi is no longer required.

The option to force softfp ABI on make command line by giving
ARM_SOFTFP_ABI=1 is retained. But there is no need to give this option
anymore.

Also the newly added C versions of 4x4/4x2 gemm/trmm kernels are removed.
These are longer required. Moreover these kernels has bugs.
2017-07-02 03:06:36 +05:30
Ashwin Sekhar T K 97d671eb61 arm: add softfp support in zgemm/ztrmm vfp kernels 2017-07-02 02:54:32 +05:30
Ashwin Sekhar T K 305cd2e8b4 arm: add softfp support in cgemm/ctrmm vfp kernels 2017-07-02 02:42:32 +05:30
Ashwin Sekhar T K 09bc6ebe5b arm: add softfp support in dgemm/dtrmm vfp kernels 2017-07-02 02:24:38 +05:30
Ashwin Sekhar T K 872a11a2bf arm: add softfp support in sgemm/strmm vfp kernels 2017-07-02 02:23:48 +05:30
Ashwin Sekhar T K eda9e8632a generic: Bug fixes in generic 4x2 and 4x4 gemm kernels 2017-07-02 02:00:48 +05:30
Ashwin Sekhar T K 8f83d3f961 arm: add softfp support in vfp gemv kernels 2017-07-02 01:03:31 +05:30
Martin Kroeker e5e47cfdb5 Merge pull request #1220 from ashwinyes/develop_aarch64_20170701_t99_options
arm64: Change mtune/mcpu options for THUNDERX2T99 target
2017-07-01 20:43:23 +02:00
Ashwin Sekhar T K ebf9e9dabe arm64: Change mtune/mcpu options for THUNDERX2T99 target 2017-07-01 11:17:10 -07:00
Ashwin Sekhar T K 83bd547517 arm: add softfp support in kernel/arm/swap_vfp.S 2017-07-01 20:37:40 +05:30
Ashwin Sekhar T K e25f4c01d6 arm: add softfp support in kernel/arm/nrm2_vfp*.S 2017-07-01 19:57:28 +05:30
Ashwin Sekhar T K 54915ce343 arm: add softfp support in kernel/arm/*dot_vfp.S 2017-06-30 23:46:02 +05:30
Ashwin Sekhar T K 0150fabdb6 arm: add softfp support in kernel/arm/rot_vfp.S 2017-06-30 21:52:32 +05:30
Ashwin Sekhar T K 4f0773f07d arm: add softfp support in kernel/arm/axpy_vfp.S 2017-06-30 20:25:59 +05:30
Ashwin Sekhar T K aa5edebc80 arm: add softfp support in kernel/arm/asum_vfp.S 2017-06-30 18:21:05 +05:30
Ashwin Sekhar T K 89924b3d5b arm: Use assembly implementations based on the ARM abi
In case of softfp abi, assembly implementations of only those APIs are
used which doesnt have a floating point argument or return value.

In case of hard abi, all assembly implementations are used.
2017-06-30 18:21:05 +05:30
Ashwin Sekhar T K da7f0ff425 generic: add some generic gemm and trmm kernels
Added generic 4x4 and 4x2 gemm kernels
Added generic 4x2 trmm kernel
2017-06-30 18:21:05 +05:30
Ashwin Sekhar T K 0d5c8e5386 arm: Determine the abi from compiler if not specified on command line
If ARM abi is not explicitly mentioned on the command line, then set the
arm abi to softfp or hard according to the compiler environment.
This assumes that compiler sets the defines __ARM_PCS and __ARM_PCS_VFP
accordingly.
2017-06-30 18:20:59 +05:30
Martin Kroeker 912410f214 Add ReLAPACK to Makefiles 2017-06-28 18:15:21 +02:00