Martin Kroeker
80373ea039
More fixes for silly misedits
2017-07-15 12:48:42 +02:00
Martin Kroeker
d12b75a6c4
Fixup braces lost in previous edit
2017-07-15 11:53:28 +02:00
Martin Kroeker
7294fb1d9d
Merge branch 'develop' into cgroups
2017-07-15 10:40:42 +02:00
Martin Kroeker
31e086d6a6
Disable ReLAPACK by default (#1238)
...
* Disable ReLAPACK by default; mention it in final build message if included
* Add files via upload
* Add files via upload
* Add files via upload
2017-07-13 22:01:47 +02:00
Zhang Xianyi
cbb47736af
Merge pull request #1214 from martin-frbg/relapack
...
Initial import of ReLAPACK
2017-07-13 20:31:08 +08:00
Zhang Xianyi
2a7c6930ac
Merge pull request #1234 from brada4/develop
...
Fix write past fixed size buffer
2017-07-13 20:27:37 +08:00
Martin Kroeker
00774b1105
Add dummy implementation of cpuid_count for the CPUIDEMU case
2017-07-12 21:56:23 +02:00
Martin Kroeker
6497aae57c
Use cpuid 4 with subleafs to query L1 cache size on Intel processors
2017-07-12 20:43:09 +02:00
Martin Kroeker
c4ec882020
Merge pull request #1235 from xianyi/revert-1233-cpuid-fix
...
Revert "Fix unintentional fall-through cases in get_cacheinfo"
2017-07-12 09:37:55 +02:00
Martin Kroeker
d33fc32cf3
Revert "Fix unintentional fall-through cases in get_cacheinfo"
2017-07-12 09:35:11 +02:00
Andrew
529bfc36ec
Fix write past fixed size buffer
2017-07-12 00:59:30 +02:00
Martin Kroeker
88249ca5f7
Add files via upload
2017-07-11 18:48:13 +02:00
Martin Kroeker
731c518cff
Add files via upload
2017-07-11 18:42:39 +02:00
Martin Kroeker
29fc429d9a
Honor cgroup/cpuset constraints when enumerating cpus
2017-07-11 18:27:33 +02:00
Martin Kroeker
e2d3b1561a
Merge pull request #1233 from martin-frbg/cpuid-fix
...
Fix unintentional fall-through cases in get_cacheinfo
2017-07-11 17:10:55 +02:00
Martin Kroeker
4a012c3d20
Fix unintentional fall-through cases in get_cacheinfo
...
These appear to be unintended side effects of PR #1091, probably causing #1232
2017-07-11 15:39:15 +02:00
Zhang Xianyi
d5ef0dee9a
Merge pull request #1226 from ashwinyes/develop_arm_clang_ual_fix
...
arm: Fix clang compilation for ARMv7
2017-07-10 20:04:42 +08:00
Zhang Xianyi
a590e6135c
Merge pull request #1221 from ashwinyes/develop_arm_softfp
...
arm: add support for softfp in arm vfp assembly files
2017-07-10 20:03:57 +08:00
Zhang Xianyi
4239dd65ce
Merge branch 'develop' into develop_arm_softfp
2017-07-10 20:02:36 +08:00
Martin Kroeker
3db2adf872
Merge pull request #1230 from martin-frbg/rhel5
...
Add sched_getcpu implementation for pre-2.6 glibc
2017-07-09 13:16:16 +02:00
Martin Kroeker
ad2462811a
Do not add -lpthread on Android builds (#1229)
...
* Do not add -lpthread on Android builds
* Do not add -lpthread on Android cmake builds
2017-07-09 13:15:24 +02:00
Martin Kroeker
c1cf62d2c0
Add sched_getcpu implementation for pre-2.6 glibc
...
Fixes #1210, compilation on RHEL5 with affinity enabled
2017-07-09 09:45:38 +02:00
Zhang Xianyi
bfe1656b8b
Merge pull request #1225 from martin-frbg/stolen_from_wernsaar_fork
...
fixed syrk_thread.c taken from wernsaar
2017-07-07 15:43:33 +08:00
Ashwin Sekhar T K
f02d535fde
arm: Fix clang compilation for ARMv7
...
clang does not recognize some pre-UAL VFP mnemonics such as fnmacs, fnmacd,
fnmuls and fnmuld. Replaced them with the equivalent UAL mnemonics, which are
vmls.f32, vmls.f64, vnmul.f32 and vnmul.f64 respectively.
2017-07-07 12:35:58 +05:30
Martin Kroeker
49e62c0e77
fixed syrk_thread.c taken from wernsaar
...
Stride calculation fix copied from https://github.com/wernsaar/OpenBLAS/commit/88900e1
2017-07-06 17:30:12 +02:00
Martin Kroeker
3381f23709
Handle different object extensions in Makefile
...
The optimized LAPACK functions from interface use OS-dependent suffixes .o/.obj for the object files, while netlib LAPACK uses .o throughout. ReLAPACK object names have to match in order for function replacement in the growing library file to work.
2017-07-06 10:12:00 +02:00
Zhang Xianyi
fa6a920caa
Link -lm or -lm_hard for Android ARMv7.
2017-07-05 17:05:06 +08:00
Zhang Xianyi
a6515bb858
Merge pull request #1218 from m-brow/power9
...
Optimise loads on Power9 LE
2017-07-03 13:48:29 +08:00
Zhang Xianyi
c66b842d66
Merge pull request #1212 from neilsh-msft/develop
...
Add Microsoft Windows 10 UWP build support
2017-07-03 13:43:48 +08:00
Martin Kroeker
df2dfe65d6
Update Makefile
2017-07-02 01:46:23 +02:00
Martin Kroeker
2c8d634619
Add files via upload
2017-07-02 00:50:14 +02:00
Ashwin Sekhar T K
37efb5bc1d
arm: Remove unnecessary files/code
...
Since softfp code has been added to all required vfp kernels,
the code for auto-detection of the ABI is no longer required.
The option to force the softfp ABI on the make command line by giving
ARM_SOFTFP_ABI=1 is retained, but there is no need to give this option
anymore.
Also, the newly added C versions of the 4x4/4x2 gemm/trmm kernels are removed.
These are no longer required. Moreover, these kernels have bugs.
2017-07-02 03:06:36 +05:30
Ashwin Sekhar T K
97d671eb61
arm: add softfp support in zgemm/ztrmm vfp kernels
2017-07-02 02:54:32 +05:30
Ashwin Sekhar T K
305cd2e8b4
arm: add softfp support in cgemm/ctrmm vfp kernels
2017-07-02 02:42:32 +05:30
Ashwin Sekhar T K
09bc6ebe5b
arm: add softfp support in dgemm/dtrmm vfp kernels
2017-07-02 02:24:38 +05:30
Ashwin Sekhar T K
872a11a2bf
arm: add softfp support in sgemm/strmm vfp kernels
2017-07-02 02:23:48 +05:30
Ashwin Sekhar T K
eda9e8632a
generic: Bug fixes in generic 4x2 and 4x4 gemm kernels
2017-07-02 02:00:48 +05:30
Ashwin Sekhar T K
8f83d3f961
arm: add softfp support in vfp gemv kernels
2017-07-02 01:03:31 +05:30
Martin Kroeker
e5e47cfdb5
Merge pull request #1220 from ashwinyes/develop_aarch64_20170701_t99_options
...
arm64: Change mtune/mcpu options for THUNDERX2T99 target
2017-07-01 20:43:23 +02:00
Ashwin Sekhar T K
ebf9e9dabe
arm64: Change mtune/mcpu options for THUNDERX2T99 target
2017-07-01 11:17:10 -07:00
Ashwin Sekhar T K
83bd547517
arm: add softfp support in kernel/arm/swap_vfp.S
2017-07-01 20:37:40 +05:30
Ashwin Sekhar T K
e25f4c01d6
arm: add softfp support in kernel/arm/nrm2_vfp*.S
2017-07-01 19:57:28 +05:30
Ashwin Sekhar T K
54915ce343
arm: add softfp support in kernel/arm/*dot_vfp.S
2017-06-30 23:46:02 +05:30
Ashwin Sekhar T K
0150fabdb6
arm: add softfp support in kernel/arm/rot_vfp.S
2017-06-30 21:52:32 +05:30
Ashwin Sekhar T K
4f0773f07d
arm: add softfp support in kernel/arm/axpy_vfp.S
2017-06-30 20:25:59 +05:30
Ashwin Sekhar T K
aa5edebc80
arm: add softfp support in kernel/arm/asum_vfp.S
2017-06-30 18:21:05 +05:30
Ashwin Sekhar T K
89924b3d5b
arm: Use assembly implementations based on the ARM abi
...
In case of the softfp ABI, assembly implementations are used only for those
APIs which do not have a floating point argument or return value.
In case of the hard ABI, all assembly implementations are used.
2017-06-30 18:21:05 +05:30
Ashwin Sekhar T K
da7f0ff425
generic: add some generic gemm and trmm kernels
...
Added generic 4x4 and 4x2 gemm kernels
Added generic 4x2 trmm kernel
2017-06-30 18:21:05 +05:30
Ashwin Sekhar T K
0d5c8e5386
arm: Determine the abi from compiler if not specified on command line
...
If the ARM ABI is not explicitly specified on the command line, then set the
ARM ABI to softfp or hard according to the compiler environment.
This assumes that the compiler sets the defines __ARM_PCS and __ARM_PCS_VFP
accordingly.
2017-06-30 18:20:59 +05:30
Martin Kroeker
912410f214
Add ReLAPACK to Makefiles
2017-06-28 18:15:21 +02:00