Werner Saar
eea2e30b74
bugfix for arm locking
2015-05-23 11:40:40 +02:00
Werner Saar
19b8fd2aed
smp lock bugfix
2015-05-23 10:58:38 +02:00
wernsaar
0cc5212741
Merge pull request #580 from wernsaar/develop
...
added blas level1 swap benchmark
2015-05-23 09:46:39 +02:00
Werner Saar
c47c8e8cf5
added blas level1 swap benchmark
2015-05-21 08:51:42 +02:00
Zhang Xianyi
a11555c715
Support Android NDK armeabi-v7a-hard ABI. (-mfloat-abi=hard)
...
e.g.
make HOSTCC=gcc CC=arm-linux-androideabi-gcc NO_LAPACK=1 TARGET=ARMV7
With the Android NDK, this build uses the armeabi-v7a-hard ABI.
TARGET_CFLAGS += -mhard-float -D_NDK_MATH_NO_SOFTFP=1
TARGET_LDFLAGS += -Wl,--no-warn-mismatch -lm_hard
For more information, please check hard-float example at
android_ndk/tests/device/hard-float/jni/.
2015-05-20 21:57:27 -05:00
wernsaar
897d03518e
Merge pull request #578 from wernsaar/develop
...
added blas level1 copy benchmark
2015-05-20 11:56:02 +02:00
Werner Saar
23fbc5728e
added blas level1 copy benchmark
2015-05-20 11:05:00 +02:00
Zhang Xianyi
6d40fa587f
Fix f_check bug.
2015-05-19 12:04:45 -05:00
wernsaar
22dcd79959
Merge pull request #577 from wernsaar/develop
...
Bugfix for armv6 memory barrier
2015-05-19 10:59:24 +02:00
Werner Saar
ea4df0aad3
Ref #574 : Bugfix for armv6 memory barrier
2015-05-19 10:43:12 +02:00
Zhang Xianyi
e127fb8fd8
1) Refs #575 . Remove g77 from compiler list.
...
2) If OpenBLAS cannot find a Fortran compiler, it will build only BLAS
(without LAPACK).
2015-05-19 00:01:04 -05:00
wernsaar
7fb718a7d8
Merge pull request #572 from wernsaar/develop
...
added optimized cscal and zscal functions for steamroller
2015-05-18 13:47:38 +02:00
Werner Saar
24f58c8bb1
added optimized cscal and zscal kernels for steamroller
2015-05-18 12:40:07 +02:00
Werner Saar
95b1faf667
added optimized cscal and zscal kernels for steamroller and piledriver
2015-05-18 10:50:57 +02:00
Werner Saar
2d9e406050
added optimized cscal kernel for sandybridge
2015-05-18 08:46:06 +02:00
Werner Saar
59083e3ce1
added optimized cscal kernel for bulldozer
2015-05-18 07:33:52 +02:00
wernsaar
685be40339
Merge pull request #571 from wernsaar/develop
...
added optimized cscal and zscal functions
2015-05-17 14:09:14 +02:00
Werner Saar
31c9e399e9
added optimized cscal kernel for haswell
2015-05-17 13:44:09 +02:00
Werner Saar
7de6bb9889
added optimized zscal kernel for bulldozer
2015-05-17 11:45:19 +02:00
Werner Saar
d63034303b
added optimized zscal kernel for haswell
2015-05-16 16:41:45 +02:00
Zhang Xianyi
51ff17d46e
Add AMD Excavator target.
2015-05-13 16:16:30 -05:00
wernsaar
905534942a
Merge pull request #568 from wernsaar/develop
...
added optimized dscal kernel
2015-05-13 13:48:08 +02:00
Werner Saar
18e90ee2e3
bugfix: added static to functions
2015-05-13 13:31:26 +02:00
Werner Saar
e00cccc41e
added optimized dscal kernel for piledriver
2015-05-13 13:05:35 +02:00
Werner Saar
73f09bf64f
optimized dscal kernel for increment != 1
2015-05-13 12:14:39 +02:00
Werner Saar
02e772c7e4
added optimized dscal kernel for haswell
2015-05-12 17:19:58 +02:00
Werner Saar
7aee913991
added optimized dscal kernel for sandybridge
2015-05-12 16:27:43 +02:00
Werner Saar
e50a933037
added optimized dscal kernel for bulldozer
2015-05-12 12:28:44 +02:00
Zhang Xianyi
5f9011d6ef
Merge pull request #566 from powderluv/develop
...
Fix build with ALLOC_SHM=0 (Android NDK)
2015-05-11 20:59:12 -05:00
powderluv
ebb9eba987
Fix build with ALLOC_SHM=0 (Android NDK)
...
Refactor so that you can build with ALLOC_SHM=0. HugeTLB
implicitly depends on ALLOC_SHM=1. This patch allows
building for Android NDK r10d.
2015-05-10 00:10:26 -07:00
Zhang Xianyi
8e5a1083bb
Refs #532 . Improve gemv parallelism for the small m and large n case.
...
Split the matrix and perform a reduction.
2015-05-08 05:33:17 +08:00
Zhang Xianyi
6743beb748
Refs #565 . Fix the bug in generating FEXTRALIB.
2015-05-07 13:06:53 +08:00
Zhang Xianyi
bcabf72c08
Refs #565 . Merge branch 'andreasnoack-anj/bench' into develop
2015-05-07 12:52:14 +08:00
Andreas Noack
cda29f183b
Add vecLib benchmarks
2015-05-06 21:52:34 -04:00
wernsaar
e52d36450a
Merge pull request #564 from wernsaar/develop
...
Use only 1 thread in trsm if m or n < 2*GEMM_MULTITHREAD_THRESHOLD
2015-05-06 11:10:31 +02:00
Werner Saar
f8f2e261fe
use only 1 thread if m or n < 2*GEMM_MULTITHREAD_THRESHOLD
2015-05-06 10:41:53 +02:00
Werner Saar
be3c843700
added loops to trsm.c
2015-05-06 09:21:19 +02:00
wernsaar
e6f57db846
Merge pull request #563 from wernsaar/develop
...
Bugfix for gemm3m tests
2015-05-05 12:13:35 +02:00
Werner Saar
9bfd267d51
bugfix for gemm3m tests
2015-05-05 11:58:59 +02:00
Werner Saar
924bc5372e
removed gemm3m functions from normal checks
2015-05-05 11:39:43 +02:00
wernsaar
2b83a69650
Merge pull request #561 from wernsaar/develop
...
updated dgemv_n sgemv_n kernels
2015-05-04 11:11:13 +02:00
Werner Saar
133c11a156
updated dgemv_n kernel for nehalem
2015-04-30 14:38:06 +02:00
Werner Saar
30f52d53df
optimized dgemv_n kernel for haswell
2015-04-30 12:11:39 +02:00
Zhang Xianyi
a124637329
Merge pull request #560 from sebastien-villemot/develop
...
Fix detection of ARM architectures in c_check.
2015-04-29 11:36:47 -05:00
Sébastien Villemot
642aaba2e0
Fix detection of ARM architectures in c_check.
...
This is necessary to avoid the false detection of a cross-compiling environment.
2015-04-29 18:14:21 +02:00
wernsaar
4c616173e4
Merge pull request #558 from wernsaar/develop
...
optimizations for sandybridge
2015-04-28 17:30:16 +02:00
Werner Saar
5e83d80725
optimized dger kernel for sandybridge
2015-04-28 16:58:11 +02:00
Werner Saar
b2e1797dc6
added optimized sger kernel for sandybridge
2015-04-28 15:33:38 +02:00
Werner Saar
e216f686cb
optimized saxpy and daxpy for sandybridge
2015-04-28 10:18:32 +02:00
Zhang Xianyi
e42652f772
Merge pull request #554 from wernsaar/develop
...
added benchmarks for zgeru and cgeru
2015-04-25 08:11:36 -05:00