Commit Graph

7452 Commits

Author SHA1 Message Date
wernsaar 69c5169e7d Merge pull request #589 from wernsaar/develop
small modification of gemm.c
2015-06-03 12:14:09 +02:00
Werner Saar e19948baa1 small modification of gemm.c 2015-06-03 09:11:51 +02:00
wernsaar a2eaf234fc Merge pull request #587 from wernsaar/develop
added gesv benchmark
2015-06-02 15:29:49 +02:00
Werner Saar 6a13a94e71 added gesv benchmark 2015-06-02 13:35:49 +02:00
wernsaar eff43d3289 Merge pull request #585 from wernsaar/develop
bugfix for benchmark Makefile on MAC
2015-05-31 15:01:54 +02:00
Werner Saar 9c4817d07b bugfix for Makefile on mac 2015-05-31 14:16:51 +02:00
wernsaar 319f3a0451 Merge pull request #584 from wernsaar/develop
bugfixes, to build benchmarks with mingw on Windows OS
2015-05-29 13:27:20 +02:00
Werner Saar 02c7766f68 bugfixes, to build benchmarks with mingw on Windows OS 2015-05-29 12:56:22 +02:00
wernsaar f38cb67ca8 Merge pull request #581 from wernsaar/develop
bugfix for arm locking
2015-05-23 12:58:15 +02:00
Werner Saar eea2e30b74 bugfix for arm locking 2015-05-23 11:40:40 +02:00
Werner Saar 19b8fd2aed smp lock bugfix 2015-05-23 10:58:38 +02:00
wernsaar 0cc5212741 Merge pull request #580 from wernsaar/develop
added blas level1 swap benchmark
2015-05-23 09:46:39 +02:00
Werner Saar c47c8e8cf5 added blas level1 swap benchmark 2015-05-21 08:51:42 +02:00
Zhang Xianyi a11555c715 Support Android NDK armeabi-v7a-hard ABI. (-mfloat-abi=hard)
e.g.
make HOSTCC=gcc CC=arm-linux-androideabi-gcc NO_LAPACK=1 TARGET=ARMV7

In Android NDK, it uses armeabi-v7a-hard ABI.
TARGET_CFLAGS += -mhard-float -D_NDK_MATH_NO_SOFTFP=1
TARGET_LDFLAGS += -Wl,--no-warn-mismatch -lm_hard
For more information, please check hard-float example at
android_ndk/tests/device/hard-float/jni/.
2015-05-20 21:57:27 -05:00
wernsaar 897d03518e Merge pull request #578 from wernsaar/develop
added blas level1 copy benchmark
2015-05-20 11:56:02 +02:00
Werner Saar 23fbc5728e added blas level1 copy benchmark 2015-05-20 11:05:00 +02:00
Zhang Xianyi 6d40fa587f Fix f_check bug. 2015-05-19 12:04:45 -05:00
wernsaar 22dcd79959 Merge pull request #577 from wernsaar/develop
Bugfix for armv6 memory barrier
2015-05-19 10:59:24 +02:00
Werner Saar ea4df0aad3 Ref #574: Bugfix for armv6 memory barrier 2015-05-19 10:43:12 +02:00
Zhang Xianyi e127fb8fd8 1) Refs #575. Remove g77 from compiler list.
2) If OpenBLAS cannot find a Fortran compiler, it will only build BLAS
(without LAPACK).
2015-05-19 00:01:04 -05:00
wernsaar 7fb718a7d8 Merge pull request #572 from wernsaar/develop
added optimized cscal and zscal functions for steamroller
2015-05-18 13:47:38 +02:00
Werner Saar 24f58c8bb1 added optimized cscal and zscal kernels for steamroller 2015-05-18 12:40:07 +02:00
Werner Saar 95b1faf667 added optimized cscal and zscal kernels for steamroller and piledriver 2015-05-18 10:50:57 +02:00
Werner Saar 2d9e406050 added optimized cscal kernel for sandybridge 2015-05-18 08:46:06 +02:00
Werner Saar 59083e3ce1 added optimized cscal kernel for bulldozer 2015-05-18 07:33:52 +02:00
wernsaar 685be40339 Merge pull request #571 from wernsaar/develop
added optimized cscal and zscal functions
2015-05-17 14:09:14 +02:00
Werner Saar 31c9e399e9 added optimized cscal kernel for haswell 2015-05-17 13:44:09 +02:00
Werner Saar 7de6bb9889 added optimized zscal kernel for bulldozer 2015-05-17 11:45:19 +02:00
Werner Saar d63034303b added optimized zscal kernel for haswell 2015-05-16 16:41:45 +02:00
Zhang Xianyi 51ff17d46e Add AMD Excavator target. 2015-05-13 16:16:30 -05:00
wernsaar 905534942a Merge pull request #568 from wernsaar/develop
added optimized dscal kernel
2015-05-13 13:48:08 +02:00
Werner Saar 18e90ee2e3 bugfix: added static to functions 2015-05-13 13:31:26 +02:00
Werner Saar e00cccc41e added optimized dscal kernel for piledriver 2015-05-13 13:05:35 +02:00
Werner Saar 73f09bf64f optimized dscal kernel for increment != 1 2015-05-13 12:14:39 +02:00
Werner Saar 02e772c7e4 added optimized dscal kernel for haswell 2015-05-12 17:19:58 +02:00
Werner Saar 7aee913991 added optimized dscal kernel for sandybridge 2015-05-12 16:27:43 +02:00
Werner Saar e50a933037 added optimized dscal kernel for bulldozer 2015-05-12 12:28:44 +02:00
Zhang Xianyi 5f9011d6ef Merge pull request #566 from powderluv/develop
Fix build with ALLOC_SHM=0 (Android NDK)
2015-05-11 20:59:12 -05:00
powderluv ebb9eba987 Fix build with ALLOC_SHM=0 (Android NDK)
Refactor such that you can build with ALLOC_SHM=0. HugeTLB
implicitly depends on ALLOC_SHM=1. This patch allows
building for Android NDK r10d.
2015-05-10 00:10:26 -07:00
Zhang Xianyi 8e5a1083bb Refs #532. Improve gemv parallelism for the small m, large n case.
Split the matrix and perform a reduction.
2015-05-08 05:33:17 +08:00
Zhang Xianyi 6743beb748 Refs #565. Fix the bug in generating FEXTRALIB. 2015-05-07 13:06:53 +08:00
Zhang Xianyi bcabf72c08 Refs #565. Merge branch 'andreasnoack-anj/bench' into develop 2015-05-07 12:52:14 +08:00
Andreas Noack cda29f183b Add vecLib benchmarks 2015-05-06 21:52:34 -04:00
wernsaar e52d36450a Merge pull request #564 from wernsaar/develop
Use only 1 thread in trsm if m or n < 2*GEMM_MULTITHREAD_THRESHOLD
2015-05-06 11:10:31 +02:00
Werner Saar f8f2e261fe use only 1 thread if m or n < 2*GEMM_MULTITHREAD_THRESHOLD 2015-05-06 10:41:53 +02:00
Werner Saar be3c843700 added loops to trsm.c 2015-05-06 09:21:19 +02:00
wernsaar e6f57db846 Merge pull request #563 from wernsaar/develop
Bugfix for gemm3m tests
2015-05-05 12:13:35 +02:00
Werner Saar 9bfd267d51 bugfix for gemm3m tests 2015-05-05 11:58:59 +02:00
Werner Saar 924bc5372e removed gemm3m functions from normal checks 2015-05-05 11:39:43 +02:00
wernsaar 2b83a69650 Merge pull request #561 from wernsaar/develop
updated dgemv_n sgemv_n kernels
2015-05-04 11:11:13 +02:00