7452 Commits

Author SHA1 Message Date
Theoractice aa744dfa59 Update memory.c 2016-03-22 20:02:37 +08:00
theoractice 61cf8f74d9 Fix access violation on Windows while static linking 2016-03-22 19:14:54 +08:00
Theoractice de202fa375 Merge pull request #1 from xianyi/develop
upd
2016-03-22 05:33:20 -05:00
wernsaar 6f93b53590 Merge pull request #812 from wernsaar/develop
added optimized sdot kernel for POWER8
2016-03-21 13:59:44 +01:00
Werner Saar 11c44dede1 added optimized sdot kernel for POWER8 2016-03-21 13:18:23 +01:00
wernsaar f00d642592 Merge pull request #811 from wernsaar/develop
added optimized zdot kernel for POWER8
2016-03-21 10:48:41 +01:00
Werner Saar 9e4584d069 added optimized zdot kernel for POWER8 2016-03-21 10:12:07 +01:00
Zhang Xianyi 2a5679da5f Merge branch 'release-0.2.17' into develop 2016-03-20 20:52:43 -04:00
Zhang Xianyi a71e8c82f6 Fix change log typo. 2016-03-20 20:52:15 -04:00
Zhang Xianyi 9b987badb0 Merge branch 'master' into develop
Bump to 0.2.18.dev

Conflicts:
	CMakeLists.txt
	Makefile.rule
2016-03-20 20:48:21 -04:00
Zhang Xianyi 1619b2f3c8 Merge branch 'release-0.2.17' 2016-03-20 20:44:01 -04:00
Zhang Xianyi 4f3153395a Update doc for 0.2.17. 2016-03-20 20:43:42 -04:00
Zhang Xianyi d7a1a7ff2a Merge branch 'release-0.2.17' into develop 2016-03-20 09:24:28 -04:00
Zhang Xianyi 308e6195b7 Refs #807. Enable BUILD_LAPACK_DEPRECATED=1 by default. 2016-03-20 09:22:56 -04:00
Zhang Xianyi 7a3d7b1f52 Merge pull request #808 from theoractice/develop
Fix a minor compiler error in VisualStudio with CMake
2016-03-20 09:07:47 -04:00
wernsaar 74cc2d6623 Merge pull request #809 from wernsaar/develop
Ref #795: added optimized ddot kernel for POWER8
2016-03-20 13:16:41 +01:00
theoractice fc3a558515 Fix a minor compiler error in VisualStudio with CMake 2016-03-20 18:58:18 +08:00
Werner Saar cd9fafc054 ddot for POWER8: updated licence information 2016-03-20 11:19:27 +01:00
Werner Saar 84b92e6373 added optimized ddot kernel for POWER8 2016-03-20 11:06:06 +01:00
wernsaar c279a53ed8 Merge pull request #806 from wernsaar/develop
adding optimized single precision blas level3 kernels for POWER8
2016-03-18 12:46:16 +01:00
Werner Saar e1df5a6e23 fixed sgemm- and strmm-kernel 2016-03-18 12:12:03 +01:00
Werner Saar 5c658f8746 add optimized cgemm- and ctrmm-kernel for POWER8 2016-03-18 08:17:25 +01:00
Zhang Xianyi ec4390a967 Bump develop version to 0.2.17.dev. 2016-03-15 14:52:01 -04:00
Zhang Xianyi fced5744fb Merge branch 'release-0.2.16' 2016-03-15 14:49:10 -04:00
Zhang Xianyi 8c0fb1258d Update 0.2.16 doc 2016-03-15 14:48:41 -04:00
Zhang Xianyi aae581d004 Merge branch 'develop' into release-0.2.16 2016-03-15 13:56:01 -04:00
Zhang Xianyi e17303933a Merge pull request #802 from ashwinyes/develop_20160314_dgemm_optimization
DGEMM Optimizations for Cortex-A57
2016-03-14 20:31:03 -04:00
Zhang Xianyi f9226275f4 Merge pull request #801 from Keno/patch-3
Don't pass REALNAME to `.end`
2016-03-14 15:42:31 -04:00
Ashwin Sekhar T K cf8c7e28b3 Update CONTRIBUTORS.md 2016-03-14 20:01:02 +05:30
Ashwin Sekhar T K 5ac02f6dc7 Optimize Dgemm 4x4 for Cortex A57 2016-03-14 19:35:23 +05:30
Ashwin Sekhar T K 7aa1ad4923 Functional Assembly Kernels for CortexA57
Adding functional (non-optimized) kernels for Cortex-A57
with the following layouts.
SGEMM - 16x4, 8x8
CGEMM - 8x4
DGEMM - 8x4, 4x8
2016-03-14 19:33:21 +05:30
Werner Saar dcd15b546c BUGFIX: KERNEL.POWER8 2016-03-14 14:36:59 +01:00
Werner Saar 96284ab295 added sgemm- and strmm-kernel for POWER8 2016-03-14 13:52:44 +01:00
Keno Fischer d5e1255ca7 Don't pass REALNAME to `.end`
Putting the procedure name there is an MSVC-ism, where it is optional. GCC silently ignores it and Clang raises an error, so it is best to remove it.
2016-03-13 18:56:21 -04:00
Zhang Xianyi 587455868e Merge pull request #800 from jeromerobert/smallscaling
Fix smallscaling compilation
2016-03-10 15:45:33 -05:00
Jerome Robert 323c237e7b Fix smallscaling compilation
Also revert 0bbca5e
2016-03-10 20:24:41 +01:00
Werner Saar faa5e2e5e3 FIX: forgot to add the files cgemv_n_4.c and cgemv_t_4.c 2016-03-10 11:10:38 +01:00
wernsaar 551fdf53e8 Merge pull request #799 from wernsaar/develop
Added optimized cgemv_n and cgemv_t kernels for bulldozer, piledriver…
2016-03-10 10:22:08 +01:00
Werner Saar fdf291be30 Added optimized cgemv_n and cgemv_t kernels for bulldozer, piledriver and steamroller 2016-03-10 09:42:07 +01:00
Zhang Xianyi 68eb4fa329 Add missing openblas_env makefile. 2016-03-09 14:52:47 -05:00
Zhang Xianyi 05196a8497 Refs #716. Only call getenv at init function. 2016-03-09 12:50:07 -05:00
wernsaar db9b611b12 Merge pull request #798 from wernsaar/develop
Optimized zgemv_n kernel for bulldozer, piledriver and steamroller
2016-03-09 15:55:56 +01:00
Werner Saar 2e6333f74e modified common.h for piledriver 2016-03-09 15:48:29 +01:00
Werner Saar c99cc41cbd Added optimized zgemv_n kernel for bulldozer, piledriver and steamroller 2016-03-09 14:02:03 +01:00
wernsaar 711ecb8bd5 Merge pull request #797 from wernsaar/develop
bugfixes for lapack and lapacke
2016-03-07 16:44:17 +01:00
Werner Saar 10c2ebdfc5 BUGFIX: removed fixes for bugs #148 and #149, because info for xerbla is wrong 2016-03-07 10:34:04 +01:00
Werner Saar 26b3b3a3e6 bugfixes from lapack svn for bugs #142 - #155 2016-03-07 10:10:00 +01:00
Werner Saar acdff55a6a Bugfix for ztrmv 2016-03-07 09:39:34 +01:00
Zhang Xianyi 7d6b68eb4a Refs #786. Revert to default assembly kernel. 2016-03-07 11:34:58 +08:00
Werner Saar 0bbca5e803 removed build of smallscaling, because build on arm, arm64 and power fails 2016-03-06 11:54:41 +01:00