Commit Graph

1874 Commits

Author SHA1 Message Date
Zhang Xianyi fced5744fb Merge branch 'release-0.2.16' 2016-03-15 14:49:10 -04:00
Zhang Xianyi 8c0fb1258d Update 0.2.16 doc 2016-03-15 14:48:41 -04:00
Zhang Xianyi aae581d004 Merge branch 'develop' into release-0.2.16 2016-03-15 13:56:01 -04:00
Zhang Xianyi e17303933a Merge pull request #802 from ashwinyes/develop_20160314_dgemm_optimization
DGEMM Optimizations for Cortex-A57
2016-03-14 20:31:03 -04:00
Zhang Xianyi f9226275f4 Merge pull request #801 from Keno/patch-3
Don't pass REALNAME to `.end`
2016-03-14 15:42:31 -04:00
Ashwin Sekhar T K cf8c7e28b3 Update CONTRIBUTORS.md 2016-03-14 20:01:02 +05:30
Ashwin Sekhar T K 5ac02f6dc7 Optimize Dgemm 4x4 for Cortex A57 2016-03-14 19:35:23 +05:30
Ashwin Sekhar T K 7aa1ad4923 Functional Assembly Kernels for CortexA57
Adding functional (non-optimized) kernels for Cortex-A57
with the following layouts.
SGEMM - 16x4, 8x8
CGEMM - 8x4
DGEMM - 8x4, 4x8
2016-03-14 19:33:21 +05:30
Keno Fischer d5e1255ca7 Don't pass REALNAME to `.end`
Putting the procedure there is an MSVC-ism, where it is optional. GCC silently ignores and Clang errors, so it is best to remove this.
2016-03-13 18:56:21 -04:00
Zhang Xianyi 587455868e Merge pull request #800 from jeromerobert/smallscaling
Fix smallscaling compilation
2016-03-10 15:45:33 -05:00
Jerome Robert 323c237e7b Fix smallscaling compilation
Also revert 0bbca5e
2016-03-10 20:24:41 +01:00
Werner Saar faa5e2e5e3 FIX: forgot the add the files cgemv_n_4.c and cgemv_t_4.c 2016-03-10 11:10:38 +01:00
wernsaar 551fdf53e8 Merge pull request #799 from wernsaar/develop
Added optimized cgemv_n and cgemv_t kernels for bulldozer, piledriver…
2016-03-10 10:22:08 +01:00
Werner Saar fdf291be30 Added optimized cgemv_n and cgemv_t kernels for bulldozer, piledriver and steamroller 2016-03-10 09:42:07 +01:00
Zhang Xianyi 68eb4fa329 Add missing openblas_env makefile. 2016-03-09 14:52:47 -05:00
Zhang Xianyi 05196a8497 Refs #716. Only call getenv at init function. 2016-03-09 12:50:07 -05:00
wernsaar db9b611b12 Merge pull request #798 from wernsaar/develop
Optimized zgemv_n kernel for bulldozer, piledriver and steamroller
2016-03-09 15:55:56 +01:00
Werner Saar 2e6333f74e modified common.h for piledriver 2016-03-09 15:48:29 +01:00
Werner Saar c99cc41cbd Added optimized zgemv_n kernel for bulldozer, piledriver and steamroller 2016-03-09 14:02:03 +01:00
wernsaar 711ecb8bd5 Merge pull request #797 from wernsaar/develop
bugfixes for lapack and lapacke
2016-03-07 16:44:17 +01:00
Werner Saar 10c2ebdfc5 BUGFIX: removed fixes for bugs #148 and #149, because info for xerbla is wrong 2016-03-07 10:34:04 +01:00
Werner Saar 26b3b3a3e6 bugfixes form lapack svn for bugs #142 - #155 2016-03-07 10:10:00 +01:00
Werner Saar acdff55a6a Bugfix for ztrmv 2016-03-07 09:39:34 +01:00
Zhang Xianyi 7d6b68eb4a Refs #786. Revert to default assembly kernel. 2016-03-07 11:34:58 +08:00
Werner Saar 0bbca5e803 removed build of smallscaling, because build on arm, arm64 and power fails 2016-03-06 11:54:41 +01:00
Werner Saar cd5241d0cf modified KERNEL for power, to use the generic DSDOT-KERNEL 2016-03-06 09:07:24 +01:00
Werner Saar 8d652f11e7 updated smallscaling.c to build without C99 or C11
increased the threshold value of nep.in to 40
2016-03-06 08:40:51 +01:00
Zhang Xianyi 6c86570e1f Merge pull request #790 from jeromerobert/bug786
ztrmv_L.c: no longer need a 4kB buffer
2016-03-05 15:25:27 -05:00
Jerome Robert 53ba1a77c8 ztrmv_L.c: no longer need a 4kB buffer
Fix #786
2016-03-05 19:07:03 +01:00
Zhang Xianyi d23c7c713c Fixed #789 Fix utest/ctest.h on Mingw. 2016-03-05 09:34:37 -05:00
Zhang Xianyi 8c43d7fa5f Merge remote-tracking branch 'origin/power8' into develop
Refs #774
2016-03-05 06:03:19 -05:00
Werner Saar 085f215257 Modified assembly label name, so that they are hidden.
Added license informations.
2016-03-05 10:27:27 +01:00
Zhang Xianyi 8f758eeff9 Refs #786. avoid old assembly c/zgemv kernels. 2016-03-05 08:32:03 +08:00
Werner Saar 0afc76fd65 enabled gemm_beta assembly kernels 2016-03-04 15:01:15 +01:00
Werner Saar 91e1c5080c modified configuration, to use power6 sgemm kernel for power8 2016-03-04 13:38:57 +01:00
Werner Saar 73f04c2c72 enabled hemv assemly function for power8 2016-03-04 13:20:50 +01:00
Werner Saar 3e633152c6 enabled symv assembly kernels on power8 2016-03-04 13:08:18 +01:00
Werner Saar d5130ce7e3 enabled gemv assembly on power8 2016-03-04 12:53:31 +01:00
Werner Saar 4824b88fcb enabled all level1 assembly kernels for power8 2016-03-04 12:35:25 +01:00
Werner Saar cc26d888b8 BUGFIX: increased BUFFER_SIZE for POWER8 2016-03-04 10:26:53 +01:00
Zhang Xianyi 8577be2a95 Modify travis script. 2016-03-04 04:24:43 +08:00
Zhang Xianyi 1edf30b790 Change Opteron(SSE3) to Opteron_SSE3 at dyanmaic core name. 2016-03-01 20:13:08 +08:00
Werner Saar b752858d6c added dgemm-, dtrmm-, zgemm- and ztrmm-kernel for power8 2016-03-01 07:33:56 +01:00
Zhang Xianyi 4fc8c937d4 Refs #695 add testcase. 2016-03-01 01:05:56 -05:00
Zhang Xianyi efa4f5c936 Refs #695 #783. Replace default x86_64 cgemv_t
asm kernel by C kernel.
2016-03-01 11:18:56 +08:00
Zhang Xianyi 17d655fa64 Merge pull request #784 from peterph/develop
collected usage notes
2016-02-27 11:24:20 -05:00
Petr Cerny f68141cf1d collected usage notes 2016-02-27 16:57:22 +01:00
Zhang Xianyi aa90518201 Update Changelog for 0.2.16.rc1. 2016-02-24 15:21:22 -05:00
Zhang Xianyi 6b85dbb6dc Refs #696. Turn off stack limit setting on Linux.
I cannot reproduce SEGFAULT of lapack-test with default stack size
on ARM Linux.
2016-02-24 14:21:42 -05:00
Zhang Xianyi a0debd4293 Refs #696. Turn off stack limit setting on Linux.
I cannot reproduce SEGFAULT of lapack-test with default stack size
on ARM Linux.
2016-02-24 14:18:39 -05:00