Commit Graph

289 Commits

Author SHA1 Message Date
wernsaar 47b22763f8 reduced stack usage on windows to 16K 2014-04-24 14:09:26 +02:00
wernsaar 9db0fb8b02 bugfix for sdsdot 2014-02-28 14:59:36 +01:00
wernsaar f9daebba0a checked in bugfixes for ARM 2014-02-16 11:45:47 +01:00
Zhang Xianyi 9a557e90da Refs #340. Fixed SEGFAULT bug of dgemv_n on OSX. 2014-02-15 23:23:15 +08:00
wangqian 2d557eb1e0 Fixed computational error of dgemv_n. 2014-02-04 21:47:51 +08:00
Zhang Xianyi 05bb391c3a Refs #330. Fixed the compatible issue with clang on Mac OSX. 2013-12-16 20:31:17 +08:00
Zhang Xianyi 9b5be29886 Refs #310. Fixed Segfault bug on nehalem when Julia calling dgeqrt3 on OSX.
Please also check JuliaLang/julia#4099
Julia test script:
  A=rand(256, 256)
  qrfact(A)

I found this was a bug in kernel/x86_64/dgemm_ncopy_8.S.
However, I cannot use gdb with julia. Thus, this is a walkaround fix.
2013-12-12 23:23:04 +08:00
wernsaar 53eaf41901 added support for HASWELL 2013-12-02 13:17:51 +01:00
wernsaar 9423f980f6 modified trsm kernel 2013-12-02 10:08:14 +01:00
wernsaar c6156b2ef2 added trsm kernels from origin 2013-12-01 22:39:39 +01:00
wernsaar 034a5b2083 modified zsymv 2013-12-01 21:07:49 +01:00
wernsaar 27d4234d4d merged symv 2013-12-01 20:56:02 +01:00
wernsaar 402d6e91db Merge remote branch 'origin/develop' into armv7 2013-12-01 18:18:40 +01:00
wernsaar b3254eecaf Merge remote branch 'origin/haswell' into develop 2013-12-01 18:09:12 +01:00
wernsaar d910404f00 Merge remote branch 'origin/piledriver' into develop 2013-12-01 18:06:51 +01:00
wernsaar ffe70b1fdc modified Makefile.L3 2013-12-01 17:58:46 +01:00
wernsaar 0b6e13b689 Merge remote branch 'origin/develop' into haswell 2013-12-01 13:38:11 +01:00
wernsaar e09dc279a2 Merge remote branch 'origin/develop' into piledriver 2013-12-01 13:33:18 +01:00
wernsaar 4be4db590c Merge remote branch 'origin/develop' into armv7 2013-12-01 13:16:41 +01:00
wernsaar 5c648a8984 Merge remote branch 'origin/develop' into haswell 2013-12-01 11:25:33 +01:00
wernsaar c44dc4dd3c Merge remote branch 'origin/develop' into piledriver 2013-12-01 11:06:36 +01:00
wernsaar 9d3fae15a8 Merge branch 'develop' into armv7 2013-12-01 10:12:07 +01:00
wernsaar 2d3c884294 added complex gemv kernels for ARMV6 and ARMV7 2013-11-29 17:06:33 +01:00
wernsaar d54a061713 optimized gemv_n_vfp.S 2013-11-28 17:40:21 +01:00
wernsaar 86afb47e83 added optimized ctrmm kernel for ARMV6 2013-11-28 14:35:07 +01:00
wernsaar 42a4dff056 added optimized ztrmm kernel for ARMV6 2013-11-28 13:41:06 +01:00
wernsaar 5bc322a66c optimized strmm kernel for ARMV6 2013-11-28 12:45:38 +01:00
wernsaar dec7ad0dfd optimized dtrmm kernel for ARMV7 2013-11-28 12:32:12 +01:00
wernsaar 274304bd03 add optimized cgemm kernel for ARMV6 2013-11-28 11:54:38 +01:00
wernsaar 5007a534c4 optimized zgemm kernel for ARMV6 2013-11-28 10:04:43 +01:00
wernsaar a537d7d8d7 optimized zgemm_kernel_2x2_vfp.S 2013-11-28 08:33:44 +01:00
wernsaar b42145834f optimized sgemm kernel for ARMV6 2013-11-28 08:08:08 +01:00
wernsaar 3d5e792c72 optimized sgemm kernel for ARMV6 2013-11-27 18:38:32 +01:00
wernsaar a9bd12da2c optimized dgemm kernel for ARMV6 2013-11-27 17:37:38 +01:00
wernsaar 697e198e8a added zgemm_kernel for ARMV6 2013-11-27 16:15:06 +01:00
wernsaar 36b0f7fe1d added optimized gemv_t kernel for ARMV6 2013-11-25 19:31:27 +01:00
wernsaar d2b20c5c51 add optimized axpy kernel 2013-11-25 12:25:58 +01:00
wernsaar fe5f46c330 added experimental support for ARMV8 2013-11-24 15:47:00 +01:00
wernsaar 25c6050593 add single and double precision gemv_n kernel for ARMV6 2013-11-24 12:03:28 +01:00
wernsaar 12e02a00e0 added ncopy kernels for ARMV6 2013-11-24 08:46:47 +01:00
wernsaar 29a3196f56 added optimized sgemm and strmm kernel for ARMV6 2013-11-23 18:09:41 +01:00
wernsaar 8776a73773 added optimized dgemm and dtrmm kernel for ARMV6 2013-11-23 16:24:52 +01:00
wernsaar 7e84acd3e8 fixed bug in SAVE macros, that are not found by any test routine 2013-11-23 14:35:19 +01:00
wernsaar 33d3ab6e09 small optimizations for zgemv kernels 2013-11-23 12:35:31 +01:00
wernsaar 9a0f978929 added nrm2 kernel for ARMV6 2013-11-22 17:21:10 +01:00
wernsaar 7f210587f0 renamed some ncopy and tcopy files 2013-11-22 00:20:25 +01:00
wernsaar 9f0a3a35b3 removed obsolete file sdot_vfpv3.S 2013-11-21 23:42:54 +01:00
wernsaar dbae93110b added sdot_vfp.S 2013-11-21 23:34:51 +01:00
wernsaar 19cd5c64a2 renamed swap_vfpv3.S to swap_vfp.S 2013-11-21 23:19:32 +01:00
wernsaar 9adf87495e renamed some dot kernels 2013-11-21 23:07:51 +01:00