Commit Graph

  • 5118a7f4d1 small optimizations on dgemm_kernel for Piledriver wernsaar 2013-10-31 11:53:26 +0100
  • e172b70ea2 added cgemm_kernel for Piledriver wernsaar 2013-10-31 08:38:17 +0100
  • 1cf4b974b2 added zgemm_kernel for Piledriver wernsaar 2013-10-30 09:12:17 +0100
  • 7bccff1512 added sgemm_kernel for PILEDRIVER wernsaar 2013-10-29 22:53:04 +0100
  • afe44b0241 tests and code cleanup of gemm_kernels for HASWELL wernsaar 2013-10-28 14:23:48 +0100
  • a77c71eaf5 added highly optimized dgemm_kernel for HASWELL wernsaar 2013-10-28 10:23:47 +0100
  • b2219b3478 Merge pull request #311 from loladiro/patch-1 Zhang Xianyi 2013-10-24 23:41:22 -0700
  • f5a0038bad Use FC instead of CC to link the dynamic library on OS X Keno Fischer 2013-10-23 18:43:00 -0400
  • c937090121 Added gfortran dependency for LSB/lsbcc. Zhang Xianyi 2013-10-22 13:24:47 +0800
  • fe8c5666f9 optimized dgemm_kernel for HASWELL wernsaar 2013-10-20 16:52:26 +0200
  • f6b50057e2 corrected and tested FMA3 Code wernsaar 2013-10-19 10:52:20 +0200
  • 2840d56aeb added dgemm_kernel for Piledriver wernsaar 2013-10-19 09:47:15 +0200
  • 2d49db2f5b moved compiler flags from Makefile.rule to Makefile.arm wernsaar 2013-10-16 19:04:42 +0200
  • 04391e6d9c optimized param.h wernsaar 2013-10-16 18:04:34 +0200
  • 85484a42df added kernels for cgemm, ctrmm, zgemm and ztrmm wernsaar 2013-10-16 18:00:41 +0200
  • 3983011f0b added sgemm- and strmm_kernel wernsaar 2013-10-14 08:22:27 +0200
  • 2a1515c9dd added dgemm_ncopy_4_vfpv3.S wernsaar 2013-10-12 16:48:29 +0200
  • 31f51e78bc minor optimizations on dgemm_kernel wernsaar 2013-10-12 09:42:18 +0200
  • beffee7d91 Fixed buffer overflow bug in kernel/x86_64/dgemv_t.S file. wangqian 2013-10-11 03:20:20 +0800
  • a35f4343fa Merge pull request #301 from yieldthought/develop Zhang Xianyi 2013-10-09 00:46:49 -0700
  • 43457f552d Merge ce5626a384 into 16eb780e13 yieldthought 2013-10-08 07:38:28 -0700
  • ce5626a384 Remove -Wl,--retain-symbols-file from dynamic library linking to fix tool support yieldthought 2013-10-08 16:37:17 +0200
  • e0b968c3a7 Changed kernels for dgemm and dtrmm wernsaar 2013-10-05 12:59:44 +0200
  • 93f1074dd4 changed some values for arm wernsaar 2013-09-30 18:03:56 +0200
  • 1c63180bb6 updated dgemm_kernel_8x2_vfpv3.S wernsaar 2013-09-30 17:31:23 +0200
  • 22a8fcc4b7 add modified c_check perl program wernsaar 2013-09-29 19:42:33 +0200
  • 9965d48005 added Makefile.arm wernsaar 2013-09-29 18:55:21 +0200
  • 4a474ea7dc changed dgemm_kernel to use fused multiply add wernsaar 2013-09-29 17:46:23 +0200
  • 69ce737cc5 modified Makefile.L3 for ARM wernsaar 2013-09-28 19:13:47 +0200
  • d13788d1b4 common files modified for ARM wernsaar 2013-09-28 19:10:32 +0200
  • 70411af888 initial checkin of kernel/arm wernsaar 2013-09-28 19:02:25 +0200
  • 16eb780e13 Refs #262. Fixed compatibility issues of GNU stack markings with PathScale EKOPath(tm) Compiler Suite: Version 4.0.12.1 Zhang Xianyi 2013-09-22 09:37:59 +0800
  • 5729088994 Initial checkin of port for ARM wernsaar 2013-09-16 14:41:37 +0200
  • a746724e84 Added backers. Zhang Xianyi 2013-09-05 15:39:45 +0800
  • 8cd23f206c Merge baff5d6ba6 into 3f7b0cd994 Lars Buitinck 2013-08-28 09:36:56 -0700
  • 3f7b0cd994 Merge pull request #290 from larsmans/missing-threshold Lars Buitinck 2013-08-28 17:20:16 +0200
  • cc6db2ecfe Merge pull request #291 from larsmans/fix-makefile-prefix Zhang Xianyi 2013-08-28 09:26:16 -0700
  • 01e4c13543 Merge a29e6592da into 3175be4b3d Lars Buitinck 2013-08-28 09:25:33 -0700
  • 3175be4b3d Merge pull request #289 from larsmans/no-noconst Zhang Xianyi 2013-08-28 09:25:23 -0700
  • a29e6592da fix default prefix handling in makefiles Lars Buitinck 2013-08-28 17:39:54 +0200
  • baff5d6ba6 check if GEMM_MULTITHREAD_THRESHOLD defined in gemm.c Lars Buitinck 2013-08-28 17:20:16 +0200
  • 342af78706 Merge 212463dce9 into 037bd82bef Lars Buitinck 2013-08-28 07:55:46 -0700
  • 212463dce9 get rid of the generated cblas_noconst.h file Lars Buitinck 2013-08-28 16:52:24 +0200
  • 037bd82bef Merge pull request #288 from sebastien-villemot/develop Zhang Xianyi 2013-08-28 06:26:37 -0700
  • bdff0a9502 Merge eae4cfa3f6 into fe98de2f68 Sébastien Villemot 2013-08-28 05:34:53 -0700
  • eae4cfa3f6 Avoid failure on qemu guests declaring an Athlon CPU without 3dnow! Sébastien Villemot 2013-08-28 14:27:59 +0200
  • 6c4a7d0828 Import AMD Piledriver DGEMM kernel generated by AUGEM. So far, this kernel doesn't deal with edge. Zhang Xianyi 2013-08-25 10:16:01 -0300
  • fe98de2f68 Merge branch 'bulldozer' into develop Zhang Xianyi 2013-08-24 11:46:18 -0300
  • db389b5915 Refs #281. Detect __CYGWIN__ macro for Cygwin x86_64. Zhang Xianyi 2013-08-24 13:09:49 +0800
  • 52f587db7f Refs #281. Detect _WIN32 macro for Windows API. Zhang Xianyi 2013-08-24 01:10:02 +0800
  • 067e8417fd removed unnecessary instructions from zgemm_kernel_2x2_bulldozer.S wernsaar 2013-08-17 06:46:17 +0200
  • a82da3d069 removed unnecessary instructions wernsaar 2013-08-16 20:23:34 +0200
  • 1569bf14f8 Refs #282. Fixed zgemv_n typo bug on Win64. Zhang Xianyi 2013-08-23 16:27:17 +0800
  • df554aebd2 Merge pull request #280 from ViralBShah/develop Zhang Xianyi 2013-08-21 08:21:51 -0700
  • fe4ca7e036 Merge eae6920f2d into c92ae012a6 Viral B. Shah 2013-08-21 06:45:44 -0700
  • eae6920f2d Patch LAPACK XLASD4.f as discussed in JuliaLang/julia#2340 Viral B. Shah 2013-08-21 19:14:07 +0530
  • c92ae012a6 Refs #279. Provide ONLY_CBLAS flag. If you only need CBLAS without a fortran compiler, please try make ONLY_CBLAS=1. Zhang Xianyi 2013-08-21 00:03:25 +0800
  • f51a849d91 Merge pull request #278 from wernsaar/haswell Zhang Xianyi 2013-08-17 08:24:37 -0700
  • 333c1b8431 removed unnecessary instructions from zgemm_kernel_2x2_bulldozer.S wernsaar 2013-08-17 06:46:17 +0200
  • fa3f1cd125 removed unnecessary instructions wernsaar 2013-08-16 20:23:34 +0200
  • 035605ffe1 Merge 44ef70420c into 2638370844 wernsaar 2013-08-16 10:32:31 -0700
  • 44ef70420c added cgemm_kernel_8x2_haswell.S wernsaar 2013-08-16 18:54:56 +0200
  • d488b1b1aa added zgemm_kernel_4x2_haswell.S wernsaar 2013-08-16 10:29:47 +0200
  • 4070d9a123 added dgemm_kernel_16x2_haswell.S wernsaar 2013-08-15 19:17:20 +0200
  • 0b90c0ec64 added sgemm_kernel_16x4_haswell.S wernsaar 2013-08-15 18:46:14 +0200
  • 2b8ab8f55b sgemm_kernel_16x4_haswell.S minor changes wernsaar 2013-08-14 01:44:41 +0200
  • 1cb9579cd0 added zgemm_kernel_4x2_haswell.S and fixed a bug in sgemm_kernel_16x4_haswell.S wernsaar 2013-08-14 01:23:15 +0200
  • 2638370844 Init code base for Intel Haswell. Zhang Xianyi 2013-08-13 00:54:59 +0800
  • 89637f87c8 added sgemm- and dgemm-kernel for HASWELL processor wernsaar 2013-08-12 18:04:10 +0200
  • c0b1e41bec Merge branch 'bulldozer' into develop Zhang Xianyi 2013-08-12 23:22:10 +0800
  • 49faee1a51 Fixed #276. Merge branch 'wernsaar-develop' into bulldozer Zhang Xianyi 2013-08-09 10:49:44 +0800
  • c0159d44a3 Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop Zhang Xianyi 2013-08-09 10:48:46 +0800
  • b3220e63e2 Merge c17a850c1c into 79ba52115d wernsaar 2013-08-08 09:03:35 -0700
  • c17a850c1c modified KERNEL.BULLDOZER wernsaar 2013-08-08 17:49:30 +0200
  • 099853fff6 added dtrsm_kernel_RN_8x2_bulldozer.S wernsaar 2013-08-08 07:14:08 +0200
  • 44d23881b5 dtrsm_kernel_LT_8x2_bulldozer.S performance optimization wernsaar 2013-08-05 11:27:16 +0200
  • 2905042c6a Refs #270 #268. Merge branch 'wernsaar-develop' into bulldozer Zhang Xianyi 2013-08-05 16:17:15 +0800
  • 32fb6b9bb2 Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop Zhang Xianyi 2013-08-05 16:09:47 +0800
  • 7871dc133b Merge aaeb8eaecd into 79ba52115d wernsaar 2013-08-05 01:09:28 -0700
  • 673e453b3f Enable bulldozer kernels. Zhang Xianyi 2013-08-05 16:07:54 +0800
  • 143cca4dd5 Merge branch 'develop' into bulldozer Zhang Xianyi 2013-08-05 15:51:53 +0800
  • aaeb8eaecd modified dtrsm_kernel_LT_8x2_bulldozer.S wernsaar 2013-08-04 12:16:12 +0200
  • 8aeec32ea0 modified dtrsm_kernel_LT_8x2_bulldozer.S wernsaar 2013-08-04 10:15:33 +0200
  • 87fc9de572 added dtrsm_kernel_LT_8x2_bulldozer.S wernsaar 2013-08-04 09:54:40 +0200
  • 564aa60fec removed dtrsm_kernel_LT_8x2_bulldozer.S wernsaar 2013-08-03 15:40:51 +0200
  • f645665dd6 fixed bug in dgemv_t_bulldozer.S wernsaar 2013-08-03 12:19:29 +0200
  • e45a347cd2 repaired trmm bug in sgemm_kernel_16x2_bulldozer.S wernsaar 2013-08-03 11:43:25 +0200
  • 99727ac013 repaired trmm bug in cgemm_kernel_4x2_bulldozer.S wernsaar 2013-08-03 10:32:51 +0200
  • 6e0a2fbc0c repaired trmm bug in zgemm_kernel_2x2_bulldozer.S wernsaar 2013-08-03 10:17:08 +0200
  • 0a22f99c58 repaired trmm bug in dgemm_kernel_8x2_bulldozer.S wernsaar 2013-08-03 09:35:39 +0200
  • 79ba52115d Merge branch 'hotfix-v0.2.8' into develop Zhang Xianyi 2013-08-01 23:57:19 +0800
  • 835293cc1a Merge branch 'hotfix-v0.2.8' v0.2.8 Zhang Xianyi 2013-08-01 23:53:12 +0800
  • b736aa8110 Update the doc for 0.2.8 version. Zhang Xianyi 2013-08-01 23:52:43 +0800
  • ae521ecc3e OpenBLAS 0.2.8 rc1. Zhang Xianyi 2013-07-31 14:49:16 +0800
  • 36adfe8d64 Merge branch 'hotfix-v0.2.8' into develop Zhang Xianyi 2013-07-31 14:46:56 +0800
  • a07cc39571 Refs #266. Fixed the compiling bug with Open64 5.0. Zhang Xianyi 2013-07-31 14:41:39 +0800
  • cff70a666d added generic trmm kernels and modified Makefile.L3 wernsaar 2013-07-30 20:18:57 +0200
  • b5c2ac4fd6 Fixed #264 the memory leak bug in dtrtri_U. Zhang Xianyi 2013-07-29 23:21:10 +0800
  • 749f45ffc8 Fixed the FMA3 detection bug. Zhang Xianyi 2013-07-27 22:37:57 +0800
  • 534c5ec919 Fixed #261. Use strncmp instead of a comparing trick. Zhang Xianyi 2013-07-26 23:43:54 +0800