Commit Graph

  • 960b0c88a7 Refs #227. Detected LLVM/Clang compiler. Zhang Xianyi 2013-06-06 23:43:40 +0800
  • 65ffead0cf Refs #124. Check XSAVE flag on x86 CPU. Zhang Xianyi 2013-06-06 22:50:43 +0800
  • f2fb8c7035 Change LIBSUFFIX from .lib to .a on windows. Zhang Xianyi 2013-06-04 16:05:28 +0800
  • 9f59f384d8 Refs #223. Fixed s/dgemv bug on windows. Zhang Xianyi 2013-06-04 16:01:05 +0800
  • 23965f164c Fixed overflow internal buffer bug of (s/d/c/z)gemv on x86_64. wangqian 2013-05-29 19:48:31 +0800
  • 6a72840945 Fixed overflow internal buffer bug of (s/d/c/z)gemv on x86. wangqian 2013-05-29 13:23:12 +0800
  • 947457fb7c Fixed the bug about testing the exist of lapack tar package. Zhang Xianyi 2013-05-24 15:52:35 +0800
  • 79120bf9a0 Refs #205. Merge boegel's codes about downloading LAPACK. Zhang Xianyi 2013-05-24 15:29:10 +0800
  • acb11905d5 Fixed #199. Saved USE_THREAD switch for make install. Zhang Xianyi 2013-05-24 15:15:52 +0800
  • 109500178c Refs #220. Support Power7 by old Power6 kernels. Zhang Xianyi 2013-05-21 22:59:45 +0800
  • e50a664865 Refs #215. Fixed the compatible between <complex.h> and <complex> in C++. Zhang Xianyi 2013-05-17 16:41:05 +0800
  • 357078b93e Refs #216. Revert the default value of GEMM_MULTITHREAD_THRESHOLD to 4. Zhang Xianyi 2013-05-03 09:08:54 +0800
  • 731220f870 changed DGEMM_DEFAULT_P and DGEMM_DEFAULT_Q to 248 for BULLDOZER 64bit wernsaar 2013-04-30 10:07:17 +0200
  • 69aa6c8fb1 bad performance with some data wernsaar 2013-04-28 11:14:23 +0200
  • 60b263f3d2 removed trsm_kernel_RT_4x4_bulldozer.S. wrong results wernsaar 2013-04-27 17:23:08 +0200
  • 7ac306e0da added trsm_kernel_RT_4x4_bulldozer.S wernsaar 2013-04-27 16:48:48 +0200
  • 4cb454cdf2 added trsm_kernel_LT_4x4_bulldozer.S wernsaar 2013-04-27 14:30:00 +0200
  • 19ad2fb128 prefetch improved. Defined 2 different kernels for inner loop wernsaar 2013-04-27 13:40:49 +0200
  • 5d96e4f224 Refs #210. Disable checking /lib/libpthread.so*. Zhang Xianyi 2013-04-27 15:02:04 +0800
  • 6821677489 minor improvements and code cleanup wernsaar 2013-04-26 20:05:42 +0200
  • dbbda55e67 Updated the mailing list for OpenBLAS. Xianyi Zhang 2013-04-25 00:45:42 +0800
  • 6c34a7f43c Updated the mailing list for OpenBLAS. Xianyi Zhang 2013-04-25 00:44:22 +0800
  • 3326f3152c Merge pull request #213 from wernsaar/develop Zhang Xianyi 2013-04-17 23:56:09 -0700
  • c7fdc692c9 Merge 7641f6e253 into 48bdc1ad3b wernsaar 2013-04-16 10:15:33 -0700
  • 7641f6e253 Merged some improvements into dgemm_kernel_4x4_bulldozer.S. Changed the copy functions to generic to solve prefetch conflicts wernsaar 2013-04-16 19:05:06 +0200
  • 48bdc1ad3b Added NO_PARALLEL_MAKE flag to disable parallel make. Zhang Xianyi 2013-04-15 21:37:30 +0800
  • 3ad29452d1 Merge pull request #211 from wernsaar/develop Zhang Xianyi 2013-04-15 00:20:55 -0700
  • e2fc2344ce Merge 6e3f6f25a5 into a068d54981 wernsaar 2013-04-12 09:11:40 -0700
  • 6e3f6f25a5 New version of dgemm_kernel_4x4_bulldozer.S The peak performance with 8 cores is now 90 GFlops wernsaar 2013-04-12 17:55:51 +0200
  • 986d542acb Merge branch 'loongson3a' into loongson3b Xianyi Zhang 2013-04-11 16:07:59 +0800
  • 990efcab6e Merge branch 'loongson3b' into loongson3a Zhang Xianyi 2013-04-11 16:11:03 +0000
  • 75a5dc3975 Added the configure for the host loongcc compiling on Loongson3. Zhang Xianyi 2013-04-11 16:10:47 +0000
  • 6958c1a1aa Fixed the SEGFAULT bug with Loongcc and Loongson3. Xianyi Zhang 2013-04-11 15:33:43 +0800
  • a068d54981 Refs #209. Export the missing cblas_cdotc_sub functions. Zhang Xianyi 2013-04-08 23:21:28 +0800
  • d692ee07f7 Merge branch 'loongson3a' into loongson3b Xianyi Zhang 2013-04-08 14:56:39 +0800
  • 1a57717b1a Added the configuration of Loongcc compiler for Loongson 3 CPU. Xianyi Zhang 2013-04-07 15:42:07 +0800
  • 6b01d58712 Disable the optimization of muli-threading gemm on the Loongson3A. Xianyi Zhang 2013-03-30 20:12:43 +0000
  • 35b943f17f Merge branch 'develop' into loongson3a Xianyi Zhang 2013-03-27 14:36:15 +0000
  • e029242870 Merge pull request #206 from wlbksy/patch-1 Zhang Xianyi 2013-03-23 09:57:41 -0700
  • f8c8895295 Merge 7a9b94b519 into f4846afbad wlbksy 2013-03-22 23:41:47 -0700
  • 7a9b94b519 Fix #204 wlbksy 2013-03-23 14:41:26 +0800
  • e3c21da90a Merge 66b919d99f into f4846afbad Kenneth Hoste 2013-03-22 11:47:05 -0700
  • 66b919d99f adjusted Makefile to allow for provided required LAPACK source files rather than downloading them Kenneth Hoste 2013-03-22 19:45:11 +0100
  • f4846afbad Merge pull request #201 from Explorer09/develop Zhang Xianyi 2013-03-18 07:31:30 -0700
  • 17176ae7e2 Merge 53588bc786 into d831b2ff8b Explorer09 2013-03-17 08:16:26 -0700
  • 53588bc786 getarch.c: Minor re-ordering of architecture list Explorer09 2013-03-17 23:09:23 +0800
  • b47f13ee4c getarch.c: Minor re-ordering of architecture list Explorer09 2013-03-17 23:07:48 +0800
  • 309f90e563 TargetList.txt: minor re-ordering Explorer09 2013-03-17 23:03:05 +0800
  • 773c01f496 Typo correction in README.md Explorer09 2013-03-17 22:48:24 +0800
  • d831b2ff8b Override CFLAGS in LAPACK make.in. Zhang Xianyi 2013-03-10 01:01:16 +0800
  • 724ae159ce Fixed the Windows x86_64 ABI bug in s/daxpy kernels. Zhang Xianyi 2013-03-08 22:28:34 +0800
  • 2c9a203bd1 Merge pull request #198 from wernsaar/develop Zhang Xianyi 2013-03-06 13:39:53 -0800
  • 65e54956d8 Merge f300ce3df5 into e2c7c75715 wernsaar 2013-03-06 09:04:11 -0800
  • f300ce3df5 new optimization of dgemm kernel for bulldozer: 10% performance increase wernsaar 2013-03-06 17:26:03 +0100
  • e2c7c75715 Merge pull request #197 from wernsaar/develop Zhang Xianyi 2013-03-06 01:11:08 -0800
  • 059e985dbb Merge 66e64131ed into 5900b1462e wernsaar 2013-03-05 10:59:43 -0800
  • 66e64131ed optimized again bulldozer dgemm kernel wernsaar 2013-03-05 19:51:37 +0100
  • 5900b1462e Merge pull request #195 from wernsaar/develop Zhang Xianyi 2013-03-05 05:35:42 -0800
  • 901230f0df Merge 9405f26f4b into 529f1b5006 wernsaar 2013-03-04 08:59:38 -0800
  • 9405f26f4b new dgemm_kernel for bulldozer wernsaar 2013-03-04 17:37:38 +0100
  • 54e7b37630 Merge branch 'develop' v0.2.6 Zhang Xianyi 2013-03-02 14:42:06 +0800
  • 529f1b5006 Refs#194. Export the missing LAPACK s/dlamc3 functions. Zhang Xianyi 2013-03-02 14:41:18 +0800
  • e5ac3007e0 Merge branch 'develop' Zhang Xianyi 2013-03-02 14:24:23 +0800
  • 0d0405b434 Updated the doc for 0.2.6 version. Zhang Xianyi 2013-03-02 14:22:27 +0800
  • f1ce74ffdd Improved the print when OS don't support AVX. Zhang Xianyi 2013-03-02 14:15:54 +0800
  • d744c9590a In OpenMP threading, preallocate the thread buffer instead of allocating the buffer every time. This patch improved the performance slightly. Zhang Xianyi 2013-03-01 14:36:47 +0800
  • 3cc6ae793e Refs #174. Return sb pointer when OpenMP or Windows. Zhang Xianyi 2013-02-26 00:48:21 +0800
  • 4c2123c334 Fixed the overflowing bug in single thread cholesky factorization. Zhang Xianyi 2013-02-23 12:51:13 +0800
  • 5155e3f509 Refs #174. Fixed the overflowing buffer bug of multithreading hbmv and sbmv. Zhang Xianyi 2013-02-13 16:05:58 +0800
  • 5c8bf6ae0e Merge branch 'bulldozer' into develop Zhang Xianyi 2013-02-10 01:19:42 +0800
  • 6ae2f868fd Set the affinity. Only use 1 core of each module on bulldozer. Zhang Xianyi 2013-02-09 18:18:55 +0100
  • a1ead62f28 Disable the warning of sgemm bulldozer kernel. Zhang Xianyi 2013-02-09 17:03:13 +0100
  • 0133580148 Used sgemm bulldozer kernel on 64 bit. Zhang Xianyi 2013-02-09 16:29:14 +0100
  • 274246651d Merge branch 'bulldozer' of git://github.com/wernsaar/OpenBLAS into bulldozer Zhang Xianyi 2013-02-09 16:25:07 +0100
  • 299b5a44dc Merge branch 'develop' of github.com:xianyi/OpenBLAS into bulldozer Zhang Xianyi 2013-02-09 16:22:04 +0100
  • a9500d0079 Missing line continuation -- follow-up to last commit (64ad8b9809). Zaheer Chothia 2013-02-01 09:34:12 +0100
  • 64ad8b9809 Refs #193. Don't use C99 complex numbers when building C++ code. Zaheer Chothia 2013-02-01 09:24:44 +0100
  • 875d520ccf Refs #193. cblas: move #include out of extern "C" block. Zaheer Chothia 2013-01-31 08:48:27 +0100
  • d311236dfd Refs #189. Fixed the bug of s/cdot about invalid reading NAN on x86_64. Zhang Xianyi 2013-01-25 16:18:27 +0800
  • 36e0982966 Refs #187. Use perl to generate cblas_noconst.h instead of sed. Zhang Xianyi 2013-01-22 00:29:54 +0800
  • 8cdb795438 Refs #187. Use binary code for xgetbv, which is compatible with old compiler. Zhang Xianyi 2013-01-22 00:18:21 +0800
  • 4db6660de4 Refs #185. Add missing 'const' to declarations in <cblas.h>. Thanks to Dan Povey! Zaheer Chothia 2013-01-20 21:53:52 +0100
  • 0b08f7479e Refs #154. Fixed gemv_t bug about overflow 16MB buffer on x86. Zhang Xianyi 2013-01-20 21:22:12 +0800
  • 200e4acf15 cblas: typedef enums for improved compatibility with Intel MKL. Zaheer Chothia 2012-06-25 13:51:46 +0200
  • 99d1978df7 Fixed #180. the typos in kernel/x86_64/sgemv_t.S Zhang Xianyi 2013-01-12 12:31:14 +0800
  • 08bf6674d5 Refs #177. Fixed sgemv_t compiling bug on Win64. Zhang Xianyi 2013-01-05 11:36:39 +0800
  • 8b122ff9dc Refs #176. Fixed make.inc overriding RANLIB bug when cross-compiling LAPACK. Zhang Xianyi 2013-01-03 01:47:31 +0800
  • 69200884e1 Refs #173. Fixed overflow internal buffer bug of gemv_n on x86 Zhang Xianyi 2012-12-25 09:27:49 +0800
  • 0d1518add9 Refs #173. Fixed overflow internal buffer bug of sgemv_t on x86 Zhang Xianyi 2012-12-25 09:10:17 +0800
  • 91ed4e4450 Refs #171. Prevent loading the dirty number from the buffer in sgemv_t x86 kernel. Zhang Xianyi 2012-12-23 23:14:17 +0800
  • fd3046b32a Refs #173. Fixed overflow internal buffer bug of gemv_t on x86. Zhang Xianyi 2012-12-23 21:47:22 +0800
  • a4ee6f3915 Fixed #172. Support Intel Xeon E7540. Zhang Xianyi 2012-12-18 08:57:46 +0800
  • a0363e9b48 Merge branch 'master' into develop Zhang Xianyi 2012-12-18 08:51:30 +0800
  • 13589ab54e Merge 7269ff6580 into b471d52e61 Julian Taylor 2012-12-17 13:30:32 -0800
  • 7269ff6580 support Xeon E7540 Julian Taylor 2012-12-17 19:55:57 +0100
  • b471d52e61 Merge pull request #170 from juliantaylor/athlon-defaults Zhang Xianyi 2012-12-15 15:50:02 -0800
  • 1607e80948 Merge 9fb341a9f8 into 97f68f7f3a Julian Taylor 2012-12-15 07:07:17 -0800
  • 9fb341a9f8 set parameters for CORE_ATHLON Julian Taylor 2012-12-15 16:05:33 +0100
  • fba6b590f2 Merge branch 'master' into develop Zhang Xianyi 2012-12-15 22:49:37 +0800
  • 97f68f7f3a Merge pull request #169 from juliantaylor/sanity-check-cpu Zhang Xianyi 2012-12-15 06:46:48 -0800