Commit Graph

512 Commits

Author SHA1 Message Date
Zhang Xianyi
cc522aa21d Use quiet make for Travis CI. 2013-07-05 14:52:57 +08:00
Zhang Xianyi
9c78fad721 Install gfortran in Travis CI. 2013-07-05 11:11:18 +08:00
Zhang Xianyi
6028232ad1 Added travis.yml file. 2013-07-04 23:30:53 +08:00
Zhang Xianyi
feb9a3889a Improved make clean on Mac OS X. 2013-07-02 14:37:30 +08:00
Zhang Xianyi
32dbeb636d Refs #221. Set stack limit to 16MB to prevent a SEGFAULT bug on Mac OS X with DYNAMIC_ARCH=1 & NUM_THREADS=256. 2013-07-02 14:17:55 +08:00
Zhang Xianyi
57944538b6 Use ALIGN_5 instead of .algin 32 in assembly kernel. Added ALIGN_5 for 32-bit OSX. 2013-07-01 16:09:05 +08:00
Zhang Xianyi
3ce2c62b0b Merge pull request #242 from danluu/readme.haswell: Update README to reflect Haswell support, etc. 2013-06-30 09:40:32 -07:00
Dan Luu
50464997a3 Fix miscellaneous typos 2013-06-30 11:36:13 -05:00
Zhang Xianyi
8e7cad1650 Fixed #217 openblas_config.h bug on Windows 64. 2013-07-01 00:35:14 +08:00
Dan Luu
590e6aeafc Add Haswell support 2013-06-30 11:35:00 -05:00
Dan Luu
88ef307cef Refs #241. Add Haswell support (using sandybridge optimizations) 2013-06-30 22:35:14 +08:00
Zhang Xianyi
6e8501c8a1 Fixed #239 bug in param.h about BARCELONA and BULLDOZER. 2013-06-29 10:36:01 +08:00
Zhang Xianyi
fa916a0fac Fixed #238 bug in lsame on x86. 2013-06-28 22:43:41 +08:00
Zhang Xianyi
fb298b34ae Merge pull request #235 from wernsaar/develop: Added ddot, daxpy, dcopy kernels for AMD bulldozer. 2013-06-21 17:59:26 -07:00
wernsaar
16012767f4 added dcopy_bulldozer.S 2013-06-21 16:06:51 +02:00
wernsaar
bcbac31b47 added ddot_bulldozer.S 2013-06-20 16:15:09 +02:00
wernsaar
8dc0c72583 added daxpy_bulldozer.S 2013-06-20 14:07:54 +02:00
wernsaar
89405a1a0b cleanup of dgemm_ncopy_8_bulldozer.S 2013-06-19 19:31:38 +02:00
wernsaar
4f2b12b8a8 added dgemv_t_bulldozer.S 2013-06-19 17:32:42 +02:00
Zhang Xianyi
646e168d26 Merge pull request #233 from wernsaar/develop: added dgemv_n and some faster gemm_copy routines to BULLDOZER. 2013-06-18 20:02:36 -07:00
wernsaar
93dbbe1fb8 added dgemm_ncopy_8_bulldozer.S 2013-06-18 13:29:23 +02:00
wernsaar
a135f5d9ed added gemm_tcopy_2_bulldozer.S 2013-06-18 11:01:33 +02:00
wernsaar
d0b6299b13 added dgemm_tcopy_8_bulldozer.S 2013-06-17 14:19:09 +02:00
wernsaar
9e58dd509e added gemm_ncopy_2_bulldozer.S 2013-06-17 12:55:12 +02:00
wernsaar
7c8227101b cleanup of dgemv_n_bulldozer.S and optimization of inner loop 2013-06-16 12:50:45 +02:00
wernsaar
f67fa62851 added dgemv_n_bulldozer.S 2013-06-15 16:42:37 +02:00
Zhang Xianyi
cd1d473ba0 Merge pull request #230 from wernsaar/develop: Refs #230. New dgemm and sgemm Kernel for BULLDOZER 2013-06-13 07:29:27 -07:00
Zhang Xianyi
56f160134d Refs #231. Change the default C compiler to clang on Mac OSX. 2013-06-13 22:15:19 +08:00
wernsaar
0ded1fcc1c performance optimizations in sgemm_kernel_16x2_bulldozer.S 2013-06-13 11:35:15 +02:00
wernsaar
a789b588cd added cgemm_kernel_4x2_bulldozer.S 2013-06-12 15:55:27 +02:00
wernsaar
8eaa04acbb added zgemm_kernel_2x2_bulldozer.S 2013-06-11 12:00:49 +02:00
wernsaar
d854b30ae6 Added UNROLL values for 3M to getarch_2nd.c, Makefile.system and Makefile.L3 2013-06-09 17:26:42 +02:00
wernsaar
d65bbec99b added new sgemm kernel for BULLDOZER 2013-06-09 15:57:42 +02:00
wernsaar
e4c39c7c26 changed stack touching 2013-06-08 10:43:08 +02:00
wernsaar
ba800f0883 correct GEMM_THREAD in param.h 2013-06-08 10:03:59 +02:00
wernsaar
25491e42f9 New dgemm kernel for BULLDOZER: dgemm_kernel_8x2_bulldozer.S 2013-06-08 09:40:17 +02:00
Zhang Xianyi
960b0c88a7 Refs #227. Detected LLVM/Clang compiler. 2013-06-06 23:43:40 +08:00
Zhang Xianyi
65ffead0cf Refs #124. Check XSAVE flag on x86 CPU. 2013-06-06 22:50:43 +08:00
Zhang Xianyi
f2fb8c7035 Change LIBSUFFIX from .lib to .a on windows. 2013-06-04 16:05:28 +08:00
Zhang Xianyi
9f59f384d8 Refs #223. Fixed s/dgemv bug on windows. 2013-06-04 16:01:05 +08:00
wangqian
23965f164c Fixed overflow internal buffer bug of (s/d/c/z)gemv on x86_64. 2013-05-29 19:48:31 +08:00
wangqian
6a72840945 Fixed overflow internal buffer bug of (s/d/c/z)gemv on x86. 2013-05-29 13:23:12 +08:00
Zhang Xianyi
947457fb7c Fixed the bug about testing the exist of lapack tar package. 2013-05-24 15:52:35 +08:00
Zhang Xianyi
79120bf9a0 Refs #205. Merge boegel's codes about downloading LAPACK. 2013-05-24 15:29:10 +08:00
Zhang Xianyi
acb11905d5 Fixed #199. Saved USE_THREAD switch for make install. 2013-05-24 15:15:52 +08:00
Zhang Xianyi
109500178c Refs #220. Support Power7 by old Power6 kernels. 2013-05-21 22:59:45 +08:00
Zhang Xianyi
e50a664865 Refs #215. Fixed the compatible between <complex.h> and <complex> in C++. 2013-05-17 16:41:05 +08:00
Zhang Xianyi
357078b93e Refs #216. Revert the default value of GEMM_MULTITHREAD_THRESHOLD to 4. 2013-05-03 09:08:54 +08:00
wernsaar
731220f870 changed DGEMM_DEFAULT_P and DGEMM_DEFAULT_Q to 248 for BULLDOZER 64bit 2013-04-30 10:07:17 +02:00
wernsaar
69aa6c8fb1 bad performance with some data 2013-04-28 11:14:23 +02:00