Zhang Xianyi
|
0c4074e10b
|
Added Travis CI status image.
|
2013-07-05 15:28:41 +08:00 |
Zhang Xianyi
|
cc522aa21d
|
Use quiet make for Travis CI.
|
2013-07-05 14:52:57 +08:00 |
Zhang Xianyi
|
9c78fad721
|
Install gfortran in Travis CI.
|
2013-07-05 11:11:18 +08:00 |
Zhang Xianyi
|
6028232ad1
|
Added travis.yml file.
|
2013-07-04 23:30:53 +08:00 |
Zhang Xianyi
|
feb9a3889a
|
Improved make clean on Mac OS X.
|
2013-07-02 14:37:30 +08:00 |
Zhang Xianyi
|
32dbeb636d
|
Refs #221. Set stack limit to 16MB to prevent a SEGFAULT bug on Mac OS X with DYNAMIC_ARCH=1 & NUM_THREADS=256.
|
2013-07-02 14:17:55 +08:00 |
Zhang Xianyi
|
57944538b6
|
Use ALIGN_5 instead of .align 32 in assembly kernel. Added ALIGN_5 for 32-bit OSX.
|
2013-07-01 16:09:05 +08:00 |
Zhang Xianyi
|
3ce2c62b0b
|
Merge pull request #242 from danluu/readme.haswell
Update README to reflect Haswell support, etc.
|
2013-06-30 09:40:32 -07:00 |
Dan Luu
|
50464997a3
|
Fix miscellaneous typos
|
2013-06-30 11:36:13 -05:00 |
Zhang Xianyi
|
8e7cad1650
|
Fixed #217 openblas_config.h bug on Windows 64.
|
2013-07-01 00:35:14 +08:00 |
Dan Luu
|
590e6aeafc
|
Add Haswell support
|
2013-06-30 11:35:00 -05:00 |
Dan Luu
|
88ef307cef
|
Refs #241. Add Haswell support (using sandybridge optimizations)
|
2013-06-30 22:35:14 +08:00 |
Zhang Xianyi
|
6e8501c8a1
|
Fixed #239 bug in param.h about BARCELONA and BULLDOZER.
|
2013-06-29 10:36:01 +08:00 |
Zhang Xianyi
|
fa916a0fac
|
Fixed #238 bug in lsame on x86.
|
2013-06-28 22:43:41 +08:00 |
Zhang Xianyi
|
fb298b34ae
|
Merge pull request #235 from wernsaar/develop
Added ddot, daxpy, dcopy kernels for AMD bulldozer.
|
2013-06-21 17:59:26 -07:00 |
wernsaar
|
16012767f4
|
added dcopy_bulldozer.S
|
2013-06-21 16:06:51 +02:00 |
wernsaar
|
bcbac31b47
|
added ddot_bulldozer.S
|
2013-06-20 16:15:09 +02:00 |
wernsaar
|
8dc0c72583
|
added daxpy_bulldozer.S
|
2013-06-20 14:07:54 +02:00 |
wernsaar
|
89405a1a0b
|
cleanup of dgemm_ncopy_8_bulldozer.S
|
2013-06-19 19:31:38 +02:00 |
wernsaar
|
4f2b12b8a8
|
added dgemv_t_bulldozer.S
|
2013-06-19 17:32:42 +02:00 |
Zhang Xianyi
|
646e168d26
|
Merge pull request #233 from wernsaar/develop
added dgemv_n and some faster gemm_copy routines to BULLDOZER.
|
2013-06-18 20:02:36 -07:00 |
wernsaar
|
93dbbe1fb8
|
added dgemm_ncopy_8_bulldozer.S
|
2013-06-18 13:29:23 +02:00 |
wernsaar
|
a135f5d9ed
|
added gemm_tcopy_2_bulldozer.S
|
2013-06-18 11:01:33 +02:00 |
wernsaar
|
d0b6299b13
|
added dgemm_tcopy_8_bulldozer.S
|
2013-06-17 14:19:09 +02:00 |
wernsaar
|
9e58dd509e
|
added gemm_ncopy_2_bulldozer.S
|
2013-06-17 12:55:12 +02:00 |
wernsaar
|
7c8227101b
|
cleanup of dgemv_n_bulldozer.S and optimization of inner loop
|
2013-06-16 12:50:45 +02:00 |
wernsaar
|
f67fa62851
|
added dgemv_n_bulldozer.S
|
2013-06-15 16:42:37 +02:00 |
Zhang Xianyi
|
cd1d473ba0
|
Merge pull request #230 from wernsaar/develop
Refs #230. New dgemm and sgemm Kernel for BULLDOZER
|
2013-06-13 07:29:27 -07:00 |
Zhang Xianyi
|
56f160134d
|
Refs #231. Change the default C compiler to clang on Mac OSX.
|
2013-06-13 22:15:19 +08:00 |
wernsaar
|
0ded1fcc1c
|
performance optimizations in sgemm_kernel_16x2_bulldozer.S
|
2013-06-13 11:35:15 +02:00 |
wernsaar
|
a789b588cd
|
added cgemm_kernel_4x2_bulldozer.S
|
2013-06-12 15:55:27 +02:00 |
wernsaar
|
8eaa04acbb
|
added zgemm_kernel_2x2_bulldozer.S
|
2013-06-11 12:00:49 +02:00 |
wernsaar
|
d854b30ae6
|
Added UNROLL values for 3M to getarch_2nd.c, Makefile.system and Makefile.L3
|
2013-06-09 17:26:42 +02:00 |
wernsaar
|
d65bbec99b
|
added new sgemm kernel for BULLDOZER
|
2013-06-09 15:57:42 +02:00 |
wernsaar
|
e4c39c7c26
|
changed stack touching
|
2013-06-08 10:43:08 +02:00 |
wernsaar
|
ba800f0883
|
correct GEMM_THREAD in param.h
|
2013-06-08 10:03:59 +02:00 |
wernsaar
|
25491e42f9
|
New dgemm kernel for BULLDOZER: dgemm_kernel_8x2_bulldozer.S
|
2013-06-08 09:40:17 +02:00 |
Zhang Xianyi
|
960b0c88a7
|
Refs #227. Detected LLVM/Clang compiler.
|
2013-06-06 23:43:40 +08:00 |
Zhang Xianyi
|
65ffead0cf
|
Refs #124. Check XSAVE flag on x86 CPU.
|
2013-06-06 22:50:43 +08:00 |
Zhang Xianyi
|
f2fb8c7035
|
Change LIBSUFFIX from .lib to .a on windows.
|
2013-06-04 16:05:28 +08:00 |
Zhang Xianyi
|
9f59f384d8
|
Refs #223. Fixed s/dgemv bug on windows.
|
2013-06-04 16:01:05 +08:00 |
wangqian
|
23965f164c
|
Fixed internal buffer overflow bug in (s/d/c/z)gemv on x86_64.
|
2013-05-29 19:48:31 +08:00 |
wangqian
|
6a72840945
|
Fixed internal buffer overflow bug in (s/d/c/z)gemv on x86.
|
2013-05-29 13:23:12 +08:00 |
Zhang Xianyi
|
947457fb7c
|
Fixed the bug in testing for the existence of the LAPACK tar package.
|
2013-05-24 15:52:35 +08:00 |
Zhang Xianyi
|
79120bf9a0
|
Refs #205. Merge boegel's code for downloading LAPACK.
|
2013-05-24 15:29:10 +08:00 |
Zhang Xianyi
|
acb11905d5
|
Fixed #199. Saved USE_THREAD switch for make install.
|
2013-05-24 15:15:52 +08:00 |
Zhang Xianyi
|
109500178c
|
Refs #220. Support Power7 using the old Power6 kernels.
|
2013-05-21 22:59:45 +08:00 |
Zhang Xianyi
|
e50a664865
|
Refs #215. Fixed the compatibility between <complex.h> and <complex> in C++.
|
2013-05-17 16:41:05 +08:00 |
Zhang Xianyi
|
357078b93e
|
Refs #216. Revert the default value of GEMM_MULTITHREAD_THRESHOLD to 4.
|
2013-05-03 09:08:54 +08:00 |
wernsaar
|
731220f870
|
changed DGEMM_DEFAULT_P and DGEMM_DEFAULT_Q to 248 for BULLDOZER 64bit
|
2013-04-30 10:07:17 +02:00 |