Zhang Xianyi
75a5dc3975
Added the configuration for compiling with the host loongcc on Loongson3.
2013-04-11 16:10:47 +00:00
Xianyi Zhang
986d542acb
Merge branch 'loongson3a' into loongson3b
2013-04-11 16:07:59 +08:00
Xianyi Zhang
6958c1a1aa
Fixed the SEGFAULT bug with Loongcc and Loongson3.
2013-04-11 15:33:43 +08:00
Zhang Xianyi
a068d54981
Refs #209 . Export the missing cblas_cdotc_sub functions.
2013-04-08 23:21:28 +08:00
Xianyi Zhang
d692ee07f7
Merge branch 'loongson3a' into loongson3b
2013-04-08 14:56:39 +08:00
Xianyi Zhang
1a57717b1a
Added the configuration of Loongcc compiler for Loongson 3 CPU.
2013-04-07 15:42:07 +08:00
Xianyi Zhang
6b01d58712
Disable the optimization of multi-threading gemm on the Loongson3A.
2013-03-30 20:12:43 +00:00
Xianyi Zhang
35b943f17f
Merge branch 'develop' into loongson3a
2013-03-27 14:36:15 +00:00
Zhang Xianyi
e029242870
Merge pull request #206 from wlbksy/patch-1
...
Fix #204 wget in mingw/msys sometimes downloads a file with a trailing name,
2013-03-23 09:57:41 -07:00
wlbksy
7a9b94b519
Fix #204
2013-03-23 14:41:26 +08:00
Kenneth Hoste
66b919d99f
adjusted Makefile to allow providing the required LAPACK source files rather than downloading them
2013-03-22 19:45:11 +01:00
Zhang Xianyi
f4846afbad
Merge pull request #201 from Explorer09/develop
2013-03-18 07:31:30 -07:00
Explorer09
53588bc786
getarch.c: Minor re-ordering of architecture list
2013-03-17 23:09:23 +08:00
Explorer09
b47f13ee4c
getarch.c: Minor re-ordering of architecture list
2013-03-17 23:07:48 +08:00
Explorer09
309f90e563
TargetList.txt: minor re-ordering
2013-03-17 23:03:05 +08:00
Explorer09
773c01f496
Typo correction in README.md
2013-03-17 22:48:24 +08:00
Zhang Xianyi
d831b2ff8b
Override CFLAGS in LAPACK make.in.
2013-03-10 01:01:16 +08:00
Zhang Xianyi
724ae159ce
Fixed the Windows x86_64 ABI bug in s/daxpy kernels.
2013-03-08 22:28:34 +08:00
Zhang Xianyi
2c9a203bd1
Merge pull request #198 from wernsaar/develop
...
new optimization of dgemm kernel for bulldozer: 10% performance increase
2013-03-06 13:39:53 -08:00
wernsaar
f300ce3df5
new optimization of dgemm kernel for bulldozer: 10% performance increase
2013-03-06 17:26:03 +01:00
Zhang Xianyi
e2c7c75715
Merge pull request #197 from wernsaar/develop
...
optimized bulldozer dgemm kernel again
2013-03-06 01:11:08 -08:00
wernsaar
66e64131ed
optimized bulldozer dgemm kernel again
2013-03-05 19:51:37 +01:00
Zhang Xianyi
5900b1462e
Merge pull request #195 from wernsaar/develop
...
Develop dgemm for bulldozer
2013-03-05 05:35:42 -08:00
wernsaar
9405f26f4b
new dgemm_kernel for bulldozer
2013-03-04 17:37:38 +01:00
Zhang Xianyi
54e7b37630
Merge branch 'develop'
2013-03-02 14:42:06 +08:00
Zhang Xianyi
529f1b5006
Refs #194. Export the missing LAPACK s/dlamc3 functions.
2013-03-02 14:41:18 +08:00
Zhang Xianyi
e5ac3007e0
Merge branch 'develop'
2013-03-02 14:24:23 +08:00
Zhang Xianyi
0d0405b434
Updated the doc for 0.2.6 version.
2013-03-02 14:22:27 +08:00
Zhang Xianyi
f1ce74ffdd
Improved the message printed when the OS doesn't support AVX.
2013-03-02 14:15:54 +08:00
Zhang Xianyi
d744c9590a
In OpenMP threading, preallocate the thread buffer instead of allocating the buffer every time. This patch improved the performance slightly.
2013-03-01 14:36:47 +08:00
Zhang Xianyi
3cc6ae793e
Refs #174 . Return the sb pointer when using OpenMP or Windows.
2013-02-26 00:48:21 +08:00
Zhang Xianyi
4c2123c334
Fixed the overflow bug in single-threaded Cholesky factorization.
2013-02-23 13:00:52 +08:00
Zhang Xianyi
5155e3f509
Refs #174 . Fixed the buffer overflow bug in multithreaded hbmv and sbmv.
...
Instead of using the thread 0 buffer, each thread uses its own sb buffer.
This avoids overflowing the thread 0 buffer.
2013-02-13 16:05:58 +08:00
Zhang Xianyi
5c8bf6ae0e
Merge branch 'bulldozer' into develop
2013-02-10 01:19:42 +08:00
Zhang Xianyi
6ae2f868fd
Set the affinity. Only use 1 core of each module on bulldozer.
2013-02-09 18:19:02 +01:00
Zhang Xianyi
a1ead62f28
Disable the warning of sgemm bulldozer kernel.
2013-02-09 17:03:13 +01:00
Zhang Xianyi
0133580148
Used sgemm bulldozer kernel on 64 bit.
2013-02-09 16:29:14 +01:00
Zhang Xianyi
274246651d
Merge branch 'bulldozer' of git://github.com/wernsaar/OpenBLAS into bulldozer
2013-02-09 16:25:07 +01:00
Zhang Xianyi
299b5a44dc
Merge branch 'develop' of github.com:xianyi/OpenBLAS into bulldozer
2013-02-09 16:22:04 +01:00
Zaheer Chothia
a9500d0079
Missing line continuation -- follow-up to last commit (64ad8b9809).
2013-02-01 09:34:12 +01:00
Zaheer Chothia
64ad8b9809
Refs #193 . Don't use C99 complex numbers when building C++ code.
2013-02-01 09:24:44 +01:00
Zaheer Chothia
875d520ccf
Refs #193 . cblas: move #include out of extern "C" block.
...
Standard headers may contain C++ templates which are not permitted inside an
extern "C" block. This might be the case when we include <complex.h>.
2013-01-31 08:48:27 +01:00
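The include placement described in commit 875d520ccf can be sketched as follows. This is a minimal illustration, not the actual contents of cblas.h: the routine `demo_sum` is a hypothetical name invented for the example, and header and definition are combined into one file only so the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <complex.h>   /* standard headers stay outside the extern "C" block */

#ifdef __cplusplus
extern "C" {           /* only our own C declarations get C linkage */
#endif

/* Hypothetical routine (not an OpenBLAS function): sum of n doubles. */
double demo_sum(size_t n, const double *x);

#ifdef __cplusplus
}
#endif

/* Definition kept in the same file so the sketch is self-contained. */
double demo_sum(size_t n, const double *x)
{
    double s = 0.0;
    assert(x != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}
```

Because every `#include` precedes the opening brace, any C++ templates a standard header might declare are never placed inside the `extern "C"` block.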
Zhang Xianyi
d311236dfd
Refs #189 . Fixed the s/cdot bug of invalid NaN reads on x86_64.
2013-01-25 20:56:14 +08:00
Zhang Xianyi
36e0982966
Refs #187 . Use perl to generate cblas_noconst.h instead of sed.
...
Thanks to Dan Povey for the patch. https://github.com/xianyi/OpenBLAS/issues/187
2013-01-22 00:29:54 +08:00
Zhang Xianyi
8cdb795438
Refs #187 . Use binary code for xgetbv, which is compatible with old compilers.
2013-01-22 00:25:08 +08:00
Zaheer Chothia
4db6660de4
Refs #185 . Add missing 'const' to declarations in <cblas.h>. Thanks to Dan Povey!
...
The 'const' modifications were done automatically using these scripts:
https://kaldi.svn.sourceforge.net/svnroot/kaldi/sandbox/dan/tools/for_openblas
2013-01-20 22:52:51 +01:00
Zhang Xianyi
0b08f7479e
Refs #154 . Fixed the gemv_t bug that overflowed the 16MB buffer on x86.
2013-01-20 21:22:12 +08:00
Zaheer Chothia
200e4acf15
cblas: typedef enums for improved compatibility with Intel MKL.
...
Netlib style:
    enum CBLAS_XYZ {X=1, Y=2, Z=3};
Intel MKL style:
    typedef enum {X=1, Y=2, Z=3} CBLAS_XYZ;
With this hybrid style, code written in the latter form won't need any
modifications to be built with OpenBLAS. This change should not affect existing
code, although a warning may be emitted for C code which does the following
(does not occur with C++):
    typedef enum CBLAS_XYZ CBLAS_XYZ;
    warning: redefinition of typedef 'CBLAS_XYZ' [-pedantic]
2013-01-19 22:57:13 +01:00
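The hybrid enum style from commit 200e4acf15 might look like the sketch below. The combined declaration is an assumption inferred from the commit message (it is not copied from cblas.h), and the helper functions `netlib_style` and `mkl_style` are hypothetical names used only to show that both spellings of the type are accepted.

```c
#include <assert.h>

/* Assumed hybrid form: the enum keeps its tag (Netlib style)
 * and is also given a typedef name (Intel MKL style). */
typedef enum CBLAS_XYZ {X = 1, Y = 2, Z = 3} CBLAS_XYZ;

/* Netlib-style caller: refers to the type through the enum tag. */
static int netlib_style(enum CBLAS_XYZ v)
{
    assert(v >= X && v <= Z);
    return (int)v;
}

/* MKL-style caller: refers to the type through the typedef name. */
static int mkl_style(CBLAS_XYZ v)
{
    assert(v >= X && v <= Z);
    return (int)v;
}
```

Both helpers take the same underlying type, so code written against either convention should build unchanged against a header declared this way.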
Zhang Xianyi
99d1978df7
Fixed #180 . Corrected the typos in kernel/x86_64/sgemv_t.S
2013-01-12 12:31:14 +08:00
Zhang Xianyi
08bf6674d5
Refs #177 . Fixed the sgemv_t compile bug on Win64.
2013-01-05 11:36:39 +08:00