Commit Graph

1318 Commits

Author SHA1 Message Date
Zhang Xianyi 8cdb795438 Refs #187. Use binary code for xgetbv, which is compatible with old compiler. 2013-01-22 00:25:08 +08:00
Zaheer Chothia 4db6660de4 Refs #185. Add missing 'const' to declarations in <cblas.h>. Thanks to Dan Povey!
The 'const' modifications were done automatically using this scripts:
https://kaldi.svn.sourceforge.net/svnroot/kaldi/sandbox/dan/tools/for_openblas
2013-01-20 22:52:51 +01:00
Zhang Xianyi 0b08f7479e Refs #154. Fixed gemv_t bug about overflow 16MB buffer on x86. 2013-01-20 21:22:12 +08:00
Zaheer Chothia 200e4acf15 cblas: typedef enums for improved compatibility with Intel MKL.
Netlib style:
    enum CBLAS_XYZ {X=1, Y=2, Z=3};

Intel MKL style:
    typedef enum {X=1, Y=2, Z=3} CBLAS_XYZ;

With this hybrid style, code written in the latter form won't need any
modifications to be built with OpenBLAS.  This change should not affect existing
code, although a warning may be emitted for C code which does the following
(does not occur with C++):
    typedef enum CBLAS_XYZ CBLAS_XYZ;
    warning: redefinition of typedef 'CBLAS_XYZ' [-pedantic]
2013-01-19 22:57:13 +01:00
Zhang Xianyi 99d1978df7 Fixed #180. the typos in kernel/x86_64/sgemv_t.S 2013-01-12 12:31:14 +08:00
Zhang Xianyi 08bf6674d5 Refs #177. Fixed sgemv_t compiling bug on Win64. 2013-01-05 11:36:39 +08:00
Zhang Xianyi 8b122ff9dc Refs #176. Fixed make.inc overriding RANLIB bug when cross-compiling LAPACK. 2013-01-03 01:47:31 +08:00
Zhang Xianyi 69200884e1 Refs #173. Fixed overflow internal buffer bug of gemv_n on x86 2012-12-25 09:27:49 +08:00
Zhang Xianyi 0d1518add9 Refs #173. Fixed overflow internal buffer bug of sgemv_t on x86 2012-12-25 09:10:17 +08:00
Zhang Xianyi 91ed4e4450 Refs #171. Prevent loading the dirty number from the buffer in sgemv_t x86 kernel. 2012-12-23 23:14:17 +08:00
Zhang Xianyi fd3046b32a Refs #173. Fixed overflow internal buffer bug of gemv_t on x86. 2012-12-23 21:47:22 +08:00
Zhang Xianyi a4ee6f3915 Fixed #172. Support Intel Xeon E7540. 2012-12-18 08:57:46 +08:00
Zhang Xianyi a0363e9b48 Merge branch 'master' into develop 2012-12-18 08:51:30 +08:00
Zhang Xianyi b471d52e61 Merge pull request #170 from juliantaylor/athlon-defaults
set parameters for CORE_ATHLON
2012-12-15 15:50:02 -08:00
Julian Taylor 9fb341a9f8 set parameters for CORE_ATHLON
else dgemm_p is set to zero leading to a segfault in alloc_mmap due to
allocsize being zero
2012-12-15 16:05:33 +01:00
Zhang Xianyi fba6b590f2 Merge branch 'master' into develop 2012-12-15 22:49:37 +08:00
Zhang Xianyi 97f68f7f3a Merge pull request #169 from juliantaylor/sanity-check-cpu
add a sanity check on the detected cpu type
2012-12-15 06:46:48 -08:00
Julian Taylor 1138817dd2 add a sanity check on the detected cpu type
if we have 64 bit pointers we can't have a 32 bit cpu, so fall back to
the 64bit cpu fallback (prescott)
E.g. the cpu detection fails in amd qemu64 emulation (family 6 model 2)
causing it to use the uninitialized gotoblas_ATHLON
2012-12-15 13:29:46 +01:00
Zhang Xianyi 13f8fc0b1a Write FMA4 flag to the configure file. 2012-12-11 10:55:10 +01:00
Zhang Xianyi bdf8d9411e Refs #163. Obtain the build configure on runtime.
openblas_get_config function returns the configure string.
So far, it supports USE64BITINT, NO_CBLAS, NO_LAPACK, NO_LAPACKE,
DYNAMIC_ARCH, NO_AFFINITY.

Example:
 #include <stdio.h>
extern char * openblas_get_config();
void main()
{
  printf("%s\n",openblas_get_config());
  return;
}
2012-12-10 15:52:51 +08:00
Zhang Xianyi bb10cb8442 Refs #165. fall back of DTB_DEFAULT_ENTRIES for some virtual machines. 2012-12-10 11:51:39 +08:00
wernsaar d48cff8cf1 Added optimized sgemm_kernel 2012-12-08 18:50:53 +01:00
Zhang Xianyi f19af5ecc0 Refs #54. Added AMD Bulldozer x86_64 dgemm kernel developed by Werner Saar <wernsaar at googlemail.com>
Based on the dgemm kernel for AMD Barcelona, he used AVX and FMA4 instructions.
Thank Werner Saar!
2012-12-07 01:05:11 +08:00
Zhang Xianyi bfaaa975e6 Added BULLDOZER target. So far it uses barcelona kernels. 2012-12-07 00:53:31 +08:00
Zhang Xianyi b7c0fa6bd2 Init AMD Bulldozer codebase. 2012-12-06 07:29:54 -05:00
Zhang Xianyi 7110d17146 Added -lgomp for generating DLL on Windows. 2012-11-28 12:52:28 +08:00
Zhang Xianyi e01b3d4b54 Merge branch 'develop' 2012-11-27 07:24:53 +08:00
Zhang Xianyi cea1a885b5 Refs #154. Fixed the build bug of dgemv_t on MinW64. 2012-11-27 07:24:04 +08:00
Zhang Xianyi f78eb335d6 Merge branch 'develop' 2012-11-26 17:32:56 +08:00
Zhang Xianyi 2345bdec68 Update the doc for 0.2.5 version. 2012-11-26 17:32:25 +08:00
Zhang Xianyi 5f0117385e Refs #154. Fixed a SEGFAULT bug of dgemv_t when m is very large.
It overflowed the internal buffer. Thus, we split vector x into blocks when m is very large.

Thank @wangqian for this patch.
2012-11-19 22:32:27 +08:00
Zhang Xianyi 6caf1bab73 Fixed #160. Merge branch 'master' of https://github.com/sebastien-villemot/OpenBLAS into develop 2012-11-15 18:21:04 +08:00
Sébastien Villemot 01e3c984ce Fix compilation with TARGET=GENERIC
Patch applied to Debian package
2012-11-14 21:04:05 +01:00
Zhang Xianyi 6751f7b9a7 Fixed #157. Only detect the number of physical CPU cores on Mac OSX. 2012-11-13 15:48:57 +08:00
Zhang Xianyi d5717a97ea Compile lapacke with ILP64 modle when INTERFACE64=1 2012-11-13 00:54:20 +08:00
Zhang Xianyi b45d43d295 Added the patch for lapacke example. 2012-11-13 00:53:26 +08:00
Zhang Xianyi dcfb69c2b5 Merge branch 'master' of https://github.com/alnsn/OpenBLAS into develop 2012-11-12 11:17:04 +08:00
Alexander Nasonov e85549ee11 Fix NetBSD build. 2012-11-10 23:20:44 +00:00
Zhang Xianyi 789f205177 Improved Makefile.rule for cross compiler. 2012-11-09 00:14:20 +08:00
Zhang Xianyi 378acfe826 Added NO_SHARED flag to disable generating the shared library. 2012-11-09 00:14:15 +08:00
Zhang Xianyi 538c764d2b Refs #153. Restore the original CPU affinity when calling openblas_set_num_threads(1).
Please read the issue on github.com for the detail.
2012-11-06 18:21:46 +08:00
Zaheer Chothia 0f26a21624 Alternative approach to avoid command-line length while archiving lapacke -- Thanks Michel! 2012-10-15 22:26:18 +02:00
Zaheer Chothia 5c1efa1149 Fix installation step on Windows (regression from e8306f623a)
Since the DLL now has a fixed name there is no need to install a versioned alias too.
2012-10-15 22:13:37 +02:00
Zaheer Chothia ca4136cf41 Fixed #147: LAPACK symbols were not being exported for version 3.4.2 2012-10-12 23:44:23 +02:00
Zhang Xianyi 3a26470fb7 Merge branch 'develop' 2012-10-09 20:08:28 +08:00
Zhang Xianyi 6c5899dff5 Don't use xgetbv instruction when NO_AVX=1 2012-10-09 14:52:35 +08:00
Zhang Xianyi 2df2878dfc Merge branch 'develop' 2012-10-08 13:38:03 +08:00
Zhang Xianyi 0b719945c5 Updated the doc for 0.2.4 version. 2012-10-08 13:37:44 +08:00
Zhang Xianyi b1a54a0107 Fixed #141. make f77blas.h compatible with compilers which lack C99 complex number.
Apply the patch from Tony @tonyhill. Thank you.
2012-10-08 12:48:20 +08:00
Zhang Xianyi 08c177ca36 Refs #145. Update LAPACK to 3.4.2 version. 2012-09-29 23:14:39 +08:00