Commit Graph

555 Commits

Author SHA1 Message Date
Sebastien Fabbro 9f0fb6e662 Respect user's LDFLAGS 2013-07-25 14:08:37 -07:00
Zhang Xianyi 63f14189e3 Refs #259. Fixed missing LAPACK functions in shared library. 2013-07-26 01:32:32 +08:00
Zhang Xianyi c5437149c0 Merge pull request #257 from staticfloat/develop
Add in return value for `interface/trtri.c`
2013-07-22 22:35:29 -07:00
Elliot Saba 6f5b395009 Fix xianyi/OpenBLAS#256 2013-07-22 17:02:06 -07:00
Zhang Xianyi d4f9571818 Refs #255. Didn't use f77 compiler. 2013-07-22 11:34:43 +08:00
Zhang Xianyi 937d838619 Update CONTRIBUTORS.md 2013-07-20 23:32:23 +08:00
Zhang Xianyi 6209c8fc44 Fixed #253. Update doc for v0.2.7 version. 2013-07-20 23:05:12 +08:00
Zhang Xianyi 238ceb4ac0 Merge branch 'loongson3b' into develop 2013-07-20 22:33:35 +08:00
Zhang Xianyi 77b572fa0b Merge branch 'loongson3a' into develop
Conflicts:
	Makefile.system
2013-07-20 22:33:17 +08:00
Zhang Xianyi f69f89b846 Fixed #254. Added the date of changes in contributors file. 2013-07-20 11:35:27 +08:00
Zhang Xianyi c77032b0cc create contributor file. 2013-07-19 08:38:03 +08:00
wangqian 1b3b9e841d Fixed a computational error in zgemm_kernel_4x4_sandy.S file. 2013-07-18 20:23:21 +08:00
Zhang Xianyi b67252c2e4 Ensure the correct stack alignment on Win32. 2013-07-17 15:19:07 +08:00
Zhang Xianyi c69e73b868 Fixed typo in generating shared library on x86_64. 2013-07-16 23:18:18 +08:00
Zhang Xianyi b51e2ba1ee Modified Makefile to avoid redundant echo. 2013-07-16 22:44:27 +08:00
Zhang Xianyi 9c0a834f98 Modified Makefile.install 2013-07-16 17:45:00 +08:00
Zhang Xianyi 2a7503e563 Refs #225. Fixed a bug in GEMM OpenMP threading. 2013-07-15 09:56:19 +08:00
Zhang Xianyi fd0c388681 Refs #191. A workaround for the dtrtri_U single-thread bug.
This function caused the failure of ERKALE serial test.
I replaced it with LAPACK source code.
2013-07-14 22:16:30 +08:00
Zhang Xianyi 61a9582987 Changed makefile for lapack. 2013-07-14 10:41:54 +08:00
Zhang Xianyi b681064c6c Updated travis. 2013-07-12 21:41:12 +08:00
Zhang Xianyi e80e285928 Update build matrix for Travis CI. 2013-07-11 23:49:29 +08:00
Zhang Xianyi 2ed0f6ab60 Fixed the typo. 2013-07-11 23:47:07 +08:00
Zhang Xianyi 5448643557 Fixed generating dll bug in last commit. 2013-07-11 22:24:50 +08:00
Zhang Xianyi 824c3c4df3 Fixed #251. Merge branch 'grisuthedragon-develop' into develop 2013-07-11 21:42:04 +08:00
grisuthedragon c19a488af2 Create openblas_get_parallel to report which
parallelization model OpenBLAS uses.
2013-07-11 21:39:19 +08:00
Zhang Xianyi 32d2ca3035 Refs #214, #221, #246. Fixed the getrf overflow bug on Windows.
I used a smaller threshold since the stack size is 1MB on Windows.
2013-07-11 03:20:02 +08:00
Zhang Xianyi 6df39ad9e7 Refs #248. Support LAPACK and LAPACKE with lsbcc.
For LAPACKE, use LAPACK_COMPLEX_STRUCTURE.
The reason is that lsbcc doesn't define the complex I in complex.h.
2013-07-10 16:02:27 +08:00
Zhang Xianyi 3a96e4cbcb Merge pull request #249 from wernsaar/develop
replaced defined(DOUBLE) by !defined(XDOUBLE)
2013-07-10 01:01:03 -07:00
wernsaar 6f008abcef replaced defined(DOUBLE) by !defined(XDOUBLE) 2013-07-09 18:17:50 +02:00
Zhang Xianyi 3eb5af1955 Refs #247. Included lapack source codes. Avoid downloading tar.gz from netlib.org
Based on 3.4.2 version, apply patch.for_lapack-3.4.2.
2013-07-09 18:13:48 +08:00
Zhang Xianyi fbb75e58b1 Fixed the typo in getarch.c 2013-07-09 16:26:59 +08:00
Zhang Xianyi f54f5bac9e Refs #248. Fixed the LSB compatibility issue for BLAS only.
For example, make CC=lsbcc NO_LAPACK=1.
2013-07-09 15:38:03 +08:00
Zhang Xianyi 5d3312142a Refs #221 #246. Fixed the stack overflow bug in multithreaded BLAS3.
It occurs when NUM_THREADS (MAX_CPU_NUMBER) is very large, e.g. 256.

typedef struct {
  volatile BLASLONG working[MAX_CPU_NUMBER][CACHE_LINE_SIZE * DIVIDE_RATE];
} job_t;

job_t          job[MAX_CPU_NUMBER];

The job array then occupies 8MB.

Thus, we use malloc instead of stack allocation.
2013-07-08 01:07:05 +08:00
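The fix described in 5d3312142a above can be sketched as follows. This is a minimal illustration, not the code from the commit: the values CACHE_LINE_SIZE = 8 and DIVIDE_RATE = 2, the 8-byte BLASLONG typedef, and the helper names alloc_jobs and job_array_bytes are assumptions chosen so that the array comes out at the 8MB the message mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed values for illustration; the commit message only gives 256. */
#define MAX_CPU_NUMBER  256
#define CACHE_LINE_SIZE 8
#define DIVIDE_RATE     2

typedef long long BLASLONG;  /* assumption: 8-byte integer type */

typedef struct {
  volatile BLASLONG working[MAX_CPU_NUMBER][CACHE_LINE_SIZE * DIVIDE_RATE];
} job_t;

/* Hypothetical helper: total size of an array of n job_t entries. */
static size_t job_array_bytes(size_t n) {
  return n * sizeof(job_t);
}

/* Hypothetical helper: allocate the job array on the heap instead of
 * declaring `job_t job[MAX_CPU_NUMBER];` as a local (stack) variable.
 * Returns NULL on failure; the caller must free() the result. */
static job_t *alloc_jobs(size_t n) {
  assert(n > 0);
  return (job_t *)calloc(n, sizeof(job_t));
}
```

With these assumed values each job_t is 256 * 16 * 8 = 32768 bytes, so 256 of them take 8388608 bytes (8MB), which exceeds a 1MB thread stack. A caller would use `job_t *job = alloc_jobs(MAX_CPU_NUMBER);`, check for NULL, and call `free(job);` when the threaded routine finishes.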
Zhang Xianyi 886cbaf4e4 Support AMD Piledriver by bulldozer kernels. 2013-07-06 12:06:43 -03:00
Zhang Xianyi 0c4074e10b Added Travis CI status image. 2013-07-05 15:28:41 +08:00
Zhang Xianyi cc522aa21d Use quiet make for Travis CI. 2013-07-05 14:52:57 +08:00
Zhang Xianyi 9c78fad721 Install gfortran in Travis CI. 2013-07-05 11:11:18 +08:00
Zhang Xianyi 6028232ad1 Added travis.yml file. 2013-07-04 23:30:53 +08:00
Zhang Xianyi feb9a3889a Improved make clean on Mac OS X. 2013-07-02 14:37:30 +08:00
Zhang Xianyi 32dbeb636d Refs #221. Set stack limit to 16MB to prevent a SEGFAULT bug on Mac OS X with DYNAMIC_ARCH=1 & NUM_THREADS=256. 2013-07-02 14:17:55 +08:00
Zhang Xianyi 57944538b6 Use ALIGN_5 instead of .align 32 in assembly kernel. Added ALIGN_5 for 32-bit OSX. 2013-07-01 16:09:05 +08:00
Zhang Xianyi 3ce2c62b0b Merge pull request #242 from danluu/readme.haswell
Update README to reflect Haswell support, etc.
2013-06-30 09:40:32 -07:00
Dan Luu 50464997a3 Fix miscellaneous typos 2013-06-30 11:36:13 -05:00
Zhang Xianyi 8e7cad1650 Fixed #217 openblas_config.h bug on Windows 64. 2013-07-01 00:35:14 +08:00
Dan Luu 590e6aeafc Add Haswell support 2013-06-30 11:35:00 -05:00
Dan Luu 88ef307cef Refs #241. Add Haswell support (using sandybridge optimizations) 2013-06-30 22:35:14 +08:00
Zhang Xianyi 6e8501c8a1 Fixed #239 bug in param.h about BARCELONA and BULLDOZER. 2013-06-29 10:36:01 +08:00
Zhang Xianyi fa916a0fac Fixed #238 bug in lsame on x86. 2013-06-28 22:43:41 +08:00
Zhang Xianyi fb298b34ae Merge pull request #235 from wernsaar/develop
Added ddot, daxpy, dcopy kernels for AMD bulldozer.
2013-06-21 17:59:26 -07:00
wernsaar 16012767f4 added dcopy_bulldozer.S 2013-06-21 16:06:51 +02:00