Zhang Xianyi
b5c2ac4fd6
Fixed #264 the memory leak bug in dtrtri_U.
2013-07-29 23:21:10 +08:00
Zhang Xianyi
749f45ffc8
Fixed the FMA3 detection bug.
2013-07-29 16:48:53 +08:00
Zhang Xianyi
534c5ec919
Fixed #261 . Use strncmp instead of a comparing trick.
2013-07-29 16:48:35 +08:00
Zhang Xianyi
bd2da90e13
Fixed typo in getarch_2nd.c.
2013-07-29 15:42:00 +08:00
wernsaar
84bd0aabaa
added dtrsm_kernel_LT_8x2_bulldozer.S
2013-07-28 16:47:58 +02:00
Zhang Xianyi
5b504d6c23
Refs #263 . Rollback bulldozer and piledriver kernels to barcelona kernels.
2013-07-28 17:39:24 +08:00
Zhang Xianyi
72b1edaf1b
Merge branch 'develop' into bulldozer
Conflicts:
kernel/x86_64/KERNEL.BULLDOZER
2013-07-28 06:38:25 +02:00
Zhang Xianyi
a2930664f4
Refs #262 . Added executable stack markings.
2013-07-28 00:09:40 +08:00
Zhang Xianyi
6e0db36373
Merge branch 'sfabbro-ldflags' into develop
2013-07-27 23:03:07 +08:00
Zhang Xianyi
1e1250b703
Fixed #260 . Fixed generating 32-bit shared library on previous commit.
2013-07-27 23:01:36 +08:00
Zhang Xianyi
23186d9f21
Fixed the FMA3 detection bug.
2013-07-27 22:37:57 +08:00
Zhang Xianyi
e6ebbfd314
Merge branch 'ldflags' of https://github.com/sfabbro/OpenBLAS into sfabbro-ldflags
2013-07-27 22:19:54 +08:00
Zhang Xianyi
4471c77905
Fixed #261 . Use strncmp instead of a comparing trick.
2013-07-26 23:43:54 +08:00
Sebastien Fabbro
9f0fb6e662
Respect user's LDFLAGS
2013-07-25 14:08:37 -07:00
Zhang Xianyi
f26b7a08aa
Merge branch 'develop'
2013-07-26 01:34:45 +08:00
Zhang Xianyi
63f14189e3
Refs #259 . Fixed missing LAPACK functions in shared library.
2013-07-26 01:32:32 +08:00
Zhang Xianyi
e39384432b
Merge branch 'develop'
2013-07-23 13:40:08 +08:00
Zhang Xianyi
c5437149c0
Merge pull request #257 from staticfloat/develop
Add in return value for `interface/trtri.c`
2013-07-22 22:35:29 -07:00
Elliot Saba
6f5b395009
Fix xianyi/OpenBLAS#256
2013-07-22 17:02:06 -07:00
Zhang Xianyi
d4f9571818
Refs #255 . Didn't use f77 compiler.
2013-07-22 11:34:43 +08:00
Zhang Xianyi
937d838619
Update CONTRIBUTORS.md
2013-07-20 23:32:23 +08:00
Zhang Xianyi
a8f9b6a665
Merge branch 'develop'
2013-07-20 23:05:36 +08:00
Zhang Xianyi
6209c8fc44
Fixed #253 . Update doc for v0.2.7 version.
2013-07-20 23:05:12 +08:00
Zhang Xianyi
238ceb4ac0
Merge branch 'loongson3b' into develop
2013-07-20 22:33:35 +08:00
Zhang Xianyi
77b572fa0b
Merge branch 'loongson3a' into develop
Conflicts:
Makefile.system
2013-07-20 22:33:17 +08:00
Zhang Xianyi
f69f89b846
Fixed #254 . Added the date of changes in contributors file.
2013-07-20 11:35:27 +08:00
Zhang Xianyi
c77032b0cc
create contributor file.
2013-07-19 08:38:03 +08:00
wangqian
1b3b9e841d
Fixed a computational error in zgemm_kernel_4x4_sandy.S file.
2013-07-18 20:23:21 +08:00
Zhang Xianyi
b67252c2e4
Ensure the correct stack alignment on Win32.
2013-07-17 15:19:07 +08:00
Zhang Xianyi
c69e73b868
Fixed typo in generating shared library on x86_64.
2013-07-16 23:18:18 +08:00
Zhang Xianyi
b51e2ba1ee
Modified Makefile to avoid redundant echo.
2013-07-16 22:44:27 +08:00
Zhang Xianyi
9c0a834f98
Modified Makefile.install
2013-07-16 17:45:00 +08:00
Zhang Xianyi
2a7503e563
Refs #225 . Fixed a bug in GEMM OpenMP threading.
2013-07-15 09:56:19 +08:00
Zhang Xianyi
fd0c388681
Refs #191 . A workaround for the dtrtri_U single-thread bug.
This function caused the failure of the ERKALE serial test.
I replaced it with the LAPACK source code.
2013-07-14 22:16:30 +08:00
Zhang Xianyi
61a9582987
Changed makefile for lapack.
2013-07-14 10:41:54 +08:00
Zhang Xianyi
b681064c6c
Updated travis.
2013-07-12 21:41:12 +08:00
Zhang Xianyi
e80e285928
Update build matrix for Travis CI.
2013-07-11 23:49:29 +08:00
Zhang Xianyi
2ed0f6ab60
Fixed the typo.
2013-07-11 23:47:07 +08:00
Zhang Xianyi
5448643557
Fixed the DLL generation bug from the last commit.
2013-07-11 22:24:50 +08:00
Zhang Xianyi
824c3c4df3
Fixed #251 . Merge branch 'grisuthedragon-develop' into develop
2013-07-11 21:42:04 +08:00
grisuthedragon
c19a488af2
Create openblas_get_parallel to retrieve information about which parallelization model is used by OpenBLAS.
2013-07-11 21:39:19 +08:00
Zhang Xianyi
32d2ca3035
Refs #214 , #221 , #246 . Fixed the getrf overflow bug on Windows.
I used a smaller threshold since the stack size is 1MB on Windows.
2013-07-11 03:20:02 +08:00
Zhang Xianyi
6df39ad9e7
Refs #248 . Support LAPACK and LAPACKE with lsbcc.
For LAPACKE, use LAPACK_COMPLEX_STRUCTURE.
The reason is that lsbcc didn't define complex I in complex.h.
2013-07-10 16:02:27 +08:00
Zhang Xianyi
3a96e4cbcb
Merge pull request #249 from wernsaar/develop
replaced defined(DOUBLE) by !defined(XDOUBLE)
2013-07-10 01:01:03 -07:00
wernsaar
6f008abcef
replaced defined(DOUBLE) by !defined(XDOUBLE)
2013-07-09 18:17:50 +02:00
Zhang Xianyi
3eb5af1955
Refs #247 . Included the LAPACK source code to avoid downloading the tar.gz from netlib.org.
Based on version 3.4.2, with patch.for_lapack-3.4.2 applied.
2013-07-09 18:13:48 +08:00
Zhang Xianyi
fbb75e58b1
Fixed the typo in getarch.c
2013-07-09 16:26:59 +08:00
Zhang Xianyi
f54f5bac9e
Refs #248 . Fixed the LSB compatibility issue for BLAS only.
For example, make CC=lsbcc NO_LAPACK=1.
2013-07-09 15:38:03 +08:00
Zhang Xianyi
5d3312142a
Refs #221 #246 . Fixed the stack overflow bug in multithreaded BLAS3.
When NUM_THREADS (MAX_CPU_NUMBER) is very large, e.g. 256, the following stack allocation is too big:
typedef struct {
volatile BLASLONG working[MAX_CPU_NUMBER][CACHE_LINE_SIZE * DIVIDE_RATE];
} job_t;
job_t job[MAX_CPU_NUMBER];
The job array takes 8MB.
Thus, we use malloc instead of stack allocation.
2013-07-08 01:07:05 +08:00
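The 8MB figure in the commit above can be checked with a small sketch. This is not the code from the commit: the values for CACHE_LINE_SIZE and DIVIDE_RATE, the 8-byte BLASLONG, and the helpers job_array_bytes, alloc_jobs and free_jobs are assumptions, chosen here so that MAX_CPU_NUMBER = 256 reproduces the size quoted in the message. It shows the arithmetic and one way to move the array from the stack to the heap.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed values for illustration; a real build takes these from its configuration. */
#define MAX_CPU_NUMBER  256
#define CACHE_LINE_SIZE 8
#define DIVIDE_RATE     2

/* Assumption: BLASLONG is an 8-byte integer, as on a 64-bit build. */
typedef long long BLASLONG;

typedef struct {
    volatile BLASLONG working[MAX_CPU_NUMBER][CACHE_LINE_SIZE * DIVIDE_RATE];
} job_t;

/* Hypothetical helper: bytes needed for one job_t per possible thread.
   256 * (256 * 16 * 8 bytes) = 8 MiB, far above a 1MB thread stack. */
static size_t job_array_bytes(void) {
    return sizeof(job_t) * (size_t)MAX_CPU_NUMBER;
}

/* Hypothetical helper: allocate the zeroed job array on the heap instead of
   declaring `job_t job[MAX_CPU_NUMBER];` as a local. Returns NULL on failure. */
static job_t *alloc_jobs(void) {
    return (job_t *)calloc(MAX_CPU_NUMBER, sizeof(job_t));
}

/* Hypothetical helper: release the array obtained from alloc_jobs(). */
static void free_jobs(job_t *job) {
    free(job);
}
```

A caller would replace the local array with `job_t *job = alloc_jobs();`, check the result for NULL, and call `free_jobs(job)` once all worker threads have finished.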
Zhang Xianyi
886cbaf4e4
Support AMD Piledriver by bulldozer kernels.
2013-07-06 12:06:43 -03:00