Zhang Xianyi
2fb02626da
Update organization info.
2014-11-25 15:28:58 +08:00
Zhang Xianyi
695e0fa649
#463 fixed a compilation bug on AIX.
2014-11-10 14:39:56 +08:00
wernsaar
a64fe9bcc9
added optimized sgemv_n kernel for sandybridge
2014-09-06 08:41:53 +02:00
wernsaar
2021d0f9d6
experimentally removed expensive function calls
2014-09-05 15:05:53 +02:00
Isaac Dunham
f7eb81a846
Fix link error on Linux/musl.
...
get_nprocs() is a GNU convenience function equivalent to POSIX2008
sysconf(_SC_NPROCESSORS_ONLN); the latter should be available in unistd.h
on any current *nix. (OS X supports this call since 10.5, and FreeBSD
currently supports it. But this commit does not change FreeBSD or OS X
versions.)
2014-08-03 15:06:30 -07:00
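
The substitution described in the commit above can be sketched as follows. This is only an illustration, not the code from the commit: the helper name `online_cpu_count` and the fallback value of 1 are assumptions made for the example.

```c
#include <unistd.h>

/* Hypothetical helper: report how many processors are online using
 * sysconf(_SC_NPROCESSORS_ONLN) instead of the GNU-specific get_nprocs().
 * Falls back to 1 if the query fails or reports a non-positive value. */
int online_cpu_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return (n < 1) ? 1 : (int)n;
}
```

Because `sysconf` comes from `unistd.h`, a helper like this should link against C libraries that do not provide `get_nprocs()`, such as musl in the commit above.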
wernsaar
793175be3a
added experimental support for big numa machines
2014-08-02 13:40:16 +02:00
Zhang Xianyi
c94762bb56
Refs #401 . Added NO_AVX2 flag for old binutils (e.g. RHEL6)
2014-07-16 08:38:25 +08:00
Zhang Xianyi
552119c484
Fixed #407 . Support outputting the CPU corename at runtime.
...
The user can use char * openblas_get_config() or char * openblas_get_corename().
2014-07-08 12:48:08 +08:00
wernsaar
50e99a52ea
added definitions for PILEDRIVER and HASWELL
2014-07-06 12:08:27 +02:00
Zhang Xianyi
7a8949e0ce
Merge branch 'develop' of https://github.com/TimothyGu/OpenBLAS into TimothyGu-develop
...
Conflicts:
driver/others/memory.c
2014-06-28 20:51:31 +08:00
Timothy Gu
6c2ead30f0
Remove all trailing whitespace except lapack-netlib
...
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
Jameson Nash
f41f03ab83
fix #394 . this cleans up some handles after using them, and doesn't disable ALL process privileges upon success
2014-06-27 12:16:57 -04:00
wernsaar
438002204d
Ref #393 : fix for INTERFACE64=0 and ARCH_X86 in divtable
2014-06-21 12:29:23 +02:00
wernsaar
53bfa51ee0
Ref #385 : fixed warnings in dynamic.c
2014-06-12 18:17:08 +02:00
wernsaar
a86d349a51
Ref #380 : enhancements for dynamic_arch
2014-06-12 14:20:03 +02:00
wernsaar
a35a1a9ae7
changed makefiles for lapack development
2014-05-07 11:33:02 +02:00
Olivier Grisel
2c556f093a
Add cast to function pointer to remove warning
2014-02-25 11:08:32 +01:00
Olivier Grisel
3b027d2528
Do not reference pthread_atfork in non-SMP_SERVER mode
2014-02-25 11:08:32 +01:00
Olivier Grisel
49bd98f410
Do not reference pthread_atfork under windows
2014-02-19 19:25:48 +01:00
Olivier Grisel
138a841390
FIX #294 : make OpenBLAS thread-pool resilient to fork via pthread_atfork
2014-02-19 19:01:15 +01:00
Olivier Grisel
046e4013cb
Revert "Refs #294 . Used pthread_atfork to avoid hang after a Unix fork."
...
This reverts commit 3617c22a56.
2014-02-19 18:32:54 +01:00
Zhang Xianyi
3617c22a56
Refs #294 . Used pthread_atfork to avoid hang after a Unix fork.
...
The problem is the mutex we use in blas_server. Thus, we must clear
the mutex before the fork and re-init it in the parent and child processes.
If you use OpenMP, GOMP currently has the same problem. Please try another
OpenMP implementation.
2014-02-18 15:36:04 +08:00
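
A minimal sketch of the general `pthread_atfork` pattern the commit above refers to. The lock `pool_lock` and all function names are hypothetical stand-ins, not OpenBLAS identifiers. Note one deliberate difference: the commit message describes re-initializing the mutex after the fork, whereas this sketch unlocks it in both parent and child.

```c
#include <pthread.h>

/* Hypothetical stand-in for a lock that guards a thread pool. */
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread before fork(): take the lock so that no
 * other thread can hold it at the moment the process is duplicated. */
static void before_fork(void) { pthread_mutex_lock(&pool_lock); }

/* Runs in the parent after fork(): release the lock taken above. */
static void after_fork_parent(void) { pthread_mutex_unlock(&pool_lock); }

/* Runs in the child after fork(): the child's only thread is the one
 * that called fork(), so it releases its copy of the lock. */
static void after_fork_child(void) { pthread_mutex_unlock(&pool_lock); }

/* Register the three handlers; returns 0 on success. */
int install_fork_handlers(void)
{
    return pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

/* Returns 1 if the lock can currently be acquired, 0 otherwise. */
int pool_lock_usable(void)
{
    if (pthread_mutex_trylock(&pool_lock) != 0)
        return 0;
    pthread_mutex_unlock(&pool_lock);
    return 1;
}
```

Calling `install_fork_handlers()` once during library initialization would be enough in this sketch; every later `fork()` then leaves the lock in a usable state on both sides.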
Zhang Xianyi
8c7687b419
Refs #338 . Added OPENBLAS_VERBOSE environment variable at runtime
...
By default, OpenBLAS doesn't output the warning message. You can set
OPENBLAS_VERBOSE (e.g. export OPENBLAS_VERBOSE=1) to enable the warning
message at runtime.
2014-01-24 02:05:59 +08:00
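
The behaviour described in the commit above (silent by default, warnings enabled through an environment variable) could be sketched like this. The function names and the parsing rules are assumptions for illustration; the actual OpenBLAS code may interpret the value differently.

```c
#include <stdlib.h>

/* Hypothetical parser: turn the text of OPENBLAS_VERBOSE into a level.
 * NULL (variable unset), non-numeric text, and negative numbers all
 * map to 0, which stands for "no warnings". */
int parse_verbose_level(const char *value)
{
    if (value == NULL)
        return 0;
    long level = strtol(value, NULL, 10);
    return (level > 0) ? (int)level : 0;
}

/* Hypothetical wrapper: read the variable from the environment. */
int read_verbose_level(void)
{
    return parse_verbose_level(getenv("OPENBLAS_VERBOSE"));
}
```

Keeping the parsing separate from the `getenv` call makes the default case easy to check: an unset variable yields level 0.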
Zhang Xianyi
ab69443bd4
Refs #332 . Added additional Intel Ivy Bridge and Haswell CPU-id.
2014-01-05 23:44:29 +08:00
Zhang Xianyi
b263e096af
Refs #307 . Delete debug printf.
2013-12-31 15:53:13 +08:00
wernsaar
0b6e13b689
Merge remote branch 'origin/develop' into haswell
2013-12-01 13:38:11 +01:00
wernsaar
5c648a8984
Merge remote branch 'origin/develop' into haswell
2013-12-01 11:25:33 +01:00
Zhang Xianyi
5048a80032
Refs #283 . Fixed the incorrect usage of long data type for Windows 64.
2013-11-14 13:46:42 +08:00
Zhang Xianyi
a2942456ef
Refs #307 . Fixed the hang bug when freeing the OpenBLAS DLL on Windows.
2013-11-13 10:00:18 +08:00
Sébastien Villemot
eae4cfa3f6
Avoid failure on qemu guests declaring an Athlon CPU without 3dnow!
...
The present patch verifies that, on machines declaring an Athlon CPU model and
family, the 3dnow and 3dnowext feature flags are indeed present. If they are
not, it falls back on the most generic x86 kernel. This prevents crashes due to
illegal instruction on qemu guests with a weird configuration.
Closes #272
2013-08-28 14:29:42 +02:00
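
The feature-flag check described in the patch above can be illustrated with a small pure function. It assumes the caller has already executed CPUID with EAX = 0x80000001 and passes in the resulting EDX value, where bit 31 reports 3DNow! and bit 30 reports the 3DNow! extensions. The macro and function names are hypothetical, and this is not the code from the patch.

```c
#include <stdint.h>

/* Bits of EDX returned by CPUID leaf 0x80000001 on AMD processors. */
#define FEATURE_3DNOW_EXT (UINT32_C(1) << 30)
#define FEATURE_3DNOW     (UINT32_C(1) << 31)

/* Hypothetical helper: returns 1 only when both the 3DNow! and the
 * 3DNow! extension flags are set in the given EDX value, else 0. */
int has_3dnow_and_ext(uint32_t edx)
{
    uint32_t required = FEATURE_3DNOW | FEATURE_3DNOW_EXT;
    return (edx & required) == required;
}
```

A dispatcher could call such a helper after matching the Athlon family and model, and select the generic kernel when it returns 0.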
Zhang Xianyi
2638370844
Init code base for Intel Haswell.
2013-08-13 00:54:59 +08:00
Zhang Xianyi
673e453b3f
Enable bulldozer kernels.
2013-08-05 16:07:54 +08:00
Zhang Xianyi
143cca4dd5
Merge branch 'develop' into bulldozer
2013-08-05 15:51:53 +08:00
Zhang Xianyi
534c5ec919
Fixed #261 . Use strncmp instead of a comparing trick.
2013-07-29 16:48:35 +08:00
Zhang Xianyi
5b504d6c23
Refs #263 . Rollback bulldozer and piledriver kernels to barcelona kernels.
2013-07-28 17:39:24 +08:00
Zhang Xianyi
72b1edaf1b
Merge branch 'develop' into bulldozer
...
Conflicts:
kernel/x86_64/KERNEL.BULLDOZER
2013-07-28 06:38:25 +02:00
Zhang Xianyi
4471c77905
Fixed #261 . Use strncmp instead of a comparing trick.
2013-07-26 23:43:54 +08:00
Zhang Xianyi
2a7503e563
Refs #225 . Fixed a bug in GEMM OpenMP threading.
2013-07-15 09:56:19 +08:00
grisuthedragon
c19a488af2
create openblas_get_parallel to retrieve information which parallelization model is used by OpenBLAS.
2013-07-11 21:39:19 +08:00
Zhang Xianyi
f54f5bac9e
Refs #248 . Fixed the LSB compatibility issue for BLAS only.
...
For example, make CC=lsbcc NO_LAPACK=1.
2013-07-09 15:38:03 +08:00
Zhang Xianyi
5d3312142a
Refs #221 #246 . Fixed the stack overflow bug in multithreaded BLAS3.
...
When NUM_THREADS (MAX_CPU_NUMBER) is very large, e.g. 256:
typedef struct {
volatile BLASLONG working[MAX_CPU_NUMBER][CACHE_LINE_SIZE * DIVIDE_RATE];
} job_t;
job_t job[MAX_CPU_NUMBER];
The job array is then 8 MB in size.
Thus, we use malloc instead of stack allocation.
2013-07-08 01:07:05 +08:00
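
The arithmetic behind the 8 MB figure in the commit above, and the switch to heap allocation, can be sketched as follows. The values `CACHE_LINE_SIZE = 8`, `DIVIDE_RATE = 2`, and an 8-byte `BLASLONG` are assumptions chosen for this example (the commit message does not state them); `MAX_CPU_NUMBER = 256` comes from the message. The helper names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed values for illustration only. */
typedef int64_t BLASLONG;      /* assumed to be 8 bytes */
#define MAX_CPU_NUMBER  256    /* "e.g. 256" in the commit message */
#define CACHE_LINE_SIZE 8      /* assumption */
#define DIVIDE_RATE     2      /* assumption */

typedef struct {
    volatile BLASLONG working[MAX_CPU_NUMBER][CACHE_LINE_SIZE * DIVIDE_RATE];
} job_t;

/* Size of job_t job[MAX_CPU_NUMBER]:
 * 256 * (256 * 16 * 8 bytes) = 8,388,608 bytes = 8 MB. */
size_t job_array_bytes(void)
{
    return sizeof(job_t) * MAX_CPU_NUMBER;
}

/* Allocate the array on the heap instead of the stack.
 * Returns NULL on failure; the caller releases it with free(). */
job_t *alloc_jobs(void)
{
    return malloc(job_array_bytes());
}
```

Under these assumptions the array matches the default 8 MB stack limit found on many systems, so a stack allocation leaves no room for anything else.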
Zhang Xianyi
886cbaf4e4
Support AMD Piledriver by bulldozer kernels.
2013-07-06 12:06:43 -03:00
Zhang Xianyi
32dbeb636d
Refs #221 . Set stack limit to 16MB to prevent a SEGFAULT bug on Mac OS X with DYNAMIC_ARCH=1 & NUM_THREADS=256.
2013-07-02 14:17:55 +08:00
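
One way a process can request a larger stack at runtime is through `getrlimit` and `setrlimit`. The sketch below only illustrates that mechanism. It is not necessarily how the commit above sets the limit, and the function name and the capping policy are assumptions.

```c
#include <sys/resource.h>

#define WANTED_STACK_BYTES ((rlim_t)16 * 1024 * 1024)  /* 16 MB */

/* Hypothetical helper: raise the soft stack limit to 16 MB, or to the
 * hard limit if that is lower. Returns 0 if the soft limit was already
 * large enough or was raised successfully, -1 on failure. */
int raise_stack_limit(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;

    /* Nothing to do when the current limit is already sufficient. */
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= WANTED_STACK_BYTES)
        return 0;

    /* The soft limit may not exceed the hard limit, so cap the target. */
    rlim_t target = WANTED_STACK_BYTES;
    if (rl.rlim_max != RLIM_INFINITY && rl.rlim_max < target)
        target = rl.rlim_max;

    rl.rlim_cur = target;
    return setrlimit(RLIMIT_STACK, &rl);
}
```

Whether a new limit takes effect for stacks that already exist depends on the operating system, so a call like this would normally be made as early as possible.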
Dan Luu
88ef307cef
Refs #241 . Add Haswell support (using sandybridge optimizations)
2013-06-30 22:35:14 +08:00
Zhang Xianyi
65ffead0cf
Refs #124 . Check XSAVE flag on x86 CPU.
2013-06-06 22:50:43 +08:00
Zhang Xianyi
f1ce74ffdd
Improved the message printed when the OS doesn't support AVX.
2013-03-02 14:15:54 +08:00
Zhang Xianyi
d744c9590a
In OpenMP threading, preallocate the thread buffer instead of allocating the buffer every time. This patch improved the performance slightly.
2013-03-01 14:36:47 +08:00
Zhang Xianyi
3cc6ae793e
Refs #174 . Return sb pointer when using OpenMP or Windows.
2013-02-26 00:48:21 +08:00
Zhang Xianyi
5155e3f509
Refs #174 . Fixed the buffer overflow bug in multithreaded hbmv and sbmv.
...
Instead of using thread 0 buffer, each thread uses its own sb buffer.
Thus, it can avoid overflowing thread 0 buffer.
2013-02-13 16:05:58 +08:00
Zhang Xianyi
5c8bf6ae0e
Merge branch 'bulldozer' into develop
2013-02-10 01:19:42 +08:00