Hank Anderson
8d9b196e0d
Moved loop over define combos into a function.
...
This function takes a set of sources and a set of preprocessor
definitions. It will iterate over the sources and build an object
file for each combination of preprocessor definitions for each
source file.
2015-01-30 12:14:44 -06:00
Hank Anderson
a6cf8aafc0
Updated level3/CMakeLists with correct defines using all combos.
2015-01-30 11:21:50 -06:00
Hank Anderson
dbdca7bf0c
Added first pass at driver/level3 Makefile conversion.
...
Added a rather convoluted CMake function to find all combinations
of a given list. This will be useful for the object files that are
compiled multiple times with different combinations of preprocessor
definitions.
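
A minimal sketch of the combination idea, written in C rather than CMake and not taken from the commit itself. It assumes "all combinations" means every subset of the list, including the empty one, and enumerates them with a bitmask. The function names and the define names in the usage note are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Number of subsets of an n-element list (2^n), including the empty one. */
unsigned sketch_combo_count(size_t n)
{
    assert(n < 8 * sizeof(unsigned)); /* keep the shift well defined */
    return 1u << n;
}

/* Formats the subset of defs selected by the set bits of mask as a
 * space-separated list of -D flags in out (truncated if out is too small).
 * Returns how many defines were selected. */
int sketch_format_combo(const char *const *defs, size_t n, unsigned mask,
                        char *out, size_t out_size)
{
    int used = 0;

    assert(out_size > 0);
    assert(n < 8 * sizeof(unsigned));
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (mask & (1u << i)) {
            size_t len = strlen(out);
            /* Prefix a space before every flag except the first. */
            snprintf(out + len, out_size - len, "%s-D%s",
                     used ? " " : "", defs[i]);
            used++;
        }
    }
    return used;
}
```

Looping mask from 0 to sketch_combo_count(n) - 1 visits each combination once. For a hypothetical list {LOWER, TRANS}, mask 0 yields an empty flag string and mask 3 yields "-DLOWER -DTRANS".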
2015-01-29 22:53:11 -06:00
Hank Anderson
dabaecb2bc
Moved getarch parsing code into a function.
2015-01-29 09:30:47 -06:00
Zhang Xianyi
07ff001981
Merge pull request #495 from jeromerobert/develop
...
Fix a segfault in gemv when MAX_STACK_ALLOC is set
2015-01-29 18:23:50 +08:00
Jerome Robert
b17ccb4c5c
Fix a segfault in gemv when MAX_STACK_ALLOC is set
...
* stack_alloc_size is needed after the implementation call,
but it may be overwritten if it is optimized into a register,
because some gemv implementations (e.g. dgemv_n.S) do not
restore all registers (e.g. r10).
* Do the same in ger.c for the same reason, even though the bug
has not been observed there.
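
A hedged illustration of the problem described above. The commit message does not say how the value is protected. One option, assumed here, is to declare it volatile so that every read goes to memory and the value is reloaded after the call instead of being trusted to survive in a register. All names below are hypothetical, and the kernel is a plain C stand-in for an assembly routine.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_STACK_ELEMS 256 /* hypothetical limit, in elements */

typedef void (*sketch_kernel_fn)(double *scratch, size_t n);

/* Stand-in for an assembly kernel that might clobber registers. */
void sketch_fill_kernel(double *scratch, size_t n)
{
    for (size_t i = 0; i < n; i++)
        scratch[i] = 0.0;
}

/* Returns 1 if the scratch buffer lived on the stack, 0 if it came from
 * the heap, and -1 if the heap allocation failed. */
int sketch_run(sketch_kernel_fn kernel, size_t n)
{
    double stack_buf[SKETCH_STACK_ELEMS];
    /* volatile: the flag is stored to memory and re-read on each use. */
    volatile int used_stack = (n <= SKETCH_STACK_ELEMS);
    double *scratch = used_stack ? stack_buf : malloc(n * sizeof(double));

    if (scratch == NULL)
        return -1;

    kernel(scratch, n);

    /* This read happens after the call, so it must not depend on a
     * register the kernel may have left modified. */
    if (!used_stack) {
        free(scratch);
        return 0;
    }
    return 1;
}
```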
2015-01-29 09:55:57 +01:00
Hank Anderson
8c23965da3
prebuild.cmake now reads the output from getarch into CMake vars.
2015-01-28 22:57:44 -06:00
Hank Anderson
61f21b5d03
getarch_2nd now appends its output to config.h/config_kernel.h
2015-01-28 22:20:15 -06:00
Hank Anderson
8ede4a8da4
getarch now compiles and sets config.h defines properly.
...
Still isn't parsed into CMake variables, and getarch_2 needs to
get the same treatment.
2015-01-28 17:18:26 -06:00
Hank Anderson
1c5b6bb4f7
Added CORE define to config.h in prebuild.cmake (temporarily).
2015-01-28 16:33:48 -06:00
Hank Anderson
c5f5c7a076
Updated c_check OS/compiler/bits detection.
2015-01-28 15:47:47 -06:00
Hank Anderson
9a508abdc7
Added first pass at driver/level2 makefile conversion.
2015-01-28 14:52:15 -06:00
Hank Anderson
5eefe18ae4
Added CMakeLists.txt for the first of the BLAS folders.
...
It only does the double precision compile currently.
I realized I didn't finish converting Makefile.system yet, so I made
a note of that.
2015-01-27 16:17:17 -06:00
Hank Anderson
1e8bb0e0e0
Fixed architecture detection when AMD64 in c_check.
2015-01-27 14:03:46 -06:00
Hank Anderson
864b8b31de
Fixed incorrect case in OS_ definition in c_check.
2015-01-27 13:54:29 -06:00
Hank Anderson
d2d15e522f
Started converting lib target to CMake.
...
The main part of this target is looping through the BLAS subfolders
and calling make on them. Need to add CMakeLists.txt for each of these
subfolders.
2015-01-27 12:23:35 -06:00
Hank Anderson
f4d1e7a265
Hardcoded NUM_CORES to get system.cmake working.
2015-01-27 11:37:39 -06:00
Zhang Xianyi
63c6fcfa0a
Merge pull request #490 from eschnett/develop
...
Move #include statements outside extern "C" blocks
2015-01-13 15:43:56 +08:00
Erik Schnetter
29cb47fc06
Move #include statements outside extern "C" blocks
2015-01-12 21:27:52 -05:00
Zhang Xianyi
4e6c4046f7
Fix cortex-a15 detecting bug.
2015-01-12 09:35:16 +00:00
Zhang Xianyi
229ce2ccd1
Add cortex-a9 and cortex-a15 targets.
2015-01-12 08:55:29 +00:00
Zhang Xianyi
ef75be0e51
Merge pull request #487 from kortschak/dromtg-test
...
Add test for drotmg bug fixed by 692b14c
2015-01-07 14:13:11 +08:00
kortschak
5344f335a8
Add test for drotmg bug fixed by 692b14c
...
Test requested in issue xianyi/OpenBLAS#484.
Run tests by applying the following change and then make:
diff --git a/Makefile.rule b/Makefile.rule
index bea1fe1..9852ff3 100644
--- a/Makefile.rule
+++ b/Makefile.rule
@@ -140,7 +140,7 @@ NO_AFFINITY = 1
-# UTEST_CHECK = 1
+UTEST_CHECK = 1
2015-01-07 10:06:55 +10:30
Hank Anderson
0f6bec0a32
cmake.prebuild now compiles getarch.
...
Doesn't actually run it yet.
2015-01-01 21:03:17 -06:00
Hank Anderson
92cdac5f87
Added MSVC functions to cpuid_x86.c to replace gcc-specific ASM.
2015-01-01 21:02:48 -06:00
Hank Anderson
1a41022e3e
Added MSVC defines to cpuid.h and getarch.c.
2015-01-01 21:01:28 -06:00
Zhang Xianyi
5cb5af9333
Add configuration options.
2015-01-02 02:42:32 +08:00
Zhang Xianyi
41aad0407f
Merge pull request #482 from jeromerobert/develop
...
Allow to do gemv and ger buffer allocation on the stack
2015-01-02 02:26:17 +08:00
Hank Anderson
e5c47e44f6
First pass at converting a few makefiles to CMake.
2014-12-30 21:53:00 -06:00
Zhang Xianyi
f8f2e84659
Merge pull request #486 from wernsaar/develop
...
Optimizations for steamroller
2014-12-31 02:36:23 +08:00
Werner Saar
34633fef01
Merge branch 'develop' of github.com:wernsaar/OpenBLAS into develop
2014-12-30 20:16:53 +08:00
Werner Saar
ddf983d643
added optimizations for steamroller
2014-12-30 20:14:45 +08:00
Zhang Xianyi
17b9db20f1
Merge pull request #483 from wernsaar/develop
...
added Steamroller as a cpu target
2014-12-29 12:00:16 +08:00
Werner Saar
0dc559ed30
bugfix in dynamic.c
2014-12-28 17:15:42 +01:00
Werner Saar
9566f5fdb0
added Steamroller as a target processor
2014-12-28 13:45:19 +01:00
Werner Saar
4319769b79
added target processor STEAMROLLER
2014-12-28 20:16:46 +08:00
Jerome Robert
e9d9a8eae3
Allow to do gemv and ger buffer allocation on the stack
...
ger and gemv call blas_memory_alloc/free, which in turn
call blas_lock. blas_lock creates thread contention when matrices
are small and the number of threads is high enough. We avoid
calling blas_memory_alloc by replacing it with stack allocation.
This can be enabled with:
make -DMAX_STACK_ALLOC=2048
The given size (in bytes) must be high enough to avoid thread contention
and small enough to avoid stack overflow.
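
The trade-off above can be sketched roughly as follows. This is not the OpenBLAS code: it assumes a fixed-size stack buffer with malloc as the fallback, standing in for blas_memory_alloc, and every name is hypothetical. The 2048-byte limit mirrors the example value in the message.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_STACK_ALLOC 2048 /* bytes, as in the example above */

/* Reverses x into y through a scratch buffer. Small requests use the
 * stack and never touch the shared allocator; large ones fall back to
 * the heap. Returns 1 for stack, 0 for heap, -1 on allocation failure. */
int sketch_reverse(const double *x, double *y, size_t n)
{
    double stack_buf[SKETCH_MAX_STACK_ALLOC / sizeof(double)];
    size_t bytes = n * sizeof(double);
    int on_stack = (bytes <= sizeof stack_buf);
    double *scratch = on_stack ? stack_buf : malloc(bytes);

    if (scratch == NULL)
        return -1;

    for (size_t i = 0; i < n; i++)
        scratch[i] = x[n - 1 - i];
    memcpy(y, scratch, bytes);

    if (!on_stack)
        free(scratch);
    return on_stack;
}
```

With this limit, a 3-element vector (24 bytes) stays on the stack, while a 1000-element vector (8000 bytes) goes to the heap.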
Fix #478
2014-12-27 14:33:12 +01:00
Zhang Xianyi
cbb3ab80e7
Merge pull request #481 from eschnett/develop
...
Correct ilaver C declaration
2014-12-26 10:09:19 +08:00
Erik Schnetter
cd9868b1b4
Correct ilaver C declaration
2014-12-25 17:41:17 -05:00
Zhang Xianyi
eb738148fe
Merge pull request #479 from wernsaar/develop
...
workaround for sandybridge zgemm kernel
2014-12-23 00:59:41 +08:00
Werner Saar
587e16fba3
Ref #458 : Backport, sandybridge uses nehalem zgemm kernel
2014-12-22 17:01:18 +01:00
Werner Saar
4de7b9ae47
increased NMAX to 128
2014-12-22 14:04:27 +01:00
Werner Saar
887aed634d
modified sources for OS Darwin
2014-12-19 12:40:46 +01:00
Werner Saar
6261342de3
small optimization on dgemm_kernel for N=1
2014-12-18 20:35:51 +01:00
Werner Saar
1e566223ed
added code for the size of n
2014-12-17 15:02:11 +01:00
Werner Saar
113b48ca22
modified makefile for acml6.1
2014-12-17 14:12:21 +01:00
Zhang Xianyi
3e81c99b6b
Fixed installation bug on Mac OSX.
2014-12-13 13:05:06 +08:00
Werner Saar
ec85c4a51d
Increased the Threshold value in sep.in
2014-12-11 14:57:41 +01:00
Werner Saar
97de657d38
added tests to sep.as as workaround for gfortran-4.8.x
2014-12-11 13:53:59 +01:00
Zhang Xianyi
71966eba6c
Merge pull request #475 from xantares/patch-2
...
add OpenBLAS_VERSION to cmake config file
2014-12-09 17:57:43 +08:00