Commit Graph

1874 Commits

Author SHA1 Message Date
Hank Anderson 8d9b196e0d Moved loop over define combos into a function.
This function takes a set of sources and a set of preprocessor
definitions. It will iterate over the sources and build an object
file for each combination of preprocessor definitions for each
source file.
2015-01-30 12:14:44 -06:00
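The loop described in this commit message can be pictured as a nested iteration over sources and definition combinations. The Python sketch below is illustrative only and is not the CMake function from the commit; the function name `object_builds`, the object-file naming scheme, and the `-D` flag format are all assumptions made for the example.

```python
from itertools import product


def object_builds(sources, define_combos):
    """Pair every source file with every combination of definitions.

    Returns a list of (source, object_name, flags) tuples.
    The naming scheme (stem + "_" + each define) is hypothetical.
    """
    builds = []
    for source, combo in product(sources, define_combos):
        # Strip the extension, if any, to get the object-file stem.
        stem = source.rsplit(".", 1)[0]
        suffix = "".join("_" + define for define in combo)
        flags = ["-D" + define for define in combo]
        builds.append((source, stem + suffix + ".o", flags))
    return builds


if __name__ == "__main__":
    # Hypothetical source and define names, used only for illustration.
    for build in object_builds(["gemm.c"], [[], ["TRANSA"]]):
        print(build)
```

Each source yields one object per combination, so two sources with four combinations would produce eight entries.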
Hank Anderson a6cf8aafc0 Updated level3/CMakeLists with correct defines using all combos. 2015-01-30 11:21:50 -06:00
Hank Anderson dbdca7bf0c Added first pass at driver/level3 Makefile conversion.
Added a rather convoluted CMake function to find all combinations
of a given list. This will be useful for the object files that are
compiled multiple times with different combinations of preprocessor
definitions.
2015-01-29 22:53:11 -06:00
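As a rough illustration of what "all combinations of a given list" could mean here, the Python sketch below enumerates every subset of a list of preprocessor definitions. It is not the CMake function added by this commit; the name `all_combinations` and the example define names are hypothetical, and it assumes the empty combination is included.

```python
from itertools import combinations


def all_combinations(items):
    """Return every subset of items, including the empty one, shortest first."""
    result = []
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            result.append(list(combo))
    return result


if __name__ == "__main__":
    # Hypothetical define names, used only for illustration.
    for combo in all_combinations(["TRANSA", "TRANSB"]):
        print(combo)
```

A list of n definitions produces 2^n combinations under this assumption, which is why the number of object files grows quickly as defines are added.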
Hank Anderson dabaecb2bc Moved getarch parsing code into a function. 2015-01-29 09:30:47 -06:00
Zhang Xianyi 07ff001981 Merge pull request #495 from jeromerobert/develop
Fix a segfault in gemv when MAX_STACK_ALLOC is set
2015-01-29 18:23:50 +08:00
Jerome Robert b17ccb4c5c Fix a segfault in gemv when MAX_STACK_ALLOC is set
* stack_alloc_size is needed after the implementation call
but it may be overwritten if it's optimized to a register,
because some gemv implementations (e.g. dgemv_n.S) do not
restore all registers (e.g. r10).
* do the same in ger.c for the same reasons even if the bug
has not been observed.
2015-01-29 09:55:57 +01:00
Hank Anderson 8c23965da3 prebuild.cmake now reads the output from getarch into CMake vars. 2015-01-28 22:57:44 -06:00
Hank Anderson 61f21b5d03 getarch_2nd now appends its output to config.h/config_kernel.h 2015-01-28 22:20:15 -06:00
Hank Anderson 8ede4a8da4 getarch now compiles and sets config.h defines properly.
Still isn't parsed into CMake variables, and getarch_2 needs to
get the same treatment.
2015-01-28 17:18:26 -06:00
Hank Anderson 1c5b6bb4f7 Added CORE define to config.h in prebuild.cmake (temporarily). 2015-01-28 16:33:48 -06:00
Hank Anderson c5f5c7a076 Updated c_check OS/compiler/bits detection. 2015-01-28 15:47:47 -06:00
Hank Anderson 9a508abdc7 Added first pass at driver/level2 makefile conversion. 2015-01-28 14:52:15 -06:00
Hank Anderson 5eefe18ae4 Added CMakeLists.txt for the first of the BLAS folders.
It only does the double precision compile currently.

I realized I didn't finish converting Makefile.system yet, so I made
a note of that.
2015-01-27 16:17:17 -06:00
Hank Anderson 1e8bb0e0e0 Fixed architecture detection when AMD64 in c_check. 2015-01-27 14:03:46 -06:00
Hank Anderson 864b8b31de Fixed incorrect case in OS_ definition in c_check. 2015-01-27 13:54:29 -06:00
Hank Anderson d2d15e522f Started converting lib target to CMake.
The main part of this target is looping through the BLAS subfolders
and calling make on them. Need to add CMakeLists.txt for each of these
subfolders.
2015-01-27 12:23:35 -06:00
Hank Anderson f4d1e7a265 Hardcoded NUM_CORES to get system.cmake working. 2015-01-27 11:37:39 -06:00
Zhang Xianyi 63c6fcfa0a Merge pull request #490 from eschnett/develop
Move #include statements outside extern "C" blocks
2015-01-13 15:43:56 +08:00
Erik Schnetter 29cb47fc06 Move #include statements outside extern "C" blocks 2015-01-12 21:27:52 -05:00
Zhang Xianyi 4e6c4046f7 Fix cortex-a15 detecting bug. 2015-01-12 09:35:16 +00:00
Zhang Xianyi 229ce2ccd1 Add cortex-a9 and cortex-a15 targets. 2015-01-12 08:55:29 +00:00
Zhang Xianyi ef75be0e51 Merge pull request #487 from kortschak/dromtg-test
Add test for drotmg bug fixed by 692b14c
2015-01-07 14:13:11 +08:00
kortschak 5344f335a8 Add test for drotmg bug fixed by 692b14c
Test requested in issue xianyi/OpenBLAS#484.

Run tests by applying the following change and then make:

	diff --git a/Makefile.rule b/Makefile.rule
	index bea1fe1..9852ff3 100644
	--- a/Makefile.rule
	+++ b/Makefile.rule
	@@ -140,7 +140,7 @@ NO_AFFINITY = 1

	-# UTEST_CHECK = 1
	+UTEST_CHECK = 1
2015-01-07 10:06:55 +10:30
Hank Anderson 0f6bec0a32 cmake.prebuild now compiles getarch.
Doesn't actually run it yet.
2015-01-01 21:03:17 -06:00
Hank Anderson 92cdac5f87 Added MSVC functions to cpuid_x86.c to replace gcc-specific ASM. 2015-01-01 21:02:48 -06:00
Hank Anderson 1a41022e3e Added MSVC defines to cpuid.h and getarch.c. 2015-01-01 21:01:28 -06:00
Zhang Xianyi 5cb5af9333 Add configuration options. 2015-01-02 02:42:32 +08:00
Zhang Xianyi 41aad0407f Merge pull request #482 from jeromerobert/develop
Allow to do gemv and ger buffer allocation on the stack
2015-01-02 02:26:17 +08:00
Hank Anderson e5c47e44f6 First pass at converting a few makefiles to CMake. 2014-12-30 21:53:00 -06:00
Zhang Xianyi f8f2e84659 Merge pull request #486 from wernsaar/develop
Optimizations for steamroller
2014-12-31 02:36:23 +08:00
Werner Saar 34633fef01 Merge branch 'develop' of github.com:wernsaar/OpenBLAS into develop 2014-12-30 20:16:53 +08:00
Werner Saar ddf983d643 added optimizations for steamroller 2014-12-30 20:14:45 +08:00
Zhang Xianyi 17b9db20f1 Merge pull request #483 from wernsaar/develop
added Steamroller as a cpu target
2014-12-29 12:00:16 +08:00
Werner Saar 0dc559ed30 bugfix in dynamic.c 2014-12-28 17:15:42 +01:00
Werner Saar 9566f5fdb0 added Steamroller as a target processor 2014-12-28 13:45:19 +01:00
Werner Saar 4319769b79 added target processor STEAMROLLER 2014-12-28 20:16:46 +08:00
Jerome Robert e9d9a8eae3 Allow to do gemv and ger buffer allocation on the stack
ger and gemv call blas_memory_alloc/free, which in turn
call blas_lock. blas_lock creates thread contention when matrices
are small and the number of threads is high enough. We avoid
calling blas_memory_alloc by replacing it with stack allocation.
This can be enabled with:
make -DMAX_STACK_ALLOC=2048
The given size (in bytes) must be high enough to avoid thread contention
and small enough to avoid stack overflow.

Fix #478
2014-12-27 14:33:12 +01:00
Zhang Xianyi cbb3ab80e7 Merge pull request #481 from eschnett/develop
Correct ilaver C declaration
2014-12-26 10:09:19 +08:00
Erik Schnetter cd9868b1b4 Correct ilaver C declaration 2014-12-25 17:41:17 -05:00
Zhang Xianyi eb738148fe Merge pull request #479 from wernsaar/develop
workaround for sandybridge zgemm kernel
2014-12-23 00:59:41 +08:00
Werner Saar 587e16fba3 Ref #458: Backport, sandybridge uses nehalem zgemm kernel 2014-12-22 17:01:18 +01:00
Werner Saar 4de7b9ae47 increased NMAX to 128 2014-12-22 14:04:27 +01:00
Werner Saar 887aed634d modified sources for OS Darwin 2014-12-19 12:40:46 +01:00
Werner Saar 6261342de3 small optimization on dgemm_kernel for N=1 2014-12-18 20:35:51 +01:00
Werner Saar 1e566223ed added code for the size of n 2014-12-17 15:02:11 +01:00
Werner Saar 113b48ca22 modified makefile for acml6.1 2014-12-17 14:12:21 +01:00
Zhang Xianyi 3e81c99b6b Fixed installation bug on Mac OSX. 2014-12-13 13:05:06 +08:00
Werner Saar ec85c4a51d Increased the Threshold value in sep.in 2014-12-11 14:57:41 +01:00
Werner Saar 97de657d38 added tests to sep.as as workaround for gfortran-4.8.x 2014-12-11 13:53:59 +01:00
Zhang Xianyi 71966eba6c Merge pull request #475 from xantares/patch-2
add OpenBLAS_VERSION to cmake config file
2014-12-09 17:57:43 +08:00