640cccc2b1 
								
							 
						 
						
							
							
								
								Refs #697. Fixed gemv bug for Windows.
							
							
							
							
							Thanks to matzeri for the patch.
							
						 
						
							2015-11-30 15:19:45 -06:00  
				
					
						
							
							
								 
						
							
								55a0b27c01 
								
							 
						 
						
							
							
								
								Minor C code fixes in interface/  
							
							
							
						 
						
							2015-11-09 14:15:49 +05:30  
				
					
						
							
							
								 
						
							
								2feef49fa8 
								
							 
						 
						
							
							
								
								Merge branch 'develop' into cmake  
							
							
							
							
							Conflicts:
	driver/others/memory.c 
							
						 
						
							2015-10-26 14:54:34 -05:00  
				
					
						
							
							
								 
						
							
								5a291606ad 
								
							 
						 
						
							
							
								
								Refs #671. The return value of i?max cannot be larger than N.
							
							
							
						 
						
							2015-10-24 01:16:34 +08:00  
				
					
						
							
							
								 
						
							
								8fade093aa 
								
							 
						 
						
							
							
								
								Fixed cmake bug on Visual Studio.  
							
							
							
						 
						
							2015-10-20 14:37:22 -05:00  
				
					
						
							
							
								 
						
							
								94b125255f 
								
							 
						 
						
							
							
								
								Merge branch 'develop' into cmake  
							
							
							
							
							Conflicts:
	driver/others/memory.c 
							
						 
						
							2015-10-13 04:46:08 +08:00  
				
					
						
							
							
								 
						
							
								baec8f5cac 
								
							 
						 
						
							
							
								
								Refs #638. Fixed compile error with clang on Mac OS X.
							
							
							
						 
						
							2015-09-10 10:32:07 -05:00  
				
					
						
							
							
								 
						
							
								711ca33bc6 
								
							 
						 
						
							
							
								
								Improved Ximatcopy when lda==ldb.  
							
							
							
							
							The Ximatcopy functions create a copy of the input matrix
although they appear to work in place. The new routines
XIMATCOPY_K_YY perform the operations in place if the leading
dimension does not change.
							
						 
						
							2015-09-07 14:36:16 +02:00  
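
The in-place idea described in the Ximatcopy commit above can be illustrated with a minimal sketch. It covers only the square, transposed case with an unchanged leading dimension. The function name `imatcopy_trans_inplace` is hypothetical and is not one of the XIMATCOPY_K_YY kernels; column-major storage is assumed.

```c
#include <stddef.h>

/* Hypothetical helper (not an OpenBLAS kernel):
 * A := alpha * A^T for an n x n column-major matrix with leading
 * dimension ld (ld >= n), done in place with no temporary copy. */
void imatcopy_trans_inplace(size_t n, double alpha, double *a, size_t ld)
{
    for (size_t j = 0; j < n; j++) {
        /* Diagonal entries stay where they are and are only scaled. */
        a[j + j * ld] *= alpha;

        /* Swap each pair (i, j) / (j, i) below the diagonal, scaling both. */
        for (size_t i = j + 1; i < n; i++) {
            double lower = a[i + j * ld];
            double upper = a[j + i * ld];
            a[i + j * ld] = alpha * upper;
            a[j + i * ld] = alpha * lower;
        }
    }
}
```

Because the leading dimension is the same before and after, every element has a unique partner to swap with, which is what makes the temporary copy unnecessary in this case.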
				
					
						
							
							
								 
						
							
								f874465bb8 
								
							 
						 
						
							
							
								
								Use cmake to build OpenBLAS GENERIC Target on MSVC x86 64-bit.  
							
							
							
							
							Disable CBLAS and LAPACK. 
							
						 
						
							2015-08-10 14:10:44 -05:00  
				
					
						
							
							
								 
						
							
								dcd5ba4443 
								
							 
						 
						
							
							
								
								Merge branch 'cmake' of https://github.com/hpanderson/OpenBLAS into hpanderson_cmake
							
							
							
						 
						
							2015-07-22 04:06:39 +08:00  
				
					
						
							
							
								 
						
							
								f8f2e261fe 
								
							 
						 
						
							
							
								
								Use only 1 thread if m or n < 2*GEMM_MULTITHREAD_THRESHOLD
							
							
							
						 
						
							2015-05-06 10:41:53 +02:00  
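
The threshold rule in the commit above can be sketched as a small decision function. The name `choose_threads` and the fallback value of 4 for `GEMM_MULTITHREAD_THRESHOLD` are assumptions made for illustration; only the comparison itself comes from the commit message.

```c
/* Assumed fallback value, used only so the sketch compiles on its own. */
#ifndef GEMM_MULTITHREAD_THRESHOLD
#define GEMM_MULTITHREAD_THRESHOLD 4
#endif

/* Hypothetical helper: returns 1 when either dimension is too small to
 * be worth splitting across threads, otherwise the available count. */
int choose_threads(long m, long n, int available)
{
    if (m < 2L * GEMM_MULTITHREAD_THRESHOLD ||
        n < 2L * GEMM_MULTITHREAD_THRESHOLD) {
        return 1;
    }
    return available;
}
```

With the assumed threshold of 4, a 7 x 100 problem stays single-threaded while an 8 x 8 problem may use all available threads.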
				
					
						
							
							
								 
						
							
								ab567d8443 
								
							 
						 
						
							
							
								
								gemv: Ensure stack buffer is large enough to handle memory alignment  
							
							
							
							
							Ref #478  
							
						 
						
							2015-04-24 10:12:49 +02:00  
				
					
						
							
							
								 
						
							
								847e19c04e 
								
							 
						 
						
							
							
								
								Refs #478, #482. Enable stack alloc for s/dgemv_t (revert 9798491).
							
							
							
						 
						
							2015-04-20 23:22:40 -05:00  
				
					
						
							
							
								 
						
							
								fd9fd42936 
								
							 
						 
						
							
							
								
								Refs #478, #482. Fixed bug in the previous commit.
							
							
							
						 
						
							2015-04-13 23:22:27 -05:00  
				
					
						
							
							
								 
						
							
								9798481979 
								
							 
						 
						
							
							
								
								Refs #478, #482. Fix segfault bug for gemv_t with MAX_STACK_ALLOC flag.
							
							
							
							
							For gemv_t, directly use malloc to create the buffer. 
							
						 
						
							2015-04-13 19:45:27 -05:00  
				
					
						
							
							
								 
						
							
								cdefdb21cd 
								
							 
						 
						
							
							
								
								Refs #492. Fixed c/zsyr bug with negative incx.
							
							
							
						 
						
							2015-02-26 06:37:03 +08:00  
				
					
						
							
							
								 
						
							
								0d8e227ea7 
								
							 
						 
						
							
							
								
								Changed strategy for setting preprocessor definitions.  
							
							
							
							
							Instead of generating separate object files for each permutation of
defines for a source file, GenerateNamedObjects now writes an entirely
new source file and inserts the defines as C #define statements.
This solves a problem I ran into with ar.exe, which refused to
link objects that had the same filename despite having different paths.
							
						 
						
							2015-02-24 12:26:33 -06:00  
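
A rough sketch of what such a generated source file could contain, written as a C helper that builds the text. The function `write_wrapper` is hypothetical, and the assumption that the generated file pulls in the original source with `#include` is mine, not something the commit message states.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: fills `out` with one "#define ..." line per entry
 * in `defines`, followed by an #include of the original source (assumed).
 * Returns the number of characters written, or -1 if `out` is too small. */
int write_wrapper(char *out, size_t cap, const char *const *defines,
                  size_t ndefines, const char *source_path)
{
    size_t used = 0;

    for (size_t i = 0; i < ndefines; i++) {
        int n = snprintf(out + used, cap - used, "#define %s\n", defines[i]);
        if (n < 0 || (size_t)n >= cap - used) {
            return -1; /* formatting error or output truncated */
        }
        used += (size_t)n;
    }

    int n = snprintf(out + used, cap - used, "#include \"%s\"\n", source_path);
    if (n < 0 || (size_t)n >= cap - used) {
        return -1;
    }
    return (int)(used + (size_t)n);
}
```

Since each permutation of defines ends up in its own uniquely named source file, the resulting object files also get distinct names, which avoids the same-filename clash the commit describes.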
				
					
						
							
							
								 
						
							
								b2284647a3 
								
							 
						 
						
							
							
								
								More complex objects.  
							
							
							
						 
						
							2015-02-23 07:51:05 -06:00  
				
					
						
							
							
								 
						
							
								a6116e5859 
								
							 
						 
						
							
							
								
								Added some more complex-only objects.  
							
							
							
						 
						
							2015-02-22 17:49:28 -06:00  
				
					
						
							
							
								 
						
							
								67e39bd8fb 
								
							 
						 
						
							
							
								
								Added mangled complex filenames to interface and lapack CMakeLists.txt.  
							
							
							
						 
						
							2015-02-17 13:12:30 -06:00  
				
					
						
							
							
								 
						
							
								9eb1499095 
								
							 
						 
						
							
							
								
								Added another param to GenerateNamedObjects to mangle complex source names.  
							
							
							
							
							Many sources for complex float types have the same
names as the real sources, except with z prepended.
							
						 
						
							2015-02-17 10:30:28 -06:00  
				
					
						
							
							
								 
						
							
								39cc6b21d3 
								
							 
						 
						
							
							
								
								Add ATLAS-style ?geadd function  
							
							
							
						 
						
							2015-02-16 13:46:20 +01:00  
				
					
						
							
							
								 
						
							
								4662a0b13a 
								
							 
						 
						
							
							
								
								Changed generate functions to iterate through a list of float types.  
							
							
							
							
							This will generate obj files for SINGLE/DOUBLE/COMPLEX/DOUBLE COMPLEX. 
							
						 
						
							2015-02-15 17:44:37 -06:00  
				
					
						
							
							
								 
						
							
								e74462a3f5 
								
							 
						 
						
							
							
								
								Moved declarations to start of functions to satisfy MSVC C89 implementation.  
							
							
							
						 
						
							2015-02-11 11:16:57 -06:00  
				
					
						
							
							
								 
						
							
								e8c39138c6 
								
							 
						 
						
							
							
								
								Removed return value from GenerateNamedObjects.  
							
							
							
							
							It sets DBLAS_OBJS directly to save a bunch of list appending in the
CMakeLists.txt files. 
							
						 
						
							2015-02-09 12:28:09 -06:00  
				
					
						
							
							
								 
						
							
								58cff2fed8 
								
							 
						 
						
							
							
								
								Added CBLAS define/naming convention to GenerateNamedObjects.  
							
							
							
						 
						
							2015-02-04 11:30:15 -06:00  
				
					
						
							
							
								 
						
							
								5690cf3f0e 
								
							 
						 
						
							
							
								
								Added override for function names in GenerateNamedObjects.  
							
							
							
							
							The BLAS interface folder should now generate the correct objects
for the DOUBLE case.
							
						 
						
							2015-02-04 10:52:19 -06:00  
				
					
						
							
							
								 
						
							
								a0aeda6187 
								
							 
						 
						
							
							
								
								Added function to set defines for the object names (e.g. -DNAME=dgemm).  
							
							
							
						 
						
							2015-02-04 10:37:34 -06:00  
				
					
						
							
							
								 
						
							
								20e593a44a 
								
							 
						 
						
							
							
								
								Added cblas_ objects to interface CMakeLists.  
							
							
							
							
							Naming isn't right, though; I'm not seeing cblas_xxxx exports in the
resulting library.
							
						 
						
							2015-02-02 16:25:30 -06:00  
				
					
						
							
							
								 
						
							
								9e154aba58 
								
							 
						 
						
							
							
								
								Added LAPACK object files to interface CMakeLists.  
							
							
							
						 
						
							2015-02-02 12:31:15 -06:00  
				
					
						
							
							
								 
						
							
								5057a4b4df 
								
							 
						 
						
							
							
								
								Added openblas add_library call that uses DBLAS_OBJS objects.
							
							
							
						 
						
							2015-01-30 15:21:21 -06:00  
				
					
						
							
							
								 
						
							
								a6cf8aafc0 
								
							 
						 
						
							
							
								
								Updated level3/CMakeLists with correct defines using all combos.  
							
							
							
						 
						
							2015-01-30 11:21:50 -06:00  
				
					
						
							
							
								 
						
							
								b17ccb4c5c 
								
							 
						 
						
							
							
								
								Fix a segfault in gemv when MAX_STACK_ALLOC is set  
							
							
							
							
							* stack_alloc_size is needed after the implementation call,
but it may be overwritten if it is optimized into a register,
because some gemv implementations (e.g. dgemv_n.S) do not
restore all registers (e.g. r10).
* Do the same in ger.c for the same reason, even though the bug
has not been observed there.
							
						 
						
							2015-01-29 09:55:57 +01:00  
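
One common way to keep a value such as stack_alloc_size out of a register is the `volatile` qualifier, which makes the compiler re-read it from memory after the kernel call. Whether the commit above uses exactly this approach is an assumption. The sketch below only shows the pattern; `run_with_buffer`, `prefix_sum_kernel`, and `STACK_ELEMS` are hypothetical names, and the portable C kernel here cannot actually clobber registers the way a hand-written assembly kernel can.

```c
#include <stdlib.h>
#include <stddef.h>

#define STACK_ELEMS 256 /* hypothetical stack buffer capacity, in elements */

typedef void (*kernel_fn)(size_t n, const double *x, double *work);

/* Example kernel standing in for an optimized routine: work[i] = x[0] + ... + x[i]. */
void prefix_sum_kernel(size_t n, const double *x, double *work)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        acc += x[i];
        work[i] = acc;
    }
}

/* Runs `kernel` with a scratch buffer and stores the last entry in *result.
 * Returns 0 on success, -1 if the heap allocation fails. */
int run_with_buffer(kernel_fn kernel, size_t n, const double *x, double *result)
{
    double stack_buf[STACK_ELEMS];

    if (n == 0) {
        *result = 0.0;
        return 0;
    }

    /* volatile: the value is reloaded from memory after the kernel call,
     * so the free() decision below does not rely on a register surviving. */
    volatile size_t stack_alloc_size = (n <= STACK_ELEMS) ? n : 0;

    double *work = stack_alloc_size ? stack_buf : malloc(n * sizeof *work);
    if (work == NULL) {
        return -1;
    }

    kernel(n, x, work);
    *result = work[n - 1];

    if (!stack_alloc_size) {
        free(work); /* only heap buffers are freed */
    }
    return 0;
}
```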
				
					
						
							
							
								 
						
							
								5eefe18ae4 
								
							 
						 
						
							
							
								
								Added CMakeLists.txt for the first of the BLAS folders.  
							
							
							
							
							It only does the double precision compile currently.
I realized I didn't finish converting Makefile.system yet, so I made
a note of that. 
							
						 
						
							2015-01-27 16:17:17 -06:00  
				
					
						
							
							
								 
						
							
								e9d9a8eae3 
								
							 
						 
						
							
							
								
								Allow to do gemv and ger buffer allocation on the stack  
							
							
							
							
							ger and gemv call blas_memory_alloc/free, which in turn
call blas_lock. blas_lock creates thread contention when matrices
are small and the number of threads is high enough. We avoid
calling blas_memory_alloc by replacing it with stack allocation.
This can be enabled with:
make -DMAX_STACK_ALLOC=2048
The given size (in bytes) must be high enough to avoid thread contention
and small enough to avoid stack overflow.
Fix #478
							
						 
						
							2014-12-27 14:33:12 +01:00  
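
The stack-or-heap choice described in the commit above can be sketched as follows. This is not the code from gemv.c or ger.c: `strided_dot` is a hypothetical routine, plain `malloc` stands in for blas_memory_alloc, and the fallback value 2048 simply mirrors the example in the commit message.

```c
#include <stdlib.h>
#include <stddef.h>

#ifndef MAX_STACK_ALLOC
#define MAX_STACK_ALLOC 2048 /* bytes; mirrors the commit's example value */
#endif

/* Hypothetical routine: packs the strided vector x into a contiguous
 * scratch buffer, then computes its dot product with y.
 * Returns 0 on success, -1 if the heap allocation fails. */
int strided_dot(size_t n, const double *x, size_t incx,
                const double *y, double *out)
{
    double stack_buf[MAX_STACK_ALLOC / sizeof(double)];
    double *buf = stack_buf;

    /* Small requests use the stack buffer and never touch the allocator;
     * large ones fall back to the heap (malloc stands in for the locked
     * allocator here). */
    int on_heap = n * sizeof(double) > sizeof stack_buf;
    if (on_heap) {
        buf = malloc(n * sizeof(double));
        if (buf == NULL) {
            return -1;
        }
    }

    for (size_t i = 0; i < n; i++) {
        buf[i] = x[i * incx];
    }

    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        acc += buf[i] * y[i];
    }

    if (on_heap) {
        free(buf);
    }
    *out = acc;
    return 0;
}
```

With the 2048-byte limit, vectors of up to 256 doubles stay on the stack, so many threads working on small problems do not contend for the allocator lock.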
				
					
						
							
							
								 
						
							
								9e829ce98f 
								
							 
						 
						
							
							
								
								enabled cblas gemm3m functions  
							
							
							
						 
						
							2014-09-20 17:20:02 +02:00  
				
					
						
							
							
								 
						
							
								d49fd33885 
								
							 
						 
						
							
							
								
								disabled SYMM3M and HEMM3M functions because of segmentation violations
							
							
							
						 
						
							2014-09-20 15:27:40 +02:00  
				
					
						
							
							
								 
						
							
								7aae4a62e7 
								
							 
						 
						
							
							
								
								enabled use of GEMM3M functions  
							
							
							
						 
						
							2014-09-20 14:27:10 +02:00  
				
					
						
							
							
								 
						
							
								3300f5ebff 
								
							 
						 
						
							
							
								
								optimized multithreading lower limits  
							
							
							
						 
						
							2014-09-15 11:38:25 +02:00  
				
					
						
							
							
								 
						
							
								fd2478c9e2 
								
							 
						 
						
							
							
								
								optimized interface/zgemv.c for multithreading  
							
							
							
						 
						
							2014-09-12 19:18:23 +02:00  
				
					
						
							
							
								 
						
							
								1cba8e7b11 
								
							 
						 
						
							
							
								
								Merge pull request #446 from grisuthedragon/cblas_matcopy
							
							
							
							
							Add a CBLAS interface for the BLAS extension s/d/c/z*matcopy routines. 
							
						 
						
							2014-09-10 16:31:31 +08:00  
				
					
						
							
							
								 
						
							
								a057e5434d 
								
							 
						 
						
							
							
								
								add CBLAS interface for s/d/c/zimatcopy  
							
							
							
						 
						
							2014-09-09 09:52:13 +02:00  
				
					
						
							
							
								 
						
							
								7794766d3c 
								
							 
						 
						
							
							
								
								Add cblas_(s/d/c/z)omatcopy in order to have cblas interface for them.  
							
							
							
						 
						
							2014-09-08 17:57:44 +02:00  
				
					
						
							
							
								 
						
							
								f511807fc0 
								
							 
						 
						
							
							
								
								modified multithreading threshold  
							
							
							
						 
						
							2014-09-08 12:27:32 +02:00  
				
					
						
							
							
								 
						
							
								d1800397f5 
								
							 
						 
						
							
							
								
								optimized interface/gemv.c for multithreading  
							
							
							
						 
						
							2014-09-02 17:36:07 +02:00  
				
					
						
							
							
								 
						
							
								f4ff889491 
								
							 
						 
						
							
							
								
								updated interface/gemv.c for multithreading  
							
							
							
						 
						
							2014-09-02 16:30:04 +02:00  
				
					
						
							
							
								 
						
							
								51413925bd 
								
							 
						 
						
							
							
								
								adjust number of threads for small size in cgemv and zgemv  
							
							
							
						 
						
							2014-07-15 16:27:02 +02:00  
				
					
						
							
							
								 
						
							
								b985cea65d 
								
							 
						 
						
							
							
								
								adjust number of threads for sgemv and dgemv  
							
							
							
						 
						
							2014-07-15 16:04:46 +02:00  
				
					
						
							
							
								 
						
							
								d286daa2ba 
								
							 
						 
						
							
							
								
								adjusted number of threads for small size  
							
							
							
						 
						
							2014-07-15 14:41:35 +02:00  
				
					
						
							
							
								 
						
							
								cedc1f4b14 
								
							 
						 
						
							
							
								
								Ref #410: disabled optimized potri functions (single threading bug)
							
							
							
						 
						
							2014-07-10 13:42:32 +02:00