36fcb52094 
								
							 
						 
						
							
							
								
								Fix logic - we want real OR imaginary part of X to be nonzero here  
							
							
							
						 
						
							2023-04-01 00:02:54 +02:00  
				
					
						
							
							
								 
						
							
								f2659516ef 
								
							 
						 
						
							
							
								
								remove unqualified ifdef's for NO_LAPACK(E)  
							
							
							
						 
						
							2023-03-28 19:01:31 +11:00  
				
					
						
							
							
								 
						
							
								7f0b11fbc1 
								
							 
						 
						
							
							
								
								Exclude some complex drivers when NO_LAPACK is set  
							
							
							
						 
						
							2022-01-27 22:00:39 +01:00  
				
					
						
							
							
								 
						
							
								5f6a609253 
								
							 
						 
						
							
							
								
								Add sbgemv  
							
							
							
						 
						
							2021-09-14 16:13:57 +02:00  
				
					
						
							
							
								 
						
							
								a7b1f9b1bb 
								
							 
						 
						
							
							
								
								Implementation of BF16 based gemv  
							
							
							
							
							1. Add a new API -- sbgemv to support bfloat16 based gemv
2. Implement a generic kernel for sbgemv
3. Implement an avx512-bf16 based kernel for sbgemv
Signed-off-by: Chen, Guobing <guobing.chen@intel.com> 
							
						 
						
							2020-10-29 02:08:23 +08:00  
				
					
						
							
							
								 
						
							
								887e00fd7f 
								
							 
						 
						
							
							
								
								Adapt for supporting only a subset of variable types  
							
							
							
						 
						
							2020-10-11 14:58:57 +02:00  
				
					
						
							
							
								 
						
							
								3287848c8f 
								
							 
						 
						
							
							
								
								Support building only selected types  
							
							
							
						 
						
							2020-09-22 23:20:51 +02:00  
				
					
						
							
							
								 
						
							
								806f89166e 
								
							 
						 
						
							
							
								
								Make ARMV7 compile with xcode and add a CI job for it ( #2537 )  
							
							
							
							
							* Add an ARMV7 iOS build on Travis
* thread_local appears to be unavailable on ARMV7 iOS
* Add no-thumb option for ARMV7 IOS build to get it to accept DMB ISH
* Make local labels in macros of nrm2_vfpv3.S compatible with the xcode assembler 
							
						 
						
							2020-04-02 10:30:37 +02:00  
				
					
						
							
							
								 
						
							
								8617d75548 
								
							 
						 
						
							
							
								
								Revert "Avoid taking root of negative number in symv_thread.c"  
							
							
							
						 
						
							2019-10-01 23:50:41 +02:00  
				
					
						
							
							
								 
						
							
								6355c25dde 
								
							 
						 
						
							
							
								
								Avoid taking root of negative number in symv_thread.c  
							
							
							
							
							This is similar to fixes in gh-1929, but there was one remaining
occurrence of this type of pattern in the driver/level2/*_thread.c
files. 
							
						 
						
							2019-09-29 22:03:12 -07:00  
				
					
						
							
							
								 
						
							
								45333d5793 
								
							 
						 
						
							
							
								
								Fix error introduced during cleanup  
							
							
							
						 
						
							2019-02-19 22:16:33 +01:00  
				
					
						
							
							
								 
						
							
								78d9910236 
								
							 
						 
						
							
							
								
								Correct range_n limiting  
							
							
							
							
							same bug as seen in #1388, somehow missed in corresponding PR #1389  
							
						 
						
							2019-02-19 20:59:48 +01:00  
				
					
						
							
							
								 
						
							
								5a720cf9ca 
								
							 
						 
						
							
							
								
								Re-enable loop unrolling in trmv and remove the scary warning  
							
							
							
							
							fixes  #1748  as that half of the fix for #1332  appears to have been an overreaction on my part. 
						
							2018-12-30 15:22:37 +01:00  
				
					
						
							
							
								 
						
							
								368d14f8c8 
								
							 
						 
						
							
							
								
								Fix harmless typo  
							
							
							
							
							fixes  #1872  
						
							2018-11-16 14:58:28 +01:00  
				
					
						
							
							
								 
						
							
								0427277cef 
								
							 
						 
						
							
							
								
								Allow optimization for small m, large n only if it can be made threadsafe  
							
							
							
							
							otherwise the introduction of a static array in 8e5a108 (#532) breaks concurrent calls from multiple threads, as seen in #1844  
							
						 
						
							2018-11-10 15:45:54 +01:00  
				
					
						
							
							
								 
						
							
								cc9500db41 
								
							 
						 
						
							
							
								
								Merge pull request  #1403  from brada4/develop  
							
							
							
							
							Address a few more warnings 
							
						 
						
							2017-12-30 14:51:34 +01:00  
				
					
						
							
							
								 
						
							
								bfc2a88594 
								
							 
						 
						
							
							
								
								remove unused buffer  
							
							
							
						 
						
							2017-12-22 00:55:40 +01:00  
				
					
						
							
							
								 
						
							
								177b78c8b4 
								
							 
						 
						
							
							
								
								Issue1388 ( #1389 )  
							
							
							
							
							* Calculation of chunk range limits was ignoring num_cpu
bug introduced by me in #1262  - should fix  #1388 
* Calculation of range limits was ignoring num_cpu
bug introduced by me in #1262 
* Calculation of chunk range limits was ignoring num_cpu
bug introduced by me in #1262 
* Calculation of chunk range limits was ignoring num_cpu
bug introduced by me in #1262 
* Calculation of chunk range limits was ignoring num_cpu
bug introduced by me in #1262 
* Calculation of chunk range limits was ignoring num_cpu
bug introduced by me in #1262  
							
						 
						
							2017-12-09 22:29:03 +01:00  
				
					
						
							
							
								 
						
							
								281a2b952f 
								
							 
						 
						
							
							
								
								warning cleanup ( #1380 )  
							
							
							
							
							* dead increments in driver/level2
* dead increments in kernel/generic
* part dead increments in kernel/x86_64 
							
						 
						
							2017-12-05 19:54:10 +01:00  
				
					
						
							
							
								 
						
							
								b414283f48 
								
							 
						 
						
							
							
								
								Disable gemv unrolling  
							
							
							
							
							as a (hopefully temporary) workaround for #1332  
							
						 
						
							2017-12-03 22:41:54 +01:00  
				
					
						
							
							
								 
						
							
								e14d50d86e 
								
							 
						 
						
							
							
								
								eliminate Wunused-const gcc7 warning  
							
							
							
						 
						
							2017-11-24 19:13:24 +01:00  
				
					
						
							
							
								 
						
							
								37858d1146 
								
							 
						 
						
							
							
								
								Fix threading usage in CMake: s/SMP/USE_THREAD/  
							
							
							
						 
						
							2017-08-19 15:07:42 +10:00  
				
					
						
							
							
								 
						
							
								719fcc56b0 
								
							 
						 
						
							
							
								
								Merge pull request  #1262  from martin-frbg/xmv_thread-splitting  
							
							
							
							
							Make sure that range limit of last thread never exceeds data size 
							
						 
						
							2017-08-06 14:11:44 +02:00  
				
					
						
							
							
								 
						
							
								0ba64cee60 
								
							 
						 
						
							
							
								
								Update trmv_thread.c  
							
							
							
						 
						
							2017-08-02 12:03:54 +02:00  
				
					
						
							
							
								 
						
							
								c4e5ba1bfe 
								
							 
						 
						
							
							
								
								Make sure that range_n of last thread never exceeds the actual data size when splitting the workload  
							
							
							
						 
						
							2017-08-02 00:37:58 +02:00  
				
					
						
							
							
								 
						
							
								a6f533b248 
								
							 
						 
						
							
							
								
								Revert "Fix calculated range limit exceeding actual data size for last thread"  
							
							
							
						 
						
							2017-08-01 19:28:08 +02:00  
				
					
						
							
							
								 
						
							
								d245caa49a 
								
							 
						 
						
							
							
								
								Support out-of-source build  
							
							
							
						 
						
							2017-08-01 15:16:14 +05:30  
				
					
						
							
							
								 
						
							
								585c0010a5 
								
							 
						 
						
							
							
								
								Fix range limit exceeding actual data size in last step  
							
							
							
						 
						
							2017-07-28 00:27:02 +02:00  
				
					
						
							
							
								 
						
							
								857f61bc5d 
								
							 
						 
						
							
							
								
								Fix range limit exceeding data size in last step  
							
							
							
						 
						
							2017-07-28 00:21:53 +02:00  
				
					
						
							
							
								 
						
							
								9332042d5f 
								
							 
						 
						
							
							
								
								Fix range exceeding actual data size in quick_divide  
							
							
							
						 
						
							2017-07-28 00:13:24 +02:00  
				
					
						
							
							
								 
						
							
								529bfc36ec 
								
							 
						 
						
							
							
								
								Fix write past fixed size buffer  
							
							
							
						 
						
							2017-07-12 00:59:30 +02:00  
				
					
						
							
							
								 
						
							
								053044ae4d 
								
							 
						 
						
							
							
								
								Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR  
							
							
							
							
							If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project. 
							
						 
						
							2016-05-25 09:13:28 +02:00  
				
					
						
							
							
								 
						
							
								53ba1a77c8 
								
							 
						 
						
							
							
								
								ztrmv_L.c: no longer need a 4kB buffer  
							
							
							
							
							Fix  #786  
						
							2016-03-05 19:07:03 +01:00  
				
					
						
							
							
								 
						
							
								78dcf5c3d5 
								
							 
						 
						
							
							
								
								Improve performance of ztrmv on small matrices  
							
							
							
							
							* Use stack allocation
* Disable multi-threading
* Ref #727  
							
						 
						
							2016-02-08 11:25:02 +01:00  
				
					
						
							
							
								 
						
							
								fbc21266e6 
								
							 
						 
						
							
							
								
								Minor C code fixes in driver/  
							
							
							
						 
						
							2015-11-09 14:15:49 +05:30  
				
					
						
							
							
								 
						
							
								d8392c1245 
								
							 
						 
						
							
							
								
								Fix cmake config bugs.  
							
							
							
						 
						
							2015-10-20 04:30:55 +08:00  
				
					
						
							
							
								 
						
							
								f8eba3d548 
								
							 
						 
						
							
							
								
								Fixed cmake build bugs on Linux.  
							
							
							
						 
						
							2015-08-11 16:25:16 -05:00  
				
					
						
							
							
								 
						
							
								f874465bb8 
								
							 
						 
						
							
							
								
								Use cmake to build OpenBLAS GENERIC Target on MSVC x86 64-bit.  
							
							
							
							
							Disable CBLAS and LAPACK. 
							
						 
						
							2015-08-10 14:10:44 -05:00  
				
					
						
							
							
								 
						
							
								dcd5ba4443 
								
							 
						 
						
							
							
								
								Merge branch 'cmake' of  https://github.com/hpanderson/OpenBLAS  into hpanderson_cmake  
							
							
							
						 
						
							2015-07-22 04:06:39 +08:00  
				
					
						
							
							
								 
						
							
								8e5a1083bb 
								
							 
						 
						
							
							
								
								Refs #532. Improve gemv parallelization for the small m, large n case.  
							
							
							
							
							Split the matrix and perform a reduction. 
							
						 
						
							2015-05-08 05:33:17 +08:00  
				
					
						
							
							
								 
						
							
								ab7043373f 
								
							 
						 
						
							
							
								
								Fixed bug generating trmv complex source names.  
							
							
							
						 
						
							2015-02-24 15:18:41 -06:00  
				
					
						
							
							
								 
						
							
								0553476fba 
								
							 
						 
						
							
							
								
								Added TRANS defines for complex sources in lapack.  
							
							
							
						 
						
							2015-02-24 14:30:35 -06:00  
				
					
						
							
							
								 
						
							
								2416d9dbac 
								
							 
						 
						
							
							
								
								Fixed TRANSA defines for complex sources in driver/level2.  
							
							
							
						 
						
							2015-02-24 13:18:07 -06:00  
				
					
						
							
							
								 
						
							
								0d8e227ea7 
								
							 
						 
						
							
							
								
								Changed strategy for setting preprocessor definitions.  
							
							
							
							
							Instead of generating separate object files for each permutation of
defines for a source file, GenerateNamedObjects now writes an entirely
new source file and inserts the defines as #define c statements.
This solves a problem I ran into with ar.exe where it was refusing to
link objects that had the same filename despite having different paths. 
							
						 
						
							2015-02-24 12:26:33 -06:00  
				
					
						
							
							
								 
						
							
								1b7f427401 
								
							 
						 
						
							
							
								
								Added conj gemv objects for complex build.  
							
							
							
						 
						
							2015-02-23 10:24:31 -06:00  
				
					
						
							
							
								 
						
							
								fb5d5bb971 
								
							 
						 
						
							
							
								
								Added defines for complex trmv.  
							
							
							
						 
						
							2015-02-21 12:39:03 -06:00  
				
					
						
							
							
								 
						
							
								33c5e8db7f 
								
							 
						 
						
							
							
								
								Added a helper function for setting the L1 kernel defaults.  
							
							
							
							
							Added loop to build objects with different KERNEL defines. 
							
						 
						
							2015-02-17 21:36:23 -06:00  
				
					
						
							
							
								 
						
							
								4662a0b13a 
								
							 
						 
						
							
							
								
								Changed generate functions to iterate through a list of float types.  
							
							
							
							
							This will generate obj files for SINGLE/DOUBLE/COMPLEX/DOUBLE COMPLEX. 
							
						 
						
							2015-02-15 17:44:37 -06:00  
				
					
						
							
							
								 
						
							
								e8c39138c6 
								
							 
						 
						
							
							
								
								Removed return value from GenerateNamedObjects.  
							
							
							
							
							It sets DBLAS_OBJS directly to save a bunch of list appending in the
CMakeLists.txt files. 
							
						 
						
							2015-02-09 12:28:09 -06:00  
				
					
						
							
							
								 
						
							
								2f59135eb6 
								
							 
						 
						
							
							
								
								Added gemv to level2 CMakeLists.txt.  
							
							
							
						 
						
							2015-02-07 21:15:21 -06:00