5c58994eb2 
								
							 
						 
						
							
							
								
								Add fallback warning  
							
							
							
						 
						
							2023-07-19 18:27:41 +02:00  
				
					
						
							
							
								 
						
							
								ca7199f249 
								
							 
						 
						
							
							
								
								Treat newer Neoverse as N1 if SVE unavailable (may be disabled in container/cloud env)  
							
							
							
						 
						
							2023-07-19 14:48:42 +02:00  
				
					
						
							
							
								 
						
							
								616fdea82a 
								
							 
						 
						
							
							
								
								Revert "Improve Windows threading performance scaling"  
							
							
							
						 
						
							2023-06-28 09:45:17 +02:00  
				
					
						
							
							
								 
						
							
								d6991dd230 
								
							 
						 
						
							
							
								
								fix missing #endif  
							
							
							
						 
						
							2023-06-24 15:43:32 -07:00  
				
					
						
							
							
								 
						
							
								7783a9af02 
								
							 
						 
						
							
							
								
								attempt to fix old mingw gcc issue  
							
							
							
						 
						
							2023-06-24 14:35:11 -07:00  
				
					
						
							
							
								 
						
							
								8caabc5982 
								
							 
						 
						
							
							
								
								fix #4063 remove unused pool_lock
							
							
							
						 
						
							2023-06-23 19:45:16 -07:00  
				
					
						
							
							
								 
						
							
								d301649430 
								
							 
						 
						
							
							
								
								fix #4063 threading perf issues on Windows
							
							
							
						 
						
							2023-06-23 19:42:27 -07:00  
				
					
						
							
							
								 
						
							
								9e80a194d6 
								
							 
						 
						
							
							
								
								Fix dynamic_list build and gcc version check error  
							
							
							
						 
						
							2023-05-21 19:52:58 +08:00  
				
					
						
							
							
								 
						
							
								0b83088887 
								
							 
						 
						
							
							
								
								spr dynamic arch support  
							
							
							
						 
						
							2023-05-19 10:48:18 +08:00  
				
					
						
							
							
								 
						
							
								e5538a62cb 
								
							 
						 
						
							
							
								
								Add suggestions to NUM_THREADS/auxiliary buffer message  
							
							
							
						 
						
							2023-05-04 22:56:39 +02:00  
				
					
						
							
							
								 
						
							
								437c0bf2b4 
								
							 
						 
						
							
							
								
								Merge pull request #3843 from Mousius/switch-ratio
							
							
							
							
							Propagate SWITCH_RATIO to DYNAMIC_ARCH builds 
							
						 
						
							2023-04-19 11:51:54 +02:00  
				
					
						
							
							
								 
						
							
								32f2fafde7 
								
							 
						 
						
							
							
								
								Propagate SWITCH_RATIO to DYNAMIC_ARCH builds  
							
							
							
							
							Previously dynamic builds were either using the default SWITCH_RATIO
or one from the higher level architecture; this patch ensures the
dynamic builds can use this parameter as well. 
							
						 
						
							2023-04-17 15:34:12 +01:00  
				
					
						
							
							
								 
						
							
								36fcb52094 
								
							 
						 
						
							
							
								
								Fix logic - we want real OR imaginary part of X to be nonzero here  
							
							
							
						 
						
							2023-04-01 00:02:54 +02:00  
				
					
						
							
							
								 
						
							
								f2659516ef 
								
							 
						 
						
							
							
								
								remove unqualified ifdef's for NO_LAPACK(E)  
							
							
							
						 
						
							2023-03-28 19:01:31 +11:00  
				
					
						
							
							
								 
						
							
								579bc86671 
								
							 
						 
						
							
							
								
								remove call to omp_set_num_threads  
							
							
							
						 
						
							2023-03-21 20:58:56 +01:00  
				
					
						
							
							
								 
						
							
								e298d613fa 
								
							 
						 
						
							
							
								
								initialize status variable for openblas_set_num_threads  
							
							
							
						 
						
							2023-03-08 23:43:15 +01:00  
				
					
						
							
							
								 
						
							
								05aa88268f 
								
							 
						 
						
							
							
								
								add status variable for openblas_set_num_threads  
							
							
							
						 
						
							2023-03-08 23:41:57 +01:00  
				
					
						
							
							
								 
						
							
								e38ab079a0 
								
							 
						 
						
							
							
								
								Fix OpenMP thread counting returning places rather than cores  
							
							
							
						 
						
							2023-03-08 19:17:33 +01:00  
				
					
						
							
							
								 
						
							
								d4868babbc 
								
							 
						 
						
							
							
								
								Fix typos  
							
							
							
						 
						
							2022-12-29 23:07:55 +01:00  
				
					
						
							
							
								 
						
							
								18c99d3e63 
								
							 
						 
						
							
							
								
								Update dynamic_arm64.c  
							
							
							
						 
						
							2022-12-25 13:31:38 +01:00  
				
					
						
							
							
								 
						
							
								186a310f92 
								
							 
						 
						
							
							
								
								Update dynamic_arm64.c  
							
							
							
						 
						
							2022-12-25 12:22:48 +01:00  
				
					
						
							
							
								 
						
							
								da6e426b13 
								
							 
						 
						
							
							
								
								fix Cooperlake not selectable via environment variable  
							
							
							
						 
						
							2022-11-03 18:13:35 +01:00  
				
					
						
							
							
								 
						
							
								4989e039a5 
								
							 
						 
						
							
							
								
								Define SBGEMM_ALIGN_K for DYNAMIC_ARCH build  
							
							
							
						 
						
							2022-10-27 14:10:26 +08:00  
				
					
						
							
							
								 
						
							
								b00d5b9746 
								
							 
						 
						
							
							
								
								New sbgemm implementation for Neoverse N2  
							
							
							
							
							1. Use UZP instructions instead of gather-load and scatter-store instructions to get lower latency.
							2. Pad k to a power of 4.
							
						 
						
							2022-10-26 15:09:41 +08:00  
				
					
						
							
							
								 
						
							
								ab6009b0b6 
								
							 
						 
						
							
							
								
								Merge pull request #3773 from staticfloat/sf/openblas_default_num_threads
							
							
							
							
							Add `OPENBLAS_DEFAULT_NUM_THREADS` 
							
						 
						
							2022-10-13 14:15:14 +02:00  
				
					
						
							
							
								 
						
							
								db50ab4a72 
								
							 
						 
						
							
							
								
								Add BUILD_vartype defines  
							
							
							
						 
						
							2022-10-01 15:14:51 +02:00  
				
					
						
							
							
								 
						
							
								d2ce93179f 
								
							 
						 
						
							
							
								
								Add `OPENBLAS_DEFAULT_NUM_THREADS`  
							
							
							
							
							This allows Julia to set a default number of threads (usually `1`) to be
used when no other thread counts are specified [0], to short-circuit the
default OpenBLAS thread initialization routine that spins up a different
number of threads than Julia would otherwise choose.
The reason to add a new environment variable is that we want to be able
to configure OpenBLAS to avoid performing its initial memory
allocation/thread startup, as that can consume significant amounts of
memory, but we still want to be sensitive to legacy codebases that set
things like `OMP_NUM_THREADS` or `GOTOBLAS_NUM_THREADS`.  Creating a new
environment variable that is openblas-specific and is not already
publicly used to control the overall number of threads of programs like
Julia seems to be the best way forward.
[0] https://github.com/JuliaLang/julia/pull/46844  
							
						 
						
							2022-09-30 01:21:44 +00:00  
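
The fallback behaviour described in the `OPENBLAS_DEFAULT_NUM_THREADS` commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the helper names `env_thread_count` and `resolve_thread_count` are hypothetical, and the lookup order among the explicit variables is an assumption. The variable names are taken from the commit message as written.

```c
#include <limits.h>
#include <stdlib.h>

/* Parse a positive integer from an environment variable.
 * Returns 0 when the variable is unset, empty, or not a positive number. */
static int env_thread_count(const char *name)
{
    const char *value = getenv(name);
    if (value == NULL || *value == '\0')
        return 0;

    char *end = NULL;
    long parsed = strtol(value, &end, 10);
    if (end == value || parsed <= 0 || parsed > INT_MAX)
        return 0;
    return (int)parsed;
}

/* Hypothetical helper: explicit settings win; the new variable is only
 * consulted when none of them is set. The order of the explicit
 * variables is an assumption made for illustration. */
static int resolve_thread_count(int detected_cores)
{
    static const char *const explicit_vars[] = {
        "GOTOBLAS_NUM_THREADS", /* name as written in the commit message */
        "OMP_NUM_THREADS",
    };
    size_t count = sizeof(explicit_vars) / sizeof(explicit_vars[0]);

    for (size_t i = 0; i < count; i++) {
        int n = env_thread_count(explicit_vars[i]);
        if (n > 0)
            return n;
    }

    /* No explicit request: use the library-specific default if present. */
    int fallback = env_thread_count("OPENBLAS_DEFAULT_NUM_THREADS");
    if (fallback > 0)
        return fallback;

    /* Otherwise fall back to the detected core count, never below 1. */
    return detected_cores < 1 ? 1 : detected_cores;
}
```

With this layout, a host program such as Julia could export `OPENBLAS_DEFAULT_NUM_THREADS=1` without overriding a user who has already set `OMP_NUM_THREADS`.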
				
					
						
							
							
								 
						
							
								84453b924f 
								
							 
						 
						
							
							
								
								Support CONSISTENT_FPCSR on AARCH64  
							
							
							
						 
						
							2022-09-22 00:20:40 +09:00  
				
					
						
							
							
								 
						
							
								9402df5604 
								
							 
						 
						
							
							
								
								Fix missing external declaration  
							
							
							
						 
						
							2022-09-14 21:44:34 +02:00  
				
					
						
							
							
								 
						
							
								bd30120ba7 
								
							 
						 
						
							
							
								
								Merge pull request #3720 from FlyGoat/mips64
							
							
							
							
							Make it work on general MIPS64 processors 
							
						 
						
							2022-08-19 20:24:27 +02:00  
				
					
						
							
							
								 
						
							
								fae9368f14 
								
							 
						 
						
							
							
								
								Implement DYNAMIC_LIST for MIPS64  
							
							
							
							
							Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> 
							
						 
						
							2022-08-12 13:13:31 +01:00  
				
					
						
							
							
								 
						
							
								a50b29c540 
								
							 
						 
						
							
							
								
								Provide a fallback MIPS64_GENERIC target  
							
							
							
							
							It is really dangerous to fall back to the Loongson core on other
MIPS64 processors.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> 
							
						 
						
							2022-08-12 13:13:28 +01:00  
				
					
						
							
							
								 
						
							
								b633eb79f2 
								
							 
						 
						
							
							
								
								Use $at as temporary register for mips/loongson CPUCFG read  
							
							
							
							
							Some compilers (namely LLVM) are not happy with clobbering
registers in inline assembly.
Use $at as the temporary register and explicitly use the noat hint.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> 
							
						 
						
							2022-08-07 13:22:32 +01:00  
				
					
						
							
							
								 
						
							
								19fefd100e 
								
							 
						 
						
							
							
								
								Merge pull request #3703 from martin-frbg/omp_adaptive
							
							
							
							
							Add env variable OMP_ADAPTIVE to control OMP threadpool behaviour 
							
						 
						
							2022-08-03 15:38:39 +02:00  
				
					
						
							
							
								 
						
							
								19d4f90c44 
								
							 
						 
						
							
							
								
								Use auxv to detect CPUCFG on mips/loongson
							
							
							
							
							It's safer and easier than SIGILL.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> 
							
						 
						
							2022-07-31 19:41:59 +01:00  
				
					
						
							
							
								 
						
							
								d0ba257de0 
								
							 
						 
						
							
							
								
								Merge pull request #3704 from XiWeiGu/loongarch64_dynamic_arch
							
							
							
							
							LoongArch64: Add DYNAMIC_ARCH support 
							
						 
						
							2022-07-28 20:31:20 +02:00  
				
					
						
							
							
								 
						
							
								fbfe1daf6e 
								
							 
						 
						
							
							
								
								LoongArch64: Add DYNAMIC_ARCH support  
							
							
							
						 
						
							2022-07-28 14:28:45 +08:00  
				
					
						
							
							
								 
						
							
								80cdfed7b2 
								
							 
						 
						
							
							
								
								Use OMP_ADAPTIVE setting to choose between static and dynamic OMP threadpool size  
							
							
							
						 
						
							2022-07-27 23:43:20 +02:00  
				
					
						
							
							
								 
						
							
								08e3754b39 
								
							 
						 
						
							
							
								
								Add environment variable OMP_ADAPTIVE  
							
							
							
						 
						
							2022-07-27 23:41:47 +02:00  
				
					
						
							
							
								 
						
							
								30473b6a9d 
								
							 
						 
						
							
							
								
								add openblas_getaffinity()  
							
							
							
						 
						
							2022-07-27 19:15:18 +02:00  
				
					
						
							
							
								 
						
							
								daca01622b 
								
							 
						 
						
							
							
								
								fix detection of Neoverse V1 and user-enforced selection of N2 in ARM64 DYNAMIC_ARCH (#3700)
							
							
							
							
							* fix detection of Neoverse V1 and user-enforced selection of N2 
							
						 
						
							2022-07-27 09:17:43 +02:00  
				
					
						
							
							
								 
						
							
								d5ca477f42 
								
							 
						 
						
							
							
								
								Neoverse N2: DYNAMIC_ARCH  
							
							
							
						 
						
							2022-07-12 00:50:45 +08:00  
				
					
						
							
							
								 
						
							
								69148ae795 
								
							 
						 
						
							
							
								
								Guard against sysconf returning zero processors  
							
							
							
						 
						
							2022-07-06 17:22:18 +02:00  
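
A guard of the kind named in the commit title above might look like the following sketch. It is not the code from the commit: `clamp_processor_count` and `get_num_procs_guarded` are hypothetical names, and the use of `_SC_NPROCESSORS_CONF` is an assumption about which `sysconf` query is involved.

```c
#include <limits.h>
#include <unistd.h>

/* Clamp a raw processor count to the range [1, INT_MAX].
 * sysconf may report 0 or -1 in restricted environments. */
static int clamp_processor_count(long raw)
{
    if (raw < 1)
        return 1;
    if (raw > INT_MAX)
        return INT_MAX;
    return (int)raw;
}

/* Hypothetical wrapper: query the configured processor count and make
 * sure callers never see zero or a negative value. */
static int get_num_procs_guarded(void)
{
    return clamp_processor_count(sysconf(_SC_NPROCESSORS_CONF));
}
```

Keeping the clamp in its own function makes the zero and error cases easy to exercise without depending on what the host system reports.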
				
					
						
							
							
								 
						
							
								e9260f5451 
								
							 
						 
						
							
							
								
								Guard against system call returning zero processors  
							
							
							
						 
						
							2022-07-06 17:21:10 +02:00  
				
					
						
							
							
								 
						
							
								2c62096fce 
								
							 
						 
						
							
							
								
								Expand cpu mapping for future Zen cpus and use feature-based fallback for unknown AMD family codes  
							
							
							
						 
						
							2022-05-18 15:35:30 +02:00  
				
					
						
							
							
								 
						
							
								69f2ac4ea2 
								
							 
						 
						
							
							
								
								Fix broken elif in dynamic.c  
							
							
							
							
							This fixes compilation in the following case:
$(MAKE) USE_OPENMP=1 USE_THREAD=1 NO_LAPACK=0 DYNAMIC_ARCH=1 \
DYNAMIC_LIST="HASWELL SKYLAKEX ATOM COOPERLAKE SAPPHIRERAPIDS ZEN" 
							
						 
						
							2022-03-17 20:04:37 -04:00  
				
					
						
							
							
								 
						
							
								8d5a9c2f98 
								
							 
						 
						
							
							
								
								Merge pull request #3565 from jonaszhou1/develop
							
							
							
							
							Support Zhaoxin/Centaur kh40000 as ZEN 
							
						 
						
							2022-03-11 14:29:30 +01:00  
				
					
						
							
							
								 
						
							
								bf4642eb7e 
								
							 
						 
						
							
							
								
								Report USE_TLS if set  
							
							
							
						 
						
							2022-03-10 16:19:29 +01:00  
				
					
						
							
							
								 
						
							
								2d0ad89b0d 
								
							 
						 
						
							
							
								
								Support Zhaoxin/Centaur kh40000 as ZEN  
							
							
							
							
							Signed-off-by: JonasZhou <JonasZhou@zhaoxin.com> 
							
						 
						
							2022-03-10 15:08:38 +08:00  
				
					
						
							
							
								 
						
							
								fa3e9f25e6 
								
							 
						 
						
							
							
								
								Support AVX512-enabled Alder Lake  
							
							
							
						 
						
							2022-02-07 00:00:56 +01:00