Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								e12aaed13d 
								
							 
						 
						
							
							
								
								Fix unwanted fallthrough from Intel Family 6 to 15 in case of identification failure  
							
							 
							
							
							
						 
						
							2023-10-18 16:28:54 +02:00  
						
					 
				
					
						
							
							
								 
								Honglin Zhu
							
						 
						
							 
							
							
							
							
								
							
							
								9e80a194d6 
								
							 
						 
						
							
							
								
								Fix dynamic_list build and gcc version check error  
							
							 
							
							
							
						 
						
							2023-05-21 19:52:58 +08:00  
						
					 
				
					
						
							
							
								 
								Honglin Zhu
							
						 
						
							 
							
							
							
							
								
							
							
								0b83088887 
								
							 
						 
						
							
							
								
								SPR (Sapphire Rapids) dynamic arch support  
							
							 
							
							
							
						 
						
							2023-05-19 10:48:18 +08:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								da6e426b13 
								
							 
						 
						
							
							
								
								fix Cooperlake not selectable via environment variable  
							
							 
							
							
							
						 
						
							2022-11-03 18:13:35 +01:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								2c62096fce 
								
							 
						 
						
							
							
								
								Expand cpu mapping for future Zen cpus and use feature-based fallback for unknown AMD family codes  
							
							 
							
							
							
						 
						
							2022-05-18 15:35:30 +02:00  
						
					 
				
					
						
							
							
								 
								Adam Niederer
							
						 
						
							 
							
							
							
							
								
							
							
								69f2ac4ea2 
								
							 
						 
						
							
							
								
								Fix broken elif in dynamic.c  
							
							 
							
							
							
							
							This fixes compilation in the following case:
$(MAKE) USE_OPENMP=1 USE_THREAD=1 NO_LAPACK=0 DYNAMIC_ARCH=1 \
DYNAMIC_LIST="HASWELL SKYLAKEX ATOM COOPERLAKE SAPPHIRERAPIDS ZEN" 
							
						 
						
							2022-03-17 20:04:37 -04:00  
						
					 
				
					
						
							
							
								 
								JonasZhou
							
						 
						
							 
							
							
							
							
								
							
							
								2d0ad89b0d 
								
							 
						 
						
							
							
								
								Support Zhaoxin/Centaur kh40000 as ZEN  
							
							 
							
							
							
							
							Signed-off-by: JonasZhou <JonasZhou@zhaoxin.com> 
							
						 
						
							2022-03-10 15:08:38 +08:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								fa3e9f25e6 
								
							 
						 
						
							
							
								
								Support AVX512-enabled Alder Lake  
							
							 
							
							
							
						 
						
							2022-02-07 00:00:56 +01:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								6ed52576f8 
								
							 
						 
						
							
							
								
								Add feature-based fallback for unknown x86_64 cpus  
							
							 
							
							
							
						 
						
							2021-12-16 22:02:49 +01:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								08f8bb66c0 
								
							 
						 
						
							
							
								
								Add CPUIDs for Alder Lake and other recent Intel cpus  
							
							 
							
							
							
						 
						
							2021-11-04 20:36:39 +01:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								22a616bd8f 
								
							 
						 
						
							
							
								
								Add model number for Tiger Lake H (mobile variant)  
							
							 
							
							
							
						 
						
							2021-10-27 22:17:58 +02:00  
						
					 
				
					
						
							
							
								 
								JonasZhou
							
						 
						
							 
							
							
							
							
								
							
							
								0fca36c8c3 
								
							 
						 
						
							
							
								
								Add cpu detection support for Zhaoxin processors  
							
							 
							
							
							
							
							Signed-off-by: JonasZhou <JonasZhou@zhaoxin.com> 
							
						 
						
							2021-07-12 13:43:45 +08:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								8f22ac552b 
								
							 
						 
						
							
							
								
								Add vendor string Shanghai as successor to Centaur  
							
							 
							
							
							
						 
						
							2021-07-08 18:28:49 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								eb2fdd3af0 
								
							 
						 
						
							
							
								
								Recognize newer Zhaoxin/Centaur processors as Nehalem  
							
							 
							
							
							
						 
						
							2021-07-08 12:23:15 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								cbfd3c87e1 
								
							 
						 
						
							
							
								
								Recognize Intel Ice Lake SP as Cooper Lake  
							
							 
							
							
							
						 
						
							2021-05-14 20:44:06 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								e4e5042e38 
								
							 
						 
						
							
							
								
								Recognize Intel Tiger Lake as SkylakeX  
							
							 
							
							
							
						 
						
							2021-02-11 20:17:11 +01:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								865676682d 
								
							 
						 
						
							
							
								
								Add Intel Rocket Lake  
							
							 
							
							
							
						 
						
							2020-12-14 22:40:23 +01:00  
						
					 
				
					
						
							
							
								 
								Guillaume Horel
							
						 
						
							 
							
							
							
							
								
							
							
								1f564d729b 
								
							 
						 
						
							
							
								
								fix avx2 detection  
							
							 
							
							
							
							
							reword commits to make it clearer 
							
						 
						
							2020-10-31 10:00:48 -04:00  
						
					 
				
					
						
							
							
								 
								Chen, Guobing
							
						 
						
							 
							
							
							
							
								
							
							
								deaeb6c5b8 
								
							 
						 
						
							
							
								
								Add bfloat16 based dot and conversion with single/double  
							
							 
							
							
							
							
							1. Added bfloat16 based dot as new API: shdot
2. Implemented generic kernel and cooperlake-specific (AVX512-BF16) kernel for shdot
3. Added 4 conversion APIs for bfloat16 data type <=> single/double: shstobf16 shdtobf16 sbf16tos dbf16tod
     shstobf16 -- convert single float array to bfloat16 array
     shdtobf16 -- convert double float array to bfloat16 array
     sbf16tos  -- convert bfloat16 array to single float array
     dbf16tod  -- convert bfloat16 array to double float array
4. Implemented generic kernels for all 4 conversion APIs, and cooperlake-specific kernel for shstobf16 and shdtobf16
5. Update the level-1 threading helper functions and macros to support multi-threading for these new APIs
6. Fix the Cooperlake platform detection/selection issue in DYNAMIC_ARCH builds
7. Change the typedef of bfloat16 from unsigned short to the stricter uint16_t
Signed-off-by: Chen, Guobing <guobing.chen@intel.com> 
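The conversions listed in item 3 can be illustrated with a minimal sketch. It assumes the simplest possible scheme (keep the upper 16 bits of the IEEE-754 single-precision pattern, with no rounding) and uses hypothetical helper names; it is not the OpenBLAS kernel code, whose rounding behaviour and signatures may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the uint16_t typedef mentioned in item 7; the name is hypothetical. */
typedef uint16_t bf16_t;

/* Hypothetical helper: convert n floats to bfloat16 by truncation
 * (keeps sign, exponent and the top 7 mantissa bits; no rounding). */
static void float_to_bf16_trunc(size_t n, const float *src, bf16_t *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        uint32_t bits;
        memcpy(&bits, &src[i], sizeof bits);   /* type-pun without aliasing issues */
        dst[i] = (bf16_t)(bits >> 16);
    }
}

/* Hypothetical helper: widen n bfloat16 values back to float.
 * This direction is exact, since the low 16 mantissa bits are zero-filled. */
static void bf16_to_float(size_t n, const bf16_t *src, float *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        uint32_t bits = (uint32_t)src[i] << 16;
        memcpy(&dst[i], &bits, sizeof bits);
    }
}
```

Values whose low 16 mantissa bits are already zero (such as 1.0f or -2.5f) survive a round trip unchanged; other values lose precision in the float-to-bfloat16 direction.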
							
						 
						
							2020-09-04 02:31:25 +08:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								12918358aa 
								
							 
						 
						
							
							
								
								Add AMD Renoir/Matisse and preliminary support for Zen3 as Zen2  
							
							 
							
							
							
							
							also support AMD family 22 Jaguar/Puma as Bobcat 
							
						 
						
							2020-07-28 13:53:17 +00:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								584ef8d4ae 
								
							 
						 
						
							
							
								
								Add support for Comet Lake H & S  
							
							 
							
							
							
						 
						
							2020-06-27 14:36:37 +02:00  
						
					 
				
					
						
							
							
								 
								Matthew Treinish
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								f37e941d52 
								
							 
						 
						
							
							
								
								Add support to driver/others/dynamic.c too  
							
							 
							
							
							
						 
						
							2020-06-25 11:56:49 -04:00  
						
					 
				
					
						
							
							
								 
								User User-User
							
						 
						
							 
							
							
							
							
								
							
							
								e6b9275034 
								
							 
						 
						
							
							
								
								address vs2019 C4293  
							
							 
							
							
							
						 
						
							2020-06-24 09:12:23 +03:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								007d9f97d7 
								
							 
						 
						
							
							
								
								Make gotoblas_corename report the name of the selected TARGET rather than its aliases  
							
							 
							
							
							
						 
						
							2020-06-13 19:25:28 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								3518617f5b 
								
							 
						 
						
							
							
								
								Add Intel Goldmont+ cpuid  
							
							 
							
							
							
							
							Was originally in #2228, but that PR had misplaced the file in the top-level directory 
							
						 
						
							2019-12-03 08:32:29 +01:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								f95989cbc1 
								
							 
						 
						
							
							
								
								Fix AVX512 capability test (always returning zero)  
							
							 
							
							
							
							
							from #2322  
							
						 
						
							2019-11-23 22:38:07 +01:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								3d36c45116 
								
							 
						 
						
							
							
								
								Add CPUID identification of Intel Ice Lake  
							
							 
							
							
							
						 
						
							2019-08-01 22:52:35 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								3ce28fb81a 
								
							 
						 
						
							
							
								
								Merge pull request  #2055  from martin-frbg/atomid  
							
							 
							
							
							
							
							Add CPUID data for Intel Denverton (as Nehalem) 
							
						 
						
							2019-03-12 22:57:07 +01:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								04f2226ea6 
								
							 
						 
						
							
							
								
								Add Intel Denverton  
							
							 
							
							
							
						 
						
							2019-03-12 16:09:55 +01:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								11cfd0bd75 
								
							 
						 
						
							
							
								
								Do not compile in AVX512 check if AVX support is disabled  
							
							 
							
							
							
							
							The xgetbv function depends on NO_AVX being undefined - we could change that too, but that combination is unlikely to work anyway 
							
						 
						
							2019-03-05 16:04:25 +01:00  
						
					 
				
					
						
							
							
								 
								caiyu
							
						 
						
							 
							
							
							
							
								
							
							
								29dc72889f 
								
							 
						 
						
							
							
								
								Add support for Hygon Dhyana  
							
							 
							
							
							
						 
						
							2019-01-16 14:25:19 +08:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								dbc9a060ef 
								
							 
						 
						
							
							
								
								Fix missing braces in support_avx() call  
							
							 
							
							
							
						 
						
							2019-01-14 22:41:31 +01:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								31ed19e8b9 
								
							 
						 
						
							
							
								
								Add message for SkylakeX and KNL fallbacks to Haswell  
							
							 
							
							
							
						 
						
							2019-01-05 19:41:13 +01:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								e1574fa2b4 
								
							 
						 
						
							
							
								
								Add xcr0 (os support) check  
							
							 
							
							
							
						 
						
							2019-01-05 18:08:02 +01:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								ae1d1f74f7 
								
							 
						 
						
							
							
								
								Query AVX2 and AVX512 capability for runtime cpu selection  
							
							 
							
							
							
						 
						
							2019-01-05 16:55:33 +01:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								504310eeb9 
								
							 
						 
						
							
							
								
								Merge pull request  #1665  from martin-frbg/cpuid-ryzen2  
							
							 
							
							
							
							
							Add cpuid for AMD Ryzen 2 
							
						 
						
							2018-07-04 08:19:40 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								d0ec4325cf 
								
							 
						 
						
							
							
								
								Add cpuid for AMD Ryzen 2  
							
							 
							
							
							
						 
						
							2018-07-03 21:03:24 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								9d15a3bd16 
								
							 
						 
						
							
							
								
								Fix typo that broke compilation with DYNAMIC_ARCH and NO_AVX2  
							
							 
							
							
							
							
							fixes #1659 
							
						 
						
							2018-07-02 14:40:41 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								750162a05f 
								
							 
						 
						
							
							
								
								Try gradual fallback for cores not in the dynamic core list  
							
							 
							
							
							
						 
						
							2018-06-25 21:02:31 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								1833a67071 
								
							 
						 
						
							
							
								
								Add support for a user-defined list of dynamic targets  
							
							 
							
							
							
						 
						
							2018-06-23 19:42:15 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								63f7395fb4 
								
							 
						 
						
							
							
								
								Move some DYNAMIC_ARCH targets to new DYNAMIC_OLDER option  
							
							 
							
							
							
						 
						
							2018-06-09 16:31:38 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								38ad05bd04 
								
							 
						 
						
							
							
								
								Extend loop range to find SkylakeX in force_coretype  
							
							 
							
							
							
						 
						
							2018-06-05 10:26:49 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								8be027e4c6 
								
							 
						 
						
							
							
								
								Update dynamic.c  
							
							 
							
							
							
						 
						
							2018-06-04 14:36:39 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								ac7b6e3e9a 
								
							 
						 
						
							
							
								
								Fix misplaced endif  
							
							 
							
							
							
						 
						
							2018-06-04 08:23:40 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								ef626c6824 
								
							 
						 
						
							
							
								
								typo fix  
							
							 
							
							
							
						 
						
							2018-06-04 00:13:19 +02:00  
						
					 
				
					
						
							
							
								 
								Martin Kroeker
							
						 
						
							 
							
							
								
								
							
							
							
								
							
							
								5a51cf4576 
								
							 
						 
						
							
							
								
								Separate Skylake X from Skylake  
							
							 
							
							
							
						 
						
							2018-06-03 23:41:33 +02:00  
						
					 
				
					
						
							
							
								 
								Arjan van de Ven
							
						 
						
							 
							
							
							
							
								
							
							
								99c7bba8e4 
								
							 
						 
						
							
							
								
								Initial support for SkylakeX / AVX512  
							
							 
							
							
							
							
							This patch adds the basic infrastructure for adding the SkylakeX (Intel Skylake server)
target. The SkylakeX target will use the AVX512 (AVX512VL level) instruction set,
which brings 2 basic things:
1) 512 bit wide SIMD (2x width of AVX2)
2) 32 SIMD registers (2x the number on AVX2)
This initial patch only contains a trivial transformation of the Haswell SGEMM kernel
to AVX512VL; more will follow later but this patch aims to get the infrastructure
in place for this "later".
Full performance tuning has not been done yet; with more registers and wider SIMD
it's in theory possible to retune the kernels but even without that there's an
interesting enough performance increase (30-40% range) with just this change. 
							
						 
						
							2018-06-03 07:58:52 +00:00  
						
					 
				
					
						
							
							
								 
								Isuru Fernando
							
						 
						
							 
							
							
							
							
								
							
							
								2f12ea017b 
								
							 
						 
						
							
							
								
								No strncasecmp with MSVC  
							
							 
							
							
							
						 
						
							2017-08-08 00:07:25 +05:30  
						
					 
				
					
						
							
							
								 
								Gian-Carlo Pascutto
							
						 
						
							 
							
							
							
							
								
							
							
								9c884986ad 
								
							 
						 
						
							
							
								
								Add an extra family/model combination used by AMD Steamroller (Godavari).  
							
							 
							
							
							
						 
						
							2017-04-19 19:15:47 +02:00  
						
					 
				
					
						
							
							
								 
								Gian-Carlo Pascutto
							
						 
						
							 
							
							
							
							
								
							
							
								0cbd2d34e4 
								
							 
						 
						
							
							
								
								Recognize ZEN when passed as OPENBLAS_CORETYPE.  
							
							 
							
							
							
						 
						
							2017-04-10 20:05:16 +02:00