e6dd44d989
Power10: Fix for SBGEMM

While testing the bfloat16 sbgemm kernel, some failures occurred for odd-valued inputs because the result was updated for additional bytes.

2021-06-15 13:07:47 -05:00
				
					
						
							
							
								 
						
							
cbb70438df
POWER10: Fixes for sbgemm kernel

While testing the bfloat16 sbgemm kernel, some failures occurred for odd-valued inputs because of array accesses beyond the boundary.

2021-06-09 12:20:09 -05:00
				
					
						
							
							
								 
						
							
0826d68f93
POWER10: Change the packing format for bfloat16

The new MMA instructions need their bfloat16 inputs in 4x2 order, so change the format in the copy/packing code. This avoids permute instructions in the gemm kernel inner loop.

2020-10-13 16:05:10 -05:00
				
					
						
							
							
								 
						
							
9ae80490e0
rename "HALF" and "sh" to "BFLOAT16" and "sb"

2020-10-11 23:39:42 +02:00
				
					
						
							
							
								 
						
							
d314d1f49f
Rename shgemm_kernel_power10.c to sbgemm_kernel_power10.c

2020-10-11 23:37:38 +02:00