Commit Graph

5 Commits

Author SHA1 Message Date
Gordon Fossum e6dd44d989 Power10: Fix for SBGEMM
While testing bfloat16 sbgemm kernel, there are some failures for odd value inputs due to updating result for
additional bytes.
2021-06-15 13:07:47 -05:00
Rajalakshmi Srinivasaraghavan cbb70438df POWER10: Fixes for sbgemm kernel
While testing bfloat16 sbgemm kernel, there are some failures
for odd value inputs due to array access beyond the boundary.
2021-06-09 12:20:09 -05:00
Rajalakshmi Srinivasaraghavan 0826d68f93 POWER10: Change the packing format for bfloat16
As the new MMA instructions need the inputs in 4x2 order for bfloat16,
changing the format in copy/packing code.  This avoids permute instructions
in the gemm kernel inner loop.
2020-10-13 16:05:10 -05:00
Martin Kroeker 9ae80490e0
rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:39:42 +02:00
Martin Kroeker d314d1f49f
Rename shgemm_kernel_power10.c to sbgemm_kernel_power10.c 2020-10-11 23:37:38 +02:00