Commit Graph

2 Commits

Author SHA1 Message Date
Chip Kerchner cb154832f8 Vectorize SBGEMM incopy - 4x faster. 2024-07-09 13:10:03 -05:00
Rajalakshmi Srinivasaraghavan 0826d68f93 POWER10: Change the packing format for bfloat16
As the new MMA instructions need the inputs in 4x2 order for bfloat16,
changing the format in copy/packing code.  This avoids permute instructions
in the gemm kernel inner loop.
2020-10-13 16:05:10 -05:00