* fix multiple numerical stability and corner case issues
* add a script to generate arbitrary gemm kernel shapes
* add a generic zvl256b target to demonstrate large gemm kernel unrolls
Change-Id: Iae7800a32f5af3903c330882cdf6f292d885f266