1. Use UZP instructions instead of gather-load and scatter-store instructions to get lower latency.
2. Pad k to a power of 4.
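
The two points above can be illustrated with a small portable C sketch. It is not the kernel code itself. `next_pow4` and `uzp_deinterleave` are hypothetical helper names introduced here. The sketch reads "power of 4" literally, and it models the even/odd element selection that UZP1 and UZP2 perform using a scalar loop rather than real vector instructions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest power of 4 that is >= k.
 * Returns 1 for k == 0 or k == 1. Assumes k is small enough
 * that the result does not overflow size_t. */
static size_t next_pow4(size_t k) {
    size_t p = 1;
    while (p < k) {
        p *= 4;
    }
    return p;
}

/* Hypothetical helper: scalar model of what UZP1/UZP2 select.
 * UZP1 keeps the even-indexed elements of the input, UZP2 keeps
 * the odd-indexed ones. Both outputs are filled from contiguous
 * reads, with no per-element address computation as a gather
 * load would need. n must be even. */
static void uzp_deinterleave(const float *in, size_t n,
                             float *even, float *odd) {
    assert(n % 2 == 0);
    for (size_t i = 0; i < n / 2; ++i) {
        even[i] = in[2 * i];     /* UZP1-style selection */
        odd[i]  = in[2 * i + 1]; /* UZP2-style selection */
    }
}
```

On real hardware the loop body would be replaced by contiguous vector loads followed by UZP1/UZP2 on the loaded registers. Padding k as in `next_pow4` (for example, 5 becomes 16) would let such a loop run without a scalar tail.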