Remove Unnecessary/Erroneous Reads In sgemm_tcopy_16.S COPY1x8 Macro

The COPY1x8 macro appears to contain code left over from copying the
COPY2x8 macro above it: directly after loading 4 bytes into each of
s4-s7, it loads 8 bytes into each of d4-d7. These 32 bytes are unused
(only s4-s7 are stored afterwards), and the loads can overrun the
boundary of allocated memory. Valgrind detected this for a 128,1 copy,
which is what drew my attention to it.
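
To illustrate the over-read, here is a minimal C sketch of the bounds arithmetic. It is not code from sgemm_tcopy_16.S: the helper names, the packed row layout, and the `lda` parameter are assumptions made for illustration, and it assumes a 4-byte float. It models a 4-byte load (like `ldr s4`) versus an 8-byte load (like `ldr d4`) at the last element of the last row.

```c
#include <stddef.h>

/* Hypothetical helper: how many bytes a load of `width` bytes at byte
 * offset `off` reads past the end of a buffer of `size` bytes. */
static size_t overrun_bytes(size_t size, size_t off, size_t width)
{
    size_t end = off + width;
    return end > size ? end - size : 0;
}

/* Hypothetical model of the failing case: `rows` rows of `cols` floats
 * with leading dimension `lda` (in floats), loading `width` bytes at the
 * final element of the final row. The buffer is assumed to end right
 * after that element. */
static size_t last_row_overrun(size_t rows, size_t cols, size_t lda,
                               size_t width)
{
    size_t size = ((rows - 1) * lda + cols) * sizeof(float);
    size_t off  = ((rows - 1) * lda + (cols - 1)) * sizeof(float);
    return overrun_bytes(size, off, width);
}
```

Under these assumptions, `last_row_overrun(128, 1, 1, 4)` is 0 while `last_row_overrun(128, 1, 1, 8)` is 4: the 8-byte load reads 4 bytes past the end of the buffer, which is the kind of access Valgrind reports.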

Additionally, there is no need to update the addresses stored in
A01-A08: the only possible paths after running this macro either
overwrite A01-A08 when looping to the next 8 rows, or overwrite A01-A04
when moving on to 4 rows, in which case A05-A08 are unused.
CodesWithWolves 2021-03-31 15:38:07 -04:00
parent 903fd85c85
commit d2bda3b56a
1 changed file with 0 additions and 10 deletions

@@ -271,11 +271,6 @@ All rights reserved.
 ldr s2, [A03]
 ldr s3, [A04]
-add A01, A01, #4
-add A02, A02, #4
-add A03, A03, #4
-add A04, A04, #4
 stp s0, s1, [B04]
 add B04, B04, #8
 stp s2, s3, [B04]
@@ -286,11 +281,6 @@ All rights reserved.
 ldr s6, [A07]
 ldr s7, [A08]
-ldr d4, [A05], #8
-ldr d5, [A06], #8
-ldr d6, [A07], #8
-ldr d7, [A08], #8
 stp s4, s5, [B04]
 add B04, B04, #8
 stp s6, s7, [B04]