Martin Kroeker
|
9d8be15789
|
Fix inline assembly constraints
rework indices to allow marking argument lda4 as input and output. For #2009
|
2019-02-16 18:24:11 +01:00 |
Martin Kroeker
|
497f0c3d8a
|
Replace .align with .p2align in the Nehalem microkernels
|
2018-02-26 20:58:33 +01:00 |
Martin Kroeker
|
b973990df2
|
Tag %1 and %2 as both input and output operands
fix from #1292 extended to the other gemv microkernels
|
2017-12-31 18:03:36 +01:00 |
Werner Saar
|
bc5fff7085
|
changed inline assembler labels to short form
|
2014-12-07 12:38:54 +01:00 |
wernsaar
|
7b3932b3f3
|
optimized sgemv_n kernel for nehalem
|
2014-09-07 19:20:08 +02:00 |
wernsaar
|
3a5d8dbff9
|
optimized sgemv_n_4.c
|
2014-09-03 15:34:30 +02:00 |
wernsaar
|
2a60c6d4b0
|
optimized sgemv_n for small sizes
|
2014-09-03 14:48:45 +02:00 |