Commit Graph

9 Commits

Author SHA1 Message Date
Martin Kroeker 4255a58cd2
Rename operands to put lda on the input/output constraint list
2019-02-15 10:10:04 +01:00

Martin Kroeker 46e415b140
Save and restore input argument 8 (lda4)
Fixes miscompilation with gcc9 -ftree-vectorize (related to issue #2009)
2019-02-14 22:43:18 +01:00

Martin Kroeker 723f396a20
Tag %1 and %2 as both input and output
The inline assembly modifies its input operands, so mark them as output to avoid surprises with optimization. Fixes #1292
2017-12-29 23:56:41 +01:00

Zhang Xianyi 6e7be06e07
Refs JuliaLang/julia#5728. Fix gemv performance bug on Haswell Mac OSX.
On Mac OS X, it should use .align 4 (equal to .align 16 on Linux).
I didn't get the performance benefit from .align. Thus, I deleted it.
2016-02-19 17:56:07 -05:00

Werner Saar bc5fff7085
changed inline assembler labels to short form
2014-12-07 12:38:54 +01:00

wernsaar 7c0a94ff47
bugfix in sgemv_n_microk_haswell-4.c
2014-09-08 10:54:33 +02:00

wernsaar cbbc80aad3
added optimized sgemv_t kernel for haswell
2014-09-08 10:13:39 +02:00

wernsaar cf5544b417
optimization for small size
2014-09-06 13:17:56 +02:00

wernsaar d143f84dd2
added optimized sgemv_n kernel for haswell
2014-09-06 12:08:48 +02:00
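
Commits 723f396a20 and 4255a58cd2 both deal with inline assembly that modifies operands the compiler had been told were inputs only. The sketch below is not taken from the kernel in this history. It is a minimal illustration, assuming GCC or Clang extended asm on x86-64, of declaring such operands with the `+r` (read-write) constraint. The function name `sum_i64` and its loop are hypothetical. The loop also uses the short numeric label form (`1:` / `1b`) that commit bc5fff7085 refers to. On other targets the sketch falls back to plain C.

```c
#include <stdint.h>

/* Hypothetical helper (not from OpenBLAS): sums n 64-bit integers.
 * The asm loop advances the pointer and decrements the counter, so both
 * are declared "+r" (read-write) rather than as input-only operands. */
static int64_t sum_i64(const int64_t *p, int64_t n)
{
    int64_t acc = 0;

    if (n <= 0)
        return 0;

#if defined(__x86_64__) && defined(__GNUC__)
    __asm__ __volatile__ (
        "1:                 \n\t"   /* short numeric local label */
        "addq (%1), %0      \n\t"   /* acc += *p                 */
        "addq $8, %1        \n\t"   /* p++ (modifies operand 1)  */
        "subq $1, %2        \n\t"   /* n-- (modifies operand 2)  */
        "jnz 1b             \n\t"   /* loop while n != 0         */
        : "+r" (acc), "+r" (p), "+r" (n)   /* all three are read and written */
        :
        : "cc", "memory"                   /* flags changed, memory read     */
    );
#else
    /* Portable fallback for non-x86-64 targets or other compilers. */
    while (n-- > 0)
        acc += *p++;
#endif

    return acc;
}
```

If `p` and `n` were listed only as inputs, the compiler would be entitled to assume their registers still hold the original values after the asm statement, which is the kind of optimization surprise the commit messages describe.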