OpenBLAS/Makefile.zarch at 8a2a137a9e4e4ec657c5befe361061607489aaa2

floraachy/OpenBLAS

Marius Hillenbrand 095f4e6964 s390x: allow clang to emit fused multiply-adds (replicates gcc's default behavior)

gcc's default setting for floating-point expression contraction is
"fast", which allows the compiler to emit fused multiply-adds instead of
separate multiplies and adds (among other contractions). Fused
multiply-adds, which the assembly kernels typically use, also bring a
significant performance advantage to the C implementation of
matrix-matrix multiplication on s390x. To enable that performance
advantage for builds with clang, add -ffp-contract=fast to the compiler
options.

Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>

2020-09-02 13:49:31 +02:00
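
To illustrate the kind of expression that contraction affects, here is a minimal sketch of a naive matrix-matrix multiplication in C. The function name `gemm_naive`, its signature, and the row-major layout are assumptions made for this example; this is not OpenBLAS's actual C kernel. The multiply that feeds directly into an add in the inner loop is the pattern a compiler may turn into a fused multiply-add when contraction is set to "fast".

```c
#include <stddef.h>

/* Hypothetical reference kernel (not OpenBLAS's actual code):
 * computes C += A * B for row-major matrices, where
 * A is m x k, B is k x n, and C is m x n. */
void gemm_naive(size_t m, size_t n, size_t k,
                const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            double acc = c[i * n + j];
            for (size_t p = 0; p < k; p++) {
                /* A multiply feeding an add: with -ffp-contract=fast the
                 * compiler may emit a single fused multiply-add here;
                 * with contraction off it emits a multiply, then an add. */
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
}
```

One way to check the effect is to compile such a file to assembly twice, once with -ffp-contract=fast and once with -ffp-contract=off, and compare the inner loop. Because a fused multiply-add rounds once instead of twice, results can differ in the last bits between the two builds for inputs that are not exactly representable.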

17 lines

363 B

Makefile

ifeq ($(CORE), Z13)
CCOMMON_OPT += -march=z13 -mzvector
FCOMMON_OPT += -march=z13 -mzvector
endif
ifeq ($(CORE), Z14)
CCOMMON_OPT += -march=z14 -mzvector -O3
FCOMMON_OPT += -march=z14 -mzvector
endif
# Enable floating-point expression contraction for clang, since it is the
# default for gcc
ifeq ($(C_COMPILER), CLANG)
CCOMMON_OPT += -ffp-contract=fast
endif