OpenBLAS/benchmark
Bart Oldeman bae45d94d1 scal benchmark: eliminate y, move init/timing out of loop
Removing y avoids cache effects: if y is as large as the L1 cache, it
evicts the main array x from L1.
Moving init and timing out of the loop makes the scal benchmark behave like
the gemm benchmark, and allows higher accuracy for smaller test cases since
the loop overhead is much smaller than the timing overhead.
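
The restructuring described above can be sketched as follows. This is a minimal illustration, not the actual scal.c code: scal_ref, time_scal and mflops are hypothetical names, the kernel is a plain C stand-in for the BLAS routine, and counting one flop per element is an assumption.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the scal kernel: x[i] = alpha * x[i]. */
static void scal_ref(size_t n, double alpha, double *x) {
    for (size_t i = 0; i < n; i++)
        x[i] *= alpha;
}

/*
 * Initialise x once, then read the timer only twice around the whole
 * repetition loop. Returns the average time per call in seconds.
 * clock() measures CPU time, which is adequate for a single-threaded sketch.
 */
static double time_scal(size_t n, int loops, double alpha, double *x) {
    for (size_t i = 0; i < n; i++)
        x[i] = 1.0;                      /* init outside the timed loop */

    clock_t t0 = clock();
    for (int l = 0; l < loops; l++)
        scal_ref(n, alpha, x);           /* only the kernel is repeated */
    clock_t t1 = clock();

    return (double)(t1 - t0) / CLOCKS_PER_SEC / loops;
}

/* Assumed flop model: one multiply per element. */
static double mflops(size_t n, double sec_per_call) {
    return (double)n / sec_per_call * 1e-6;
}
```

Because the timer is read once per run rather than once per iteration, its overhead and granularity are amortised over all repetitions, which matters most for small n where a single call takes well under a microsecond.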

Example:
OPENBLAS_LOOPS=10000 ./dscal.goto 1024 8192 1024
on an AMD Zen2 (EPYC 7532) with 32 KiB (4096 doubles) of L1 data cache per core.

Before
From : 1024  To : 8192 Step = 1024 Inc_x = 1 Inc_y = 1 Loops = 10000
   SIZE       Flops
   1024 :     5627.08 MFlops   0.000000 sec
   2048 :     5907.34 MFlops   0.000000 sec
   3072 :     5553.30 MFlops   0.000001 sec
   4096 :     5446.38 MFlops   0.000001 sec
   5120 :     5504.61 MFlops   0.000001 sec
   6144 :     5501.80 MFlops   0.000001 sec
   7168 :     5547.43 MFlops   0.000001 sec
   8192 :     5548.46 MFlops   0.000001 sec

After
From : 1024  To : 8192 Step = 1024 Inc_x = 1 Inc_y = 1 Loops = 10000
   SIZE       Flops
   1024 :     6310.28 MFlops   0.000000 sec
   2048 :     6396.29 MFlops   0.000000 sec
   3072 :     6439.14 MFlops   0.000000 sec
   4096 :     6327.14 MFlops   0.000001 sec
   5120 :     5628.24 MFlops   0.000001 sec
   6144 :     5616.41 MFlops   0.000001 sec
   7168 :     5553.13 MFlops   0.000001 sec
   8192 :     5600.88 MFlops   0.000001 sec

The L1->L2 switchover now falls where it should (just above 4096 doubles),
and the MFlops reported for sizes that fit in L1 are more accurate.
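
The expected switchover point follows from simple footprint arithmetic, sketched below. The helper names footprint_bytes and fits_in_cache are hypothetical, and the model is deliberately crude: it compares total bytes against the cache capacity and ignores associativity, line granularity and other data sharing the cache.

```c
#include <stddef.h>

/* Bytes occupied by n_arrays arrays of n doubles each. */
static size_t footprint_bytes(size_t n, size_t n_arrays) {
    return n * n_arrays * sizeof(double);
}

/* Returns 1 if the working set fits in a cache of cache_bytes, else 0. */
static int fits_in_cache(size_t n, size_t n_arrays, size_t cache_bytes) {
    return footprint_bytes(n, n_arrays) <= cache_bytes;
}
```

With a 32 KiB cache and a single array, fits_in_cache(4096, 1, 32768) is 1 and fits_in_cache(5120, 1, 32768) is 0, which matches the drop between the 4096 and 5120 rows in the After output. Under the same model, two equally sized arrays would only fit together up to 2048 elements.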
2022-11-29 08:02:45 -05:00
scripts disable NaN checks before BLAS calls dgemm.R 2019-01-16 11:54:22 +02:00
Make_exe.sh bugfixes, to build benchmarks with mingw on Windows OS 2015-05-29 12:56:22 +02:00
Makefile change line endings from CRLF to LF 2022-11-17 10:18:36 +01:00
amax.c change line endings from CRLF to LF 2022-11-17 10:18:36 +01:00
amin.c change line endings from CRLF to LF 2022-11-17 10:18:36 +01:00
asum.c Refactor the performance measurement system 2020-10-23 10:32:03 +08:00
axpby.c Refactor the performance measurement system 2020-10-23 10:32:03 +08:00
axpy.c Refactor the performance measurement system 2020-10-23 10:32:03 +08:00
bench.h Fix comment. 2022-10-20 20:11:09 -04:00
cholesky.c Refactor the performance measurement system 2020-10-23 10:32:03 +08:00
copy.c Refactor the performance measurement system 2020-10-23 10:32:03 +08:00
cula_wrapper.c Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
dot.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
geev.c Refactor the performance measurement system 2020-10-23 10:32:03 +08:00
gemm.c Refactor the performance measurement system 2020-10-23 10:32:03 +08:00
gemm3m.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
gemv.c Fix flipped indices in benchmark for gemv 2021-11-03 12:45:09 +01:00
ger.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
gesv.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
getri.c Handle OPENBLAS_LOOPS and OPENBLAS_TEST options 2021-07-01 17:38:45 +02:00
hbmv.c change line endings from CRLF to LF 2022-11-17 10:18:36 +01:00
hemm.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
hemv.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
her.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
her2.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
her2k.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
herk.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
hpmv.c change line endings from CRLF to LF 2022-11-17 10:18:36 +01:00
iamax.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
iamin.c change line endings from CRLF to LF 2022-11-17 10:18:36 +01:00
imax.c change line endings from CRLF to LF 2022-11-17 10:18:36 +01:00
imin.c change line endings from CRLF to LF 2022-11-17 10:18:36 +01:00
linpack.c Handle OPENBLAS_LOOPS for more stable results 2021-07-01 17:39:37 +02:00
max.c change line endings from CRLF to LF 2022-11-17 10:18:36 +01:00
min.c change line endings from CRLF to LF 2022-11-17 10:18:36 +01:00
nrm2.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
plot-filter.sh added blas level1 copy benchmark 2015-05-20 11:05:00 +02:00
plot-header added a sample plot-filter scripts and a header file for gnuplot 2014-07-21 14:50:24 +02:00
potrf.c Add OPENBLAS_LOOPS support to potrf/potrs/potri benchmark 2021-06-26 23:46:00 +02:00
rot.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
rotm.c change line endings from CRLF to LF 2022-11-17 10:18:36 +01:00
scal.c scal benchmark: eliminate y, move init/timing out of loop 2022-11-29 08:02:45 -05:00
smallscaling.c added bugfixes for some make files and smallscaling.c 2016-04-21 12:54:32 +02:00
spmv.c change line endings from CRLF to LF 2022-11-17 10:18:36 +01:00
spr.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
spr2.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
swap.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
symm.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
symv.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
syr.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
syr2.c Handle OPENBLAS_LOOPS in SYR2 benchmark 2021-07-10 21:27:53 +02:00
syr2k.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
syrk.c Handle OPENBLAS_LOOP 2021-07-04 16:59:43 +02:00
tplot-header added plot-header to compare multithreading 2014-09-02 14:11:42 +02:00
tpmv.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
tpsv.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
trmm.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
trmv.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
trsm.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
trsv.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
zdot-intel.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
zdot.c Refractoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00