Martin Kroeker
|
fdd1b50263
|
Merge pull request #63 from xianyi/develop
rebase
|
2020-06-09 15:54:30 +02:00 |
Leonard Lausen
|
b98923f33a
|
Test enforce -O1 for flang
|
2020-06-09 06:54:47 +00:00 |
Leonard Lausen
|
4cb1db0e3b
|
Test flang build
|
2020-06-09 06:31:17 +00:00 |
Martin Kroeker
|
430e8b45fe
|
Merge pull request #2648 from martin-frbg/lapack411
Break out of potentially infinite rescaling loop in LAPACK xLARGV/xLARTG/xLARTGP
|
2020-06-07 19:45:52 +02:00 |
Martin Kroeker
|
88fe85f4e0
|
Merge pull request #2647 from martin-frbg/aocc-flang
Small fixes for flang in general and the AMD AOCC version of it in particular
|
2020-06-07 19:45:11 +02:00 |
Martin Kroeker
|
89091e6b64
|
Merge pull request #2645 from martin-frbg/misc_fixes
Miscellaneous fixes
|
2020-06-07 19:44:50 +02:00 |
Martin Kroeker
|
522aaf53bf
|
Break out of potentially infinite rescaling loop in LAPACK xLARGV/xLARTG/xLARTGP
Reference-LAPACK issue 411
|
2020-06-07 14:30:20 +02:00 |
Martin Kroeker
|
c3574ffe53
|
Merge pull request #2646 from wjc404/develop
Optimize AVX512 parallel DGEMM performance
|
2020-06-07 13:18:22 +02:00 |
Martin Kroeker
|
4e28dc6353
|
Use only -O1 with AMD AOCC version of flang
to prevent miscompilation of LAPACK codes and tests on Ryzen
|
2020-06-07 00:05:02 +02:00 |
Martin Kroeker
|
13c28889a2
|
Update "cosmetic fixes for non-C99 compilers"
|
2020-06-06 15:22:27 +02:00 |
wjc404
|
0e3ac4a06b
|
Add files via upload
|
2020-06-06 14:56:57 +08:00 |
Martin Kroeker
|
28915eed72
|
Cosmetic fixes for non-C99 compilers
|
2020-06-05 10:05:34 +02:00 |
Martin Kroeker
|
7f60fb6b91
|
Delete spurious copy of common_param.h
|
2020-06-05 10:04:16 +02:00 |
Martin Kroeker
|
0464e662ad
|
make blas_quickdivide unsigned and guard against miscompilation
|
2020-06-05 10:03:36 +02:00 |
Martin Kroeker
|
0f9a935a5a
|
Merge pull request #62 from xianyi/develop
rebase
|
2020-06-05 09:51:06 +02:00 |
Martin Kroeker
|
79cd69fea4
|
Merge pull request #2644 from martin-frbg/cmake-maxstack
Add CMAKE support for MAX_STACK_ALLOC setting
|
2020-06-05 08:33:48 +02:00 |
Martin Kroeker
|
bb12c2c854
|
Limit MAX_STACK_ALLOC availability to non-Windows
|
2020-06-04 19:07:27 +02:00 |
Martin Kroeker
|
32c1c1e125
|
Update azure-pipelines.yml
|
2020-06-04 19:03:46 +02:00 |
Martin Kroeker
|
f1953b8b81
|
Update azure-pipelines.yml
|
2020-06-04 17:58:13 +02:00 |
Martin Kroeker
|
6e97df7b47
|
Add CMAKE support for MAX_STACK_ALLOC setting
|
2020-06-04 14:45:31 +02:00 |
Martin Kroeker
|
729303e5ed
|
Merge pull request #2643 from craft-zhang/cortex-a53
Improve performance of SGEMM on Arm Cortex-A53
|
2020-06-04 07:58:45 +02:00 |
Martin Kroeker
|
547965530f
|
Merge pull request #2638 from leezu/actions
Add GitHub Actions test for DYNAMIC_ARCH builds on Linux and macOS
|
2020-06-04 00:02:37 +02:00 |
ZhangDanfeng
|
9b7877ccf1
|
sgemm copy source init
Signed-off-by: ZhangDanfeng <467688405@qq.com>
|
2020-06-04 02:10:45 +08:00 |
ZhangDanfeng
|
f82fa802d1
|
Insert prefetch
Signed-off-by: ZhangDanfeng <467688405@qq.com>
|
2020-06-04 02:08:48 +08:00 |
Martin Kroeker
|
3eda3d34c3
|
Merge pull request #2641 from martin-frbg/ppcg4
Work around PPC G4 test failures
|
2020-06-03 16:43:46 +02:00 |
Martin Kroeker
|
a8f42ae85c
|
set cmake build type to Release
|
2020-06-03 15:28:59 +02:00 |
Martin Kroeker
|
e6e2e531bc
|
revert clang pragma
|
2020-06-03 15:16:27 +02:00 |
Martin Kroeker
|
456dc04441
|
Update sgemm_kernel_16x4_skylakex_3.c
|
2020-06-03 15:15:41 +02:00 |
Martin Kroeker
|
89323458a9
|
preset optimization level for apple clang
|
2020-06-03 15:07:25 +02:00 |
Martin Kroeker
|
e153bdeb70
|
Update dynamic_arch.yml
|
2020-06-03 13:46:43 +02:00 |
Martin Kroeker
|
c2001f7756
|
Make cmake build verbose to see options in use
|
2020-06-03 12:18:15 +02:00 |
Martin Kroeker
|
c2b3f0b3f6
|
Revert "keep Apple Clang from optimizing this"
|
2020-06-03 10:22:15 +02:00 |
Martin Kroeker
|
f16e39554d
|
Change PPCG4 CGEMM_M to match kernel change
|
2020-06-03 09:15:29 +02:00 |
Martin Kroeker
|
b1ee81228a
|
Change complex DOT and ROT to generic kernels and switch CGEMM
in response to test failures seen in #2628 and BLAS-Tester
|
2020-06-03 09:13:29 +02:00 |
Martin Kroeker
|
9f7358d7dc
|
Keep Apple Clang from optimizing this
|
2020-06-03 08:52:53 +02:00 |
Martin Kroeker
|
54fa90fb25
|
Keep apple clang 11.0.3 from trying to optimize this (and running out of registers)
|
2020-06-02 17:31:45 +02:00 |
Leonard Lausen
|
5a709b8340
|
Print CPU info in output
|
2020-06-01 20:51:11 +00:00 |
Leonard Lausen
|
b31a68b835
|
Add GitHub Actions test for DYNAMIC_ARCH builds
|
2020-06-01 20:16:59 +00:00 |
Martin Kroeker
|
86552bf4c7
|
Update f_check
|
2020-05-31 15:22:12 +02:00 |
Martin Kroeker
|
a349d48d89
|
Merge pull request #2636 from martin-frbg/issue2634
Fix CMAKE build Issues on OS X
|
2020-05-31 15:16:09 +02:00 |
Martin Kroeker
|
4db00121dc
|
Disable EXPRECISION and add -lm on OSX (same as the BSDs and Linux)
|
2020-05-31 12:39:36 +02:00 |
Martin Kroeker
|
909897f13b
|
Document option USE_LOCKING
|
2020-05-31 12:37:57 +02:00 |
Martin Kroeker
|
e79245acd9
|
Merge pull request #2635 from ilayn/patch-1
BUG: Fix the loop range in ZHEEQUB.f
|
2020-05-30 14:37:12 +02:00 |
Ilhan Polat
|
76d2612e0c
|
BUG: Fix the loop range in ZHEEQUB.f
|
2020-05-30 14:11:11 +02:00 |
Martin Kroeker
|
ced49466f0
|
Use the fortran compiler to link LAPACK-related benchmarks
to fix linking problems with (at least) the AMD version of flang, which creates dependencies on more than just the Fortran runtime.
|
2020-05-29 13:35:51 +02:00 |
Martin Kroeker
|
6e270f91ec
|
add support for RETURN_BY_STACK semantics, e.g. clang
|
2020-05-29 13:29:10 +02:00 |
Martin Kroeker
|
200296b0f4
|
remove libomp from link list only for pgfortran
at least the AMD (aocc) flavor of flang wants to link to a (real or dummy) libomp by default
|
2020-05-29 13:23:51 +02:00 |
Martin Kroeker
|
dd7a650792
|
Merge pull request #59 from xianyi/develop
rebase
|
2020-05-29 13:06:25 +02:00 |
Martin Kroeker
|
4a4c50a7ce
|
Merge pull request #2627 from pkubaj/patch-1
Add powerpc (32-bit)
|
2020-05-26 08:36:24 +02:00 |
Martin Kroeker
|
d069780e63
|
Merge pull request #2626 from docularxu/working-gcc-version-detections
make GCC version detection OS-independent
|
2020-05-26 08:35:58 +02:00 |