Martin Kroeker
63b03efc2a
Merge pull request #2667 from xianyi/develop
...
Merge develop into 0.3.0 for 0.3.10 release
2020-06-14 22:03:04 +02:00
Martin Kroeker
95dbeff66d
Merge branch 'release-0.3.0' into develop
2020-06-14 22:02:45 +02:00
Martin Kroeker
3b673a24b7
Increment version to 0.3.10.dev
2020-06-14 21:57:52 +02:00
Martin Kroeker
1eb1979050
Increment version to 0.3.10.dev
2020-06-14 21:57:15 +02:00
Martin Kroeker
efc53b6e7e
Merge pull request #2665 from martin-frbg/flang-fixes-2a
...
Fix spelling of flang option -Mrecursive, add -Kieee and workaround for AOCC optimizer bug
2020-06-14 21:56:08 +02:00
Martin Kroeker
72888497e2
Update with 0.3.10 changes
2020-06-14 21:55:31 +02:00
Martin Kroeker
7e3e006af6
Merge pull request #2666 from martin-frbg/blastest
...
Update BLAS tests to what netlib 3.9.0 uses
2020-06-14 18:28:37 +02:00
Martin Kroeker
d906d14402
Merge pull request #2664 from ACSimon33/exported_symbols
...
Add missing exported symbols.
2020-06-14 18:27:03 +02:00
Martin Kroeker
3785c0e82b
Merge pull request #2663 from martin-frbg/issue2654
...
Respect predefined defaults for AR, AS, LD and RANLIB
2020-06-14 18:26:43 +02:00
Martin Kroeker
f2d8879af6
Merge pull request #2661 from martin-frbg/issue2660
...
Report selected DYNAMIC_ARCH kernel rather than one of its aliases in gotoblas_corename
2020-06-14 18:25:37 +02:00
Martin Kroeker
6876221cf3
Remove optimization level limit for flang again and add -fno-unroll-loops for AOCC flang 2.x instead
2020-06-14 17:40:24 +02:00
Martin Kroeker
79cdcde717
Re-enable higher optimization levels for flang while disabling loop unrolling for AOCC flang
2020-06-14 17:18:16 +02:00
Martin Kroeker
18a11137f1
Update BLAS tests to correspond to Reference-LAPACK 3.9.0
...
Replaces the calculation of machine precision with a call to the epsilon intrinsic, and no longer requires previous output files to be removed before rerunning the tests
2020-06-14 10:26:25 +02:00
Martin Kroeker
1dd712131e
Fix spelling of flang option -Mrecursive and add -Kieee
2020-06-14 00:09:31 +02:00
Martin Kroeker
0ed2adf0b2
Fix spelling of flang option -Mrecursive and add -Kieee
2020-06-14 00:01:20 +02:00
Martin Kroeker
abf670757b
Respect predefined defaults for AR, AS, LD and RANLIB
2020-06-13 23:21:13 +02:00
Simon Märtens
41fc6f3cd2
Added missing exported symbols.
2020-06-13 22:37:39 +02:00
Martin Kroeker
007d9f97d7
Make gotoblas_corename report the name of the selected TARGET rather than its aliases
2020-06-13 19:25:28 +02:00
Martin Kroeker
63d26090f5
Merge pull request #64 from xianyi/develop
...
rebase
2020-06-13 19:14:47 +02:00
Martin Kroeker
3a1b58d54a
Merge pull request #2653 from craft-zhang/cortex-a53
...
fix INIT8x4 of SGEMM on Arm Cortex-A53
2020-06-10 12:19:33 +02:00
Martin Kroeker
f7659be4a0
Merge pull request #2652 from martin-frbg/flang-fixes
...
Fixes for compilation with flang binary release 20190329
2020-06-09 20:31:06 +02:00
ZhangDanfeng
bc6fd20a40
fix INIT8x4
...
Signed-off-by: ZhangDanfeng <467688405@qq.com>
2020-06-10 01:01:16 +08:00
Martin Kroeker
3ce469a34f
Limit optimization level to O1 for flang and add -frecursive
2020-06-09 16:11:13 +02:00
Martin Kroeker
ba2c5b404d
When building with flang, use it also for the final link step to get dependencies right
2020-06-09 16:09:34 +02:00
Martin Kroeker
f07a80354b
Apply previously AOCC-specific workaround to all versions of flang
2020-06-09 16:07:03 +02:00
Martin Kroeker
fdd1b50263
Merge pull request #63 from xianyi/develop
...
rebase
2020-06-09 15:54:30 +02:00
Martin Kroeker
430e8b45fe
Merge pull request #2648 from martin-frbg/lapack411
...
Break out of potentially infinite rescaling loop in LAPACK xLARGV/xLARTG/xLARTGP
2020-06-07 19:45:52 +02:00
Martin Kroeker
88fe85f4e0
Merge pull request #2647 from martin-frbg/aocc-flang
...
Small fixes for flang in general and the AMD AOCC version of it in particular
2020-06-07 19:45:11 +02:00
Martin Kroeker
89091e6b64
Merge pull request #2645 from martin-frbg/misc_fixes
...
Miscellaneous fixes
2020-06-07 19:44:50 +02:00
Martin Kroeker
522aaf53bf
Break out of potentially infinite rescaling loop in LAPACK xLARGV/xLARTG/xLARTGP
...
Reference-LAPACK issue 411
2020-06-07 14:30:20 +02:00
Martin Kroeker
c3574ffe53
Merge pull request #2646 from wjc404/develop
...
Optimize AVX512 parallel DGEMM performance
2020-06-07 13:18:22 +02:00
Martin Kroeker
4e28dc6353
Use only -O1 with AMD AOCC version of flang
...
to prevent miscompilation of LAPACK codes and tests on Ryzen
2020-06-07 00:05:02 +02:00
Martin Kroeker
13c28889a2
Update "cosmetic fixes for non-C99 compilers"
2020-06-06 15:22:27 +02:00
wjc404
0e3ac4a06b
Add files via upload
2020-06-06 14:56:57 +08:00
Martin Kroeker
28915eed72
Cosmetic fixes for non-C99 compilers
2020-06-05 10:05:34 +02:00
Martin Kroeker
7f60fb6b91
Delete spurious copy of common_param.h
2020-06-05 10:04:16 +02:00
Martin Kroeker
0464e662ad
make blas_quickdivide unsigned and guard against miscompilation
2020-06-05 10:03:36 +02:00
Martin Kroeker
0f9a935a5a
Merge pull request #62 from xianyi/develop
...
rebase
2020-06-05 09:51:06 +02:00
Martin Kroeker
79cd69fea4
Merge pull request #2644 from martin-frbg/cmake-maxstack
...
Add CMAKE support for MAX_STACK_ALLOC setting
2020-06-05 08:33:48 +02:00
Martin Kroeker
bb12c2c854
Limit MAX_STACK_ALLOC availability to non-Windows
2020-06-04 19:07:27 +02:00
Martin Kroeker
32c1c1e125
Update azure-pipelines.yml
2020-06-04 19:03:46 +02:00
Martin Kroeker
f1953b8b81
Update azure-pipelines.yml
2020-06-04 17:58:13 +02:00
Martin Kroeker
6e97df7b47
Add CMAKE support for MAX_STACK_ALLOC setting
2020-06-04 14:45:31 +02:00
Martin Kroeker
729303e5ed
Merge pull request #2643 from craft-zhang/cortex-a53
...
Improve performance of SGEMM on Arm Cortex-A53
2020-06-04 07:58:45 +02:00
Martin Kroeker
547965530f
Merge pull request #2638 from leezu/actions
...
Add Github Actions test for DYNAMIC_ARCH builds on Linux and macOS
2020-06-04 00:02:37 +02:00
ZhangDanfeng
9b7877ccf1
sgemm copy source init
...
Signed-off-by: ZhangDanfeng <467688405@qq.com>
2020-06-04 02:10:45 +08:00
ZhangDanfeng
f82fa802d1
Insert prefetch
...
Signed-off-by: ZhangDanfeng <467688405@qq.com>
2020-06-04 02:08:48 +08:00
Martin Kroeker
3eda3d34c3
Merge pull request #2641 from martin-frbg/ppcg4
...
Work around PPC G4 test failures
2020-06-03 16:43:46 +02:00
Martin Kroeker
a8f42ae85c
set cmake build type to Release
2020-06-03 15:28:59 +02:00
Martin Kroeker
e6e2e531bc
revert clang pragma
2020-06-03 15:16:27 +02:00