4668 Commits

Author SHA1 Message Date
Martin Kroeker
1c53e1366d Increment version to 0.3.10.dev 2020-06-14 22:04:37 +02:00
Martin Kroeker
95dbeff66d Merge branch 'release-0.3.0' into develop 2020-06-14 22:02:45 +02:00
Martin Kroeker
3b673a24b7 Increment version to 0.3.10.dev 2020-06-14 21:57:52 +02:00
Martin Kroeker
1eb1979050 Increment version to 0.3.10.dev 2020-06-14 21:57:15 +02:00
Martin Kroeker
efc53b6e7e Merge pull request #2665 from martin-frbg/flang-fixes-2a
Fix spelling of flang option -Mrecursive, add -Kieee and workaround for AOCC optimizer bug
2020-06-14 21:56:08 +02:00
Martin Kroeker
72888497e2 Update with 0.3.10 changes 2020-06-14 21:55:31 +02:00
Martin Kroeker
7e3e006af6 Merge pull request #2666 from martin-frbg/blastest
Update BLAS tests to what netlib 3.9.0 uses
2020-06-14 18:28:37 +02:00
Martin Kroeker
d906d14402 Merge pull request #2664 from ACSimon33/exported_symbols
Add missing exported symbols.
2020-06-14 18:27:03 +02:00
Martin Kroeker
3785c0e82b Merge pull request #2663 from martin-frbg/issue2654
Respect predefined defaults for AR, AS, LD and RANLIB
2020-06-14 18:26:43 +02:00
Martin Kroeker
f2d8879af6 Merge pull request #2661 from martin-frbg/issue2660
Report selected DYNAMIC_ARCH kernel rather than one of its aliases in gotoblas_corename
2020-06-14 18:25:37 +02:00
Martin Kroeker
6876221cf3 Remove optimization level limit for flang again and add -fno-unroll-loops for AOCC flang 2.x instead 2020-06-14 17:40:24 +02:00
Martin Kroeker
79cdcde717 Re-enable higher optimization levels for flang while disabling loop unrolling for AOCC flang 2020-06-14 17:18:16 +02:00
Martin Kroeker
18a11137f1 Update BLAS tests to correspond to Reference-LAPACK 3.9.0
replaces calculation of machine precision with call to epsilon intrinsic and removes the requirement for previous output files to be removed before rerunning tests
2020-06-14 10:26:25 +02:00
Martin Kroeker
1dd712131e Fix spelling of flang option -Mrecursive and add -Kieee 2020-06-14 00:09:31 +02:00
Martin Kroeker
0ed2adf0b2 Fix spelling of flang option -Mrecursive and add -Kieee 2020-06-14 00:01:20 +02:00
Martin Kroeker
abf670757b Respect predefined defaults for AR, AS, LD and RANLIB 2020-06-13 23:21:13 +02:00
Simon Märtens
41fc6f3cd2 Added missing exported symbols. 2020-06-13 22:37:39 +02:00
Martin Kroeker
007d9f97d7 Make gotoblas_corename report the name of the selected TARGET rather than its aliases 2020-06-13 19:25:28 +02:00
Martin Kroeker
63d26090f5 Merge pull request #64 from xianyi/develop
rebase
2020-06-13 19:14:47 +02:00
Rajalakshmi Srinivasaraghavan
9fe930f205 powerpc: Add support for future processor
This is the initial patch to support build infrastructure
for POWER10 architecture.
2020-06-11 15:47:20 -05:00
Martin Kroeker
3a1b58d54a Merge pull request #2653 from craft-zhang/cortex-a53
fix INIT8x4 of SGEMM on Arm Cortex-A53
2020-06-10 12:19:33 +02:00
Martin Kroeker
f7659be4a0 Merge pull request #2652 from martin-frbg/flang-fixes
Fixes for compilation with flang binary release 20190329
2020-06-09 20:31:06 +02:00
ZhangDanfeng
bc6fd20a40 fix INIT8x4
Signed-off-by: ZhangDanfeng <467688405@qq.com>
2020-06-10 01:01:16 +08:00
Martin Kroeker
3ce469a34f Limit optimization level to O1 for flang and add -frecursive 2020-06-09 16:11:13 +02:00
Martin Kroeker
ba2c5b404d When building with flang, use it also for the final link step to get dependencies right 2020-06-09 16:09:34 +02:00
Martin Kroeker
f07a80354b Apply previously AOCC-specific workaround to all versions of flang 2020-06-09 16:07:03 +02:00
Martin Kroeker
fdd1b50263 Merge pull request #63 from xianyi/develop
rebase
2020-06-09 15:54:30 +02:00
Martin Kroeker
430e8b45fe Merge pull request #2648 from martin-frbg/lapack411
Break out of potentially infinite rescaling loop in LAPACK xLARGV/xLARTG/xLARTGP
2020-06-07 19:45:52 +02:00
Martin Kroeker
88fe85f4e0 Merge pull request #2647 from martin-frbg/aocc-flang
Small fixes for flang in general and the AMD AOCC version of it in particular
2020-06-07 19:45:11 +02:00
Martin Kroeker
89091e6b64 Merge pull request #2645 from martin-frbg/misc_fixes
Miscellaneous fixes
2020-06-07 19:44:50 +02:00
Martin Kroeker
522aaf53bf Break out of potentially infinite rescaling loop in LAPACK xLARGV/xLARTG/xLARTGP
Reference-LAPACK issue 411
2020-06-07 14:30:20 +02:00
Martin Kroeker
c3574ffe53 Merge pull request #2646 from wjc404/develop
Optimize AVX512 parallel DGEMM performance
2020-06-07 13:18:22 +02:00
Martin Kroeker
4e28dc6353 Use only -O1 with AMD AOCC version of flang
to prevent miscompilation of LAPACK codes and tests on Ryzen
2020-06-07 00:05:02 +02:00
Martin Kroeker
13c28889a2 Update "cosmetic fixes for non-C99 compilers" 2020-06-06 15:22:27 +02:00
wjc404
0e3ac4a06b Add files via upload 2020-06-06 14:56:57 +08:00
Martin Kroeker
28915eed72 Cosmetic fixes for non-C99 compilers 2020-06-05 10:05:34 +02:00
Martin Kroeker
7f60fb6b91 Delete spurious copy of common_param.h 2020-06-05 10:04:16 +02:00
Martin Kroeker
0464e662ad make blas_quickdivide unsigned and guard against miscompilation 2020-06-05 10:03:36 +02:00
Martin Kroeker
0f9a935a5a Merge pull request #62 from xianyi/develop
rebase
2020-06-05 09:51:06 +02:00
Martin Kroeker
79cd69fea4 Merge pull request #2644 from martin-frbg/cmake-maxstack
Add CMAKE support for MAX_STACK_ALLOC setting
2020-06-05 08:33:48 +02:00
Martin Kroeker
bb12c2c854 Limit MAX_STACK_ALLOC availability to non-Windows 2020-06-04 19:07:27 +02:00
Martin Kroeker
32c1c1e125 Update azure-pipelines.yml 2020-06-04 19:03:46 +02:00
Martin Kroeker
f1953b8b81 Update azure-pipelines.yml 2020-06-04 17:58:13 +02:00
Martin Kroeker
6e97df7b47 Add CMAKE support for MAX_STACK_ALLOC setting 2020-06-04 14:45:31 +02:00
Martin Kroeker
729303e5ed Merge pull request #2643 from craft-zhang/cortex-a53
Improve performance of SGEMM on Arm Cortex-A53
2020-06-04 07:58:45 +02:00
Martin Kroeker
547965530f Merge pull request #2638 from leezu/actions
Add GitHub Actions test for DYNAMIC_ARCH builds on Linux and macOS
2020-06-04 00:02:37 +02:00
ZhangDanfeng
9b7877ccf1 sgemm copy source init
Signed-off-by: ZhangDanfeng <467688405@qq.com>
2020-06-04 02:10:45 +08:00
ZhangDanfeng
f82fa802d1 Insert prefetch
Signed-off-by: ZhangDanfeng <467688405@qq.com>
2020-06-04 02:08:48 +08:00
Martin Kroeker
3eda3d34c3 Merge pull request #2641 from martin-frbg/ppcg4
Work around PPC G4 test failures
2020-06-03 16:43:46 +02:00
Martin Kroeker
a8f42ae85c set cmake build type to Release 2020-06-03 15:28:59 +02:00