Martin Kroeker
af501eb753
Merge pull request #2669 from mhillenibm/zarch_fix_gcc_detection
...
Zarch fix gcc detection
2020-06-17 17:55:25 +02:00
Martin Kroeker
0eb6c4dded
Merge pull request #2672 from mhillenibm/test_num_threads
...
cpp_thread_test: Change adjustment of concurrency on systems with <52 hw threads
2020-06-17 17:54:31 +02:00
Marius Hillenbrand
de838c38ef
cpp_thread_test/dgemv: fail early if concurrency is zero
...
The two test cases dgemv_tester and dgemm_tester accept the degree of
concurrency as a command-line argument (among others). Fail early if the
value 0 is specified, instead of failing later with less clear symptoms.
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-06-17 16:15:44 +02:00
Marius Hillenbrand
478898b37a
cpp_thread_test/dgemv: cap concurrency to number of hw threads on small systems
...
... instead of (number of hw threads - 4) to avoid invalid numbers on
smaller systems. Currently, systems with 4 or fewer CPUs (e.g., small CI
VMs) would fail the test. Fixes one of the issues discussed in #2668
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-06-17 16:08:48 +02:00
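
The two cpp_thread_test changes above (reject a concurrency of zero, and cap the requested concurrency at the number of hardware threads) can be illustrated with a minimal sketch. This is not the code from the test suite, which is written in C++. The helper name `effective_concurrency` and its parameters are hypothetical and only show the idea.

```python
import os


def effective_concurrency(requested, hw_threads=None):
    """Validate and cap a requested degree of concurrency.

    Hypothetical helper for illustration; not part of OpenBLAS.
    """
    if hw_threads is None:
        # os.cpu_count() may return None, so fall back to 1.
        hw_threads = os.cpu_count() or 1
    if requested <= 0:
        # Fail early with a clear message instead of misbehaving later.
        raise ValueError("concurrency must be at least 1")
    # Cap at the hardware thread count rather than (hw_threads - 4),
    # which would be zero or negative on machines with 4 or fewer CPUs.
    return min(requested, hw_threads)
```

On a 4-CPU CI VM, a request for 52 threads would be reduced to 4 under this sketch, whereas subtracting 4 from the hardware thread count would leave nothing to run with.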
Marius Hillenbrand
cde4690721
RFC: Use gcc -dumpfullversion to get minor version with gcc-7.x
...
In gcc-7.1, the behavior of -dumpversion changed to be configured
at compile-time. On some distributions it only dumps the major version
(e.g., Ubuntu), so the current checks for the gcc minor version report
false negatives. As a replacement, gcc-7.1 introduced -dumpfullversion
which always prints the full version.
Update the gcc version detection in Makefile.system to employ
-dumpfullversion with gcc-7 and newer.
Posting this patch for discussion, since it emerged from discussions
around issue #2668 and PR #2669. It is not solving a problem right now,
but may be useful in the future.
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-06-16 15:45:59 +02:00
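
The detection strategy described in the commit above can be sketched as follows. The actual change lives in Makefile.system and uses make and shell constructs, so this Python version is only an illustration under stated assumptions. The function names `parse_version`, `run_compiler` and `gcc_version` are hypothetical. The only gcc options used are `-dumpversion` and `-dumpfullversion`, both named in the commit message.

```python
import subprocess


def parse_version(text):
    """Split a version string such as "7" or "7.5.0" into (major, minor).

    minor is None when the compiler printed only the major version.
    """
    fields = text.strip().split(".")
    major = int(fields[0])
    minor = int(fields[1]) if len(fields) > 1 else None
    return major, minor


def run_compiler(cc, flag):
    """Run the compiler with a single flag and return its standard output."""
    return subprocess.check_output([cc, flag], text=True)


def gcc_version(cc="gcc", run=run_compiler):
    """Return (major, minor), asking for the full version on gcc 7 and newer."""
    # -dumpversion always yields at least the major version.
    major, minor = parse_version(run(cc, "-dumpversion"))
    if major >= 7:
        # From gcc-7.1 on, -dumpversion may print only the major version,
        # so query -dumpfullversion to recover the minor version.
        major, minor = parse_version(run(cc, "-dumpfullversion"))
    return major, minor
```

The `run` parameter exists so that the logic can be exercised without a compiler installed, for example by passing a function that returns canned output for each flag.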
Marius Hillenbrand
2389291766
Makefile.system: remove duplicate variable GCCVERSIONGT5
...
... to bring unified gcc version detection with common variables to the
one remaining spot in Makefile.system.
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-06-16 15:06:03 +02:00
Marius Hillenbrand
a2d13ea611
Fix gcc version detection for zarch
...
Employ common variables for gcc version detection and fix the broken
check for gcc >= 5.2.
Fixes #2668
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-06-16 15:06:03 +02:00
Martin Kroeker
1bd3cd66c2
Increment version to 0.3.10.dev
2020-06-14 22:05:19 +02:00
Martin Kroeker
1c53e1366d
Increment version to 0.3.10.dev
2020-06-14 22:04:37 +02:00
Martin Kroeker
63b03efc2a
Merge pull request #2667 from xianyi/develop
...
Merge develop into 0.3.0 for 0.3.10 release
2020-06-14 22:03:04 +02:00
Martin Kroeker
95dbeff66d
Merge branch 'release-0.3.0' into develop
2020-06-14 22:02:45 +02:00
Martin Kroeker
3b673a24b7
Increment version to 0.3.10.dev
2020-06-14 21:57:52 +02:00
Martin Kroeker
1eb1979050
Increment version to 0.3.10.dev
2020-06-14 21:57:15 +02:00
Martin Kroeker
efc53b6e7e
Merge pull request #2665 from martin-frbg/flang-fixes-2a
...
Fix spelling of flang option -Mrecursive, add -Kieee and workaround for AOCC optimizer bug
2020-06-14 21:56:08 +02:00
Martin Kroeker
72888497e2
Update with 0.3.10 changes
2020-06-14 21:55:31 +02:00
Martin Kroeker
7e3e006af6
Merge pull request #2666 from martin-frbg/blastest
...
Update BLAS tests to what netlib 3.9.0 uses
2020-06-14 18:28:37 +02:00
Martin Kroeker
d906d14402
Merge pull request #2664 from ACSimon33/exported_symbols
...
Add missing exported symbols.
2020-06-14 18:27:03 +02:00
Martin Kroeker
3785c0e82b
Merge pull request #2663 from martin-frbg/issue2654
...
Respect predefined defaults for AR, AS, LD and RANLIB
2020-06-14 18:26:43 +02:00
Martin Kroeker
f2d8879af6
Merge pull request #2661 from martin-frbg/issue2660
...
Report selected DYNAMIC_ARCH kernel rather than one of its aliases in gotoblas_corename
2020-06-14 18:25:37 +02:00
Martin Kroeker
6876221cf3
Remove optimization level limit for flang again and add -fno-unroll-loops for AOCC flang 2.x instead
2020-06-14 17:40:24 +02:00
Martin Kroeker
79cdcde717
Re-enable higher optimization levels for flang while disabling loop unrolling for AOCC flang
2020-06-14 17:18:16 +02:00
Martin Kroeker
18a11137f1
Update BLAS tests to correspond to Reference-LAPACK 3.9.0
...
Replaces the calculation of machine precision with a call to the epsilon intrinsic and removes the requirement to delete previous output files before rerunning tests
2020-06-14 10:26:25 +02:00
Martin Kroeker
1dd712131e
Fix spelling of flang option -Mrecursive and add -Kieee
2020-06-14 00:09:31 +02:00
Martin Kroeker
0ed2adf0b2
Fix spelling of flang option -Mrecursive and add -Kieee
2020-06-14 00:01:20 +02:00
Martin Kroeker
abf670757b
Respect predefined defaults for AR, AS, LD and RANLIB
2020-06-13 23:21:13 +02:00
Simon Märtens
41fc6f3cd2
Added missing exported symbols.
2020-06-13 22:37:39 +02:00
Martin Kroeker
007d9f97d7
Make gotoblas_corename report the name of the selected TARGET rather than its aliases
2020-06-13 19:25:28 +02:00
Martin Kroeker
63d26090f5
Merge pull request #64 from xianyi/develop
...
rebase
2020-06-13 19:14:47 +02:00
Rajalakshmi Srinivasaraghavan
9fe930f205
powerpc: Add support for future processor
...
This is the initial patch to support build infrastructure
for POWER10 architecture.
2020-06-11 15:47:20 -05:00
Martin Kroeker
3a1b58d54a
Merge pull request #2653 from craft-zhang/cortex-a53
...
fix INIT8x4 of SGEMM on Arm Cortex-A53
2020-06-10 12:19:33 +02:00
Martin Kroeker
f7659be4a0
Merge pull request #2652 from martin-frbg/flang-fixes
...
Fixes for compilation with flang binary release 20190329
2020-06-09 20:31:06 +02:00
ZhangDanfeng
bc6fd20a40
fix INIT8x4
...
Signed-off-by: ZhangDanfeng <467688405@qq.com>
2020-06-10 01:01:16 +08:00
Martin Kroeker
3ce469a34f
Limit optimization level to O1 for flang and add -frecursive
2020-06-09 16:11:13 +02:00
Martin Kroeker
ba2c5b404d
When building with flang, use it also for the final link step to get dependencies right
2020-06-09 16:09:34 +02:00
Martin Kroeker
f07a80354b
Apply previously AOCC-specific workaround to all versions of flang
2020-06-09 16:07:03 +02:00
Martin Kroeker
fdd1b50263
Merge pull request #63 from xianyi/develop
...
rebase
2020-06-09 15:54:30 +02:00
Leonard Lausen
b98923f33a
Test enforce -O1 for flang
2020-06-09 06:54:47 +00:00
Leonard Lausen
4cb1db0e3b
Test flang build
2020-06-09 06:31:17 +00:00
Martin Kroeker
430e8b45fe
Merge pull request #2648 from martin-frbg/lapack411
...
Break out of potentially infinite rescaling loop in LAPACK xLARGV/xLARTG/xLARTGP
2020-06-07 19:45:52 +02:00
Martin Kroeker
88fe85f4e0
Merge pull request #2647 from martin-frbg/aocc-flang
...
Small fixes for flang in general and the AMD AOCC version of it in particular
2020-06-07 19:45:11 +02:00
Martin Kroeker
89091e6b64
Merge pull request #2645 from martin-frbg/misc_fixes
...
Miscellaneous fixes
2020-06-07 19:44:50 +02:00
Martin Kroeker
522aaf53bf
Break out of potentially infinite rescaling loop in LAPACK xLARGV/xLARTG/xLARTGP
...
Reference-LAPACK issue 411
2020-06-07 14:30:20 +02:00
Martin Kroeker
c3574ffe53
Merge pull request #2646 from wjc404/develop
...
Optimize AVX512 parallel DGEMM performance
2020-06-07 13:18:22 +02:00
Martin Kroeker
4e28dc6353
Use only -O1 with AMD AOCC version of flang
...
to prevent miscompilation of LAPACK codes and tests on Ryzen
2020-06-07 00:05:02 +02:00
Martin Kroeker
13c28889a2
Update "cosmetic fixes for non-C99 compilers"
2020-06-06 15:22:27 +02:00
wjc404
0e3ac4a06b
Add files via upload
2020-06-06 14:56:57 +08:00
Martin Kroeker
28915eed72
Cosmetic fixes for non-C99 compilers
2020-06-05 10:05:34 +02:00
Martin Kroeker
7f60fb6b91
Delete spurious copy of common_param.h
2020-06-05 10:04:16 +02:00
Martin Kroeker
0464e662ad
make blas_quickdivide unsigned and guard against miscompilation
2020-06-05 10:03:36 +02:00
Martin Kroeker
0f9a935a5a
Merge pull request #62 from xianyi/develop
...
rebase
2020-06-05 09:51:06 +02:00