Commit Graph

5151 Commits

Author SHA1 Message Date
Martin Kroeker
eddc65c7b7 Add POWER10 support flag (unconditionally for now) 2020-10-20 01:09:49 +02:00
Martin Kroeker
bb8c3f6861 Add ld/binutils version check for POWER10 support 2020-10-20 01:04:20 +02:00
Martin Kroeker
ff65952e46 Move HAVE_P10_SUPPORT to the build system
to be able to include a binutils version check
2020-10-20 00:55:41 +02:00
Martin Kroeker
6208c9899e Merge pull request #104 from xianyi/develop
rebase
2020-10-20 00:52:08 +02:00
Martin Kroeker
8e20ab21c8 Merge pull request #2924 from martin-frbg/issue2920
Put back all symbols accidentally dropped in the reorganization of gensymbol
2020-10-19 23:33:45 +02:00
Martin Kroeker
dc6e44c3f8 Merge pull request #2916 from martin-frbg/issue2911
Clean up duplicate definitions in POWER8 kernels and fix power10 option passing
2020-10-19 23:33:31 +02:00
Martin Kroeker
4ad33c46b0 Add back symbols that got dropped when splitting by type 2020-10-19 20:37:52 +02:00
Martin Kroeker
fe2a922ada Add POWER10 compiler options to CCOMMON_OPT rather than COMMON_OPT 2020-10-19 17:43:53 +02:00
Martin Kroeker
9cac379655 Merge pull request #103 from xianyi/develop
rebase
2020-10-19 15:56:20 +02:00
Martin Kroeker
a61c086408 Fix spurious trailing whitespace in comment 2020-10-19 09:12:12 +02:00
Martin Kroeker
5b9ebe4f8a Merge pull request #2919 from isuruf/export
Fix exporting some lapack and cblas symbols
2020-10-19 08:14:27 +02:00
Martin Kroeker
7eddaf0d6f Remove -mmma again (redundant with cpu=power10) and add override statements 2020-10-19 08:11:22 +02:00
Isuru Fernando
14b1d33933 Fix exporting some lapack and cblas 2020-10-18 22:45:58 -05:00
Martin Kroeker
77669b019d Merge pull request #2915 from bartoldeman/no-empty_sgemm_direct_skylakex
sgemm_direct_skylakex: fix 75eeb26 regression.
2020-10-19 00:09:54 +02:00
Martin Kroeker
5e8ddc9001 Merge pull request #2913 from martin-frbg/issue2910
Support cross-compiling for Apple Vortex
2020-10-18 23:04:56 +02:00
Bart Oldeman
03e781b766 sgemm_direct_skylakex: fix 75eeb26 regression.
The
`#if defined(SKYLAKEX) || defined (COOPERLAKE)`
from that commit was before #include "common.h" so caused the
compiled function to be empty, returning garbage results for
qualifying sgemm's on those architectures.

Closes #2914
2020-10-18 19:58:07 +00:00
Martin Kroeker
f1a4071d8c Clean up STACKSIZE redefinition 2020-10-18 19:41:43 +02:00
Martin Kroeker
97cf10062f Clean up STACKSIZE redefinition 2020-10-18 19:39:18 +02:00
Martin Kroeker
17e288e18d Clean up STACKSIZE redefinition 2020-10-18 19:37:04 +02:00
Martin Kroeker
c1422f3e46 Clean up STACKSIZE redefinition 2020-10-18 19:31:01 +02:00
Martin Kroeker
d85b24e103 Clean up STACKSIZE redefinition 2020-10-18 19:29:45 +02:00
Martin Kroeker
7d6c85f9da Add compiler option -mmma for POWER10 2020-10-18 19:27:51 +02:00
Martin Kroeker
2e7ee7c716 Fix naming of L2 cache size item reported for Vortex 2020-10-18 19:22:05 +02:00
Martin Kroeker
efd47b0104 Merge pull request #2909 from isuruf/patch-1
Need a space when redirecting to file
2020-10-18 19:16:08 +02:00
Martin Kroeker
f5902ab0a1 Support cross-compiling for Apple Vortex 2020-10-18 19:10:58 +02:00
Martin Kroeker
1a0c185122 Support cross-compiling for Apple Vortex 2020-10-18 18:54:54 +02:00
Martin Kroeker
89eea6b455 Merge pull request #102 from xianyi/develop
rebase
2020-10-18 18:49:59 +02:00
Isuru Fernando
a5c667b55c Need a space when redirecting to file
Following two commands have two completely different meanings
perl ./gensymbol objcopy x86_64 _ 0 0  0 0 0 0 "" "64_" 1 0 1 1 1 1 > objcopy.def
perl ./gensymbol objcopy x86_64 _ 0 0  0 0 0 0 "" "64_" 1 0 1 1 1 1> objcopy.def
2020-10-18 09:40:31 -05:00
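The difference between the two commands in the commit above comes from how a POSIX shell tokenizes redirections: a digit written directly before `>` is read as a file descriptor number, so `1>` means "redirect stdout" and the trailing `1` is no longer passed as an argument. The following is a minimal sketch of that behaviour, not part of the repository; `count_args` is a hypothetical helper standing in for `perl ./gensymbol`, and the file names are made up for illustration.

```shell
#!/bin/sh
# count_args is a hypothetical stand-in for the real script:
# it prints how many arguments it received.
count_args() {
    echo "$#"
}

tmp_dir=$(mktemp -d)

# With a space, the trailing 1 is an ordinary argument
# and stdout is redirected to the file.
count_args a b 1 > "$tmp_dir/with_space.txt"

# Without a space, "1>" is parsed as "redirect file descriptor 1",
# so the trailing 1 is consumed by the redirection and never reaches the command.
count_args a b 1> "$tmp_dir/without_space.txt"

with_space=$(cat "$tmp_dir/with_space.txt")
without_space=$(cat "$tmp_dir/without_space.txt")
rm -rf "$tmp_dir"

echo "with space: $with_space arguments"
echo "without space: $without_space arguments"
```

Run under a POSIX shell, this should report 3 arguments for the first form and 2 for the second, which is why the missing space silently changes the final option passed to the script.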
Martin Kroeker
0ac6102708 Update version string to 0.3.11.dev 2020-10-17 22:40:47 +02:00
Martin Kroeker
26a701f4ad Update version string to 0.3.11.dev 2020-10-17 22:40:06 +02:00
Martin Kroeker
fcd0fa1a3a Merge pull request #2908 from xianyi/release-0.3.0
Synchronise tag with release 0.3.11
2020-10-17 22:38:58 +02:00
Martin Kroeker
51c22612eb Merge pull request #2907 from xianyi/develop
Update from develop for 0.3.11
v0.3.11
2020-10-17 22:14:12 +02:00
Martin Kroeker
b8f689200e Update version number to 0.3.11 2020-10-17 22:11:34 +02:00
Martin Kroeker
fe9015b619 Update version for 0.3.11 release 2020-10-17 22:10:50 +02:00
Martin Kroeker
f99b8c1502 Merge pull request #2906 from martin-frbg/changelog-0311
Update Changelog.txt with the 0.3.11 changes
2020-10-17 22:07:14 +02:00
Martin Kroeker
5381a18056 Update Changelog.txt with the 0.3.11 changes 2020-10-17 22:05:36 +02:00
Martin Kroeker
e35576c6fc Merge pull request #2905 from martin-frbg/aocc-clang
Add -mavx for clang & aocc
2020-10-17 09:45:22 +02:00
Martin Kroeker
f1bb85d378 Add AVX flags for clang/aocc as well 2020-10-16 20:52:15 +02:00
Martin Kroeker
25907e672b Merge pull request #101 from xianyi/develop
rebase
2020-10-16 20:48:58 +02:00
Martin Kroeker
9789375389 Merge pull request #2900 from martin-frbg/fixcmake_sse
Add compiler options for SSE to the cmake support files
2020-10-16 16:17:36 +02:00
Martin Kroeker
f64243ff57 Add compiler options for sse/sse2/ssse3/sse4.1 2020-10-16 10:47:06 +02:00
Martin Kroeker
786c0a3ce8 Add sse options for use of intrinsics with older compilers 2020-10-16 10:41:53 +02:00
Martin Kroeker
df70667043 fix core list for sse/sse2 2020-10-16 09:55:48 +02:00
Martin Kroeker
e6c5b13a18 Merge pull request #2898 from martin-frbg/morefixes
More pre-release fixes
2020-10-16 07:26:39 +02:00
Martin Kroeker
f071d1207a add sse2 2020-10-15 22:10:32 +02:00
Martin Kroeker
dc6cefd2f5 Expressly enable -msse for 32bit DYNAMIC_ARCH kernels 2020-10-15 20:16:15 +02:00
Martin Kroeker
c339c40c01 Silence a redefinition warning 2020-10-15 19:08:12 +02:00
Martin Kroeker
ac8af9cec6 Add -msse where supported, apparently required for older gcc 2020-10-15 19:06:45 +02:00
Martin Kroeker
10379fc83b Use ifdef instead of if 2020-10-15 19:05:37 +02:00
Martin Kroeker
a85ac71633 Merge pull request #100 from xianyi/develop
rebase
2020-10-15 18:54:20 +02:00