Martin Kroeker
383262035d
Merge pull request #2740 from RajalakshmiSR/clang-power
...
Fix compilation issues with clang on POWER
2020-07-28 18:15:25 +02:00
Martin Kroeker
5fa581c87e
Put hint to use git develop rather than master branch in README
2020-07-28 14:22:41 +00:00
Martin Kroeker
12918358aa
Add AMD Renoir/Matisse and preliminary support for Zen3 as Zen2
...
also support AMD family 22 Jaguar/Puma as Bobcat
2020-07-28 13:53:17 +00:00
Martin Kroeker
200f5c44cc
Add AMD Renoir models and preliminary support for ZEN3 as ZEN2
...
also remap erroneous family 16 entry to BOBCAT and reclaim erroneous family 25 "Barcelona" for Zen3
2020-07-28 13:45:23 +00:00
Martin Kroeker
64e2e4aaf3
missing braces
2020-07-27 20:19:22 +00:00
Martin Kroeker
921ec4e9e2
Adjust A53 SGEMM parameters to reflect move to 8x8 kernel
2020-07-27 19:54:46 +00:00
Rajalakshmi Srinivasaraghavan
d557584b71
Fix compilation issues with clang on POWER
...
As gcc defaults to -malign-power, removing that option. Also
adding -fno-integrated-as to use GNU assembler for powerpc
assembly optimization files. Fixed other compilation errors
reported in dgemv_t.c file.
2020-07-27 14:11:07 -05:00
Martin Kroeker
a4ceb1ade9
Merge pull request #2737 from ashwinyes/add_thunderx3_target
...
ARM64: Add THUNDERX3T110 Target
2020-07-27 15:19:47 +02:00
Ashwin Sekhar T K
4e1be0e481
ARM64: Add THUNDERX3T110 Target
2020-07-26 23:32:24 -07:00
Martin Kroeker
49b83e00b7
Merge pull request #2735 from martin-frbg/move_potrf
...
Move potrf_parallel.c from lapack/getrf to lapack/potrf where it belongs
2020-07-26 19:54:11 +02:00
Martin Kroeker
769ed9ffad
Merge pull request #2734 from RajalakshmiSR/p10_fix
...
Fix to store results in correct order for POWER10 GEMM kernels
2020-07-25 09:02:32 +02:00
Martin Kroeker
f194ad59e1
Use _Atomic instead of volatile where available (file moved from ../getrf)
...
must have misplaced this in ../getrf when I made that change in March 2018 (40160ff)
the only changes since then were:
RFC : Add half precision gemm for bfloat16 in OpenBLAS (Rajalakshmi Srinivasaraghavan, committed on 14 Apr 2020 as 7ebbb50)
Change _STDC_VERSION__ to __STDC_VERSION__ (Zhiyong Dang, committed on 11 May 2018 as 3716267)
2020-07-25 08:52:24 +02:00
Martin Kroeker
4fda217f99
Delete potrf_parallel.c (moving it to ../potrf)
2020-07-25 06:42:39 +00:00
Rajalakshmi Srinivasaraghavan
9be2688c78
Fix to store results in correct order for POWER10 GEMM kernels
...
There is a recent compiler change in __builtin_mma_disassemble_acc() which
affects the order of storing result in POWER10. Also removing new LDFLAG
-mno-power10-stub as it is handled by linker automatically.
2020-07-24 23:08:11 -05:00
Martin Kroeker
6a2a60038c
Merge pull request #2720 from martin-frbg/issue2694
...
WIP Further fixes for 32bit POWER8
2020-07-24 23:19:45 +02:00
Martin Kroeker
251a09ec90
Typo fix
2020-07-24 16:04:58 +00:00
Martin Kroeker
95d37e1575
Regroup the 32 and 64bit sections and restore 64bit CAXPY
2020-07-24 10:13:46 +00:00
Martin Kroeker
3523bb778e
Merge pull request #2721 from martin-frbg/p8align
...
Fix alignment errors in the power8 saxpy kernel
2020-07-24 11:06:20 +02:00
Martin Kroeker
a50d0e29c8
Merge pull request #2731 from martin-frbg/pgippc
...
Fixes for compilation on POWER with PGI compilers
2020-07-24 11:05:16 +02:00
Martin Kroeker
bf1f0734ff
Use OPENBLAS_MAKE_COMPLEX_FLOAT on PPC only
2020-07-23 20:40:13 +00:00
Martin Kroeker
ca3561cab9
Add ifdefs around call to altivec microkernel
2020-07-23 18:30:42 +00:00
Martin Kroeker
21072e502a
Typo fix
2020-07-23 17:34:56 +00:00
Martin Kroeker
7c6e56b5df
Rewrite assignment to complex for better portability
2020-07-23 17:10:59 +02:00
Martin Kroeker
661c6bfa5a
Exclude altivec code paths if the compiler does not support them
2020-07-23 17:08:20 +02:00
Martin Kroeker
9796e552ea
Avoid undefining NAME, CNAME etc. for pgcc as it makes it ignore the new definitions
2020-07-23 17:03:28 +02:00
Martin Kroeker
d6b6e5ccd7
Merge pull request #73 from xianyi/develop
...
rebase
2020-07-23 16:59:06 +02:00
Martin Kroeker
349b722d8d
Merge pull request #2729 from martin-frbg/issue2728
...
Unify BUFFER_SIZE settings for x86_64 again to fix DYNAMIC_ARCH crashes
2020-07-22 22:45:57 +02:00
Martin Kroeker
6c33764ca4
Unify BUFFER_SIZE settings for x86_64 again to fix potentially fatal mismatch in DYNAMIC_ARCH builds
2020-07-22 17:30:55 +00:00
Martin Kroeker
d1b9613fd4
Merge pull request #2727 from wyphan/develop
...
Patch for building on POWERPC with PGI compilers (was Patch for building on Summit)
2020-07-21 17:06:53 +02:00
Martin Kroeker
3cfc74b1a0
Merge pull request #2726 from martin-frbg/2725-2
...
Add detection of stdatomic.h for cmake
2020-07-21 16:42:06 +02:00
Wileam Phan
9ae154ba89
Patch for building on Summit
2020-07-20 23:30:28 -04:00
Martin Kroeker
9e21a100e3
Add trivial check for stdatomic.h
2020-07-20 22:52:09 +00:00
Martin Kroeker
31d30312dc
Merge pull request #72 from xianyi/develop
...
rebase
2020-07-21 00:49:12 +02:00
Martin Kroeker
fcfb7ffafb
Merge pull request #2725 from martin-frbg/ccheck_c11
...
Have c_check probe availability of C11 atomics support and stdatomic.h
2020-07-18 23:08:08 +02:00
Martin Kroeker
bbe119ee3b
Update conditional for atomics to use HAVE_C11
2020-07-18 17:19:59 +00:00
Martin Kroeker
f4f74941bd
Update conditional for atomics to use HAVE_C11
2020-07-18 17:14:50 +00:00
Martin Kroeker
a36eb19ae0
Update conditional for C11 atomics to use HAVE_C11
2020-07-18 17:13:24 +00:00
Martin Kroeker
ce45af8151
Update conditional for atomics to use HAVE_C11
2020-07-18 17:09:56 +00:00
Martin Kroeker
6f38de06d2
Update conditional for atomics to use HAVE_C11
2020-07-18 17:09:01 +00:00
Martin Kroeker
09eb9d2584
Update conditional for atomics to use HAVE_C11
2020-07-18 17:07:38 +00:00
Martin Kroeker
791e046744
Update conditional for atomics to use HAVE_C11
2020-07-18 17:05:59 +00:00
Martin Kroeker
94bab9d1f9
Update conditional for atomics to use HAVE_C11
2020-07-18 17:03:31 +00:00
Martin Kroeker
97d6eb97b1
Report availability of C11 support
2020-07-18 16:59:33 +00:00
Martin Kroeker
4afd11dae5
Add a check for C11 atomics and stdatomic.h
2020-07-18 16:57:41 +00:00
Martin Kroeker
72ec6280c7
Merge pull request #2724 from martin-frbg/loongsonreadme
...
Update cross-compiling example in README to reflect change in Loongson gcc
2020-07-18 18:08:40 +02:00
Martin Kroeker
26b7f24d16
Update cross-compiling example to reflect change in Loongson gcc
...
for #2723
2020-07-18 12:51:37 +00:00
Martin Kroeker
0db4218fed
Merge pull request #2722 from martin-frbg/cmakefcheck
...
Handle lack of fortran compiler more gracefully in cmake
2020-07-17 10:33:03 +02:00
Martin Kroeker
9d000ecaa2
include CheckLanguage module
2020-07-16 22:36:35 +00:00
Martin Kroeker
a847d00366
handle lack of fortran compiler more gracefully
2020-07-16 22:17:39 +00:00
Martin Kroeker
0033f8be0d
Use vec_vsx_ld/st to fix misaligned accesses flagged by asan
2020-07-16 23:32:54 +02:00