Commit Graph

4746 Commits

Author SHA1 Message Date
Martin Kroeker
525db5401c Merge pull request #74 from xianyi/develop
rebase
2020-07-30 01:04:09 +02:00
Martin Kroeker
cb097beba2 Merge pull request #2741 from martin-frbg/issue2739
Adjust A53 SGEMM parameters to reflect recent switch to 8x8 kernel
2020-07-29 10:01:14 +02:00
Martin Kroeker
7c02f4b1f7 Merge pull request #2744 from martin-frbg/issue2738
Add AMD Renoir/Matisse cpu autodetection and preliminary support for Zen3
2020-07-28 19:32:04 +02:00
Martin Kroeker
383262035d Merge pull request #2740 from RajalakshmiSR/clang-power
Fix compilation issues with clang on POWER
2020-07-28 18:15:25 +02:00
Martin Kroeker
5fa581c87e Put hint to use git develop rather than master branch in README 2020-07-28 14:22:41 +00:00
Martin Kroeker
12918358aa Add AMD Renoir/Matisse and preliminary support for Zen3 as Zen2
also support AMD family 22 Jaguar/Puma as Bobcat
2020-07-28 13:53:17 +00:00
Martin Kroeker
200f5c44cc Add AMD Renoir models and preliminary support for ZEN3 as ZEN2
also remap erroneous family 16 entry to BOBCAT and reclaim erroneous family 25 "Barcelona" for Zen3
2020-07-28 13:45:23 +00:00
Martin Kroeker
64e2e4aaf3 missing braces 2020-07-27 20:19:22 +00:00
Martin Kroeker
921ec4e9e2 Adjust A53 SGEMM parameters to reflect move to 8x8 kernel 2020-07-27 19:54:46 +00:00
Rajalakshmi Srinivasaraghavan
d557584b71 Fix compilation issues with clang on POWER
As gcc defaults to -malign-power, removing that option. Also
adding -fno-integrated-as to use GNU assembler for powerpc
assembly optimization files. Fixed other compilation errors
reported in dgemv_t.c file.
2020-07-27 14:11:07 -05:00
Martin Kroeker
a4ceb1ade9 Merge pull request #2737 from ashwinyes/add_thunderx3_target
ARM64: Add THUNDERX3T110 Target
2020-07-27 15:19:47 +02:00
Ashwin Sekhar T K
4e1be0e481 ARM64: Add THUNDERX3T110 Target 2020-07-26 23:32:24 -07:00
Martin Kroeker
49b83e00b7 Merge pull request #2735 from martin-frbg/move_potrf
Move potrf_parallel.c from lapack/getrf to lapack/potrf where it belongs
2020-07-26 19:54:11 +02:00
Martin Kroeker
769ed9ffad Merge pull request #2734 from RajalakshmiSR/p10_fix
Fix to store results in correct order for POWER10 GEMM kernels
2020-07-25 09:02:32 +02:00
Martin Kroeker
f194ad59e1 Use _Atomic instead of volatile where available (file moved from ../getrf)
must have misplaced this in ../getrf when I made that change in March 2018 (40160ff)
the only changes since then were:
RFC : Add half precision gemm for bfloat16 in OpenBLAS (Rajalakshmi Srinivasaraghavan, committed on 14 Apr 2020 as 7ebbb50)
Change _STDC_VERSION__ to __STDC_VERSION__ (Zhiyong Dang, committed on 11 May 2018 as 3716267)
2020-07-25 08:52:24 +02:00
Martin Kroeker
4fda217f99 Delete potrf_parallel.c (moving it to ../potrf) 2020-07-25 06:42:39 +00:00
Rajalakshmi Srinivasaraghavan
9be2688c78 Fix to store results in correct order for POWER10 GEMM kernels
There is a recent compiler change in __builtin_mma_disassemble_acc() which
affects the order of storing result in POWER10. Also removing new LDFLAG
-mno-power10-stub as it is handled by linker automatically.
2020-07-24 23:08:11 -05:00
Martin Kroeker
6a2a60038c Merge pull request #2720 from martin-frbg/issue2694
WIP Further fixes for 32bit POWER8
2020-07-24 23:19:45 +02:00
Martin Kroeker
251a09ec90 Typo fix 2020-07-24 16:04:58 +00:00
Martin Kroeker
95d37e1575 Regroup the 32 and 64bit sections and restore 64bit CAXPY 2020-07-24 10:13:46 +00:00
Martin Kroeker
3523bb778e Merge pull request #2721 from martin-frbg/p8align
Fix alignment errors in the power8 saxpy kernel
2020-07-24 11:06:20 +02:00
Martin Kroeker
a50d0e29c8 Merge pull request #2731 from martin-frbg/pgippc
Fixes for compilation on POWER with PGI compilers
2020-07-24 11:05:16 +02:00
Martin Kroeker
bf1f0734ff Use OPENBLAS_MAKE_COMPLEX_FLOAT on PPC only 2020-07-23 20:40:13 +00:00
Martin Kroeker
ca3561cab9 Add ifdefs around call to altivec microkernel 2020-07-23 18:30:42 +00:00
Martin Kroeker
21072e502a Typo fix 2020-07-23 17:34:56 +00:00
Martin Kroeker
7c6e56b5df Rewrite assignment to complex for better portability 2020-07-23 17:10:59 +02:00
Martin Kroeker
661c6bfa5a Exclude altivec code paths if the compiler does not support them 2020-07-23 17:08:20 +02:00
Martin Kroeker
9796e552ea Avoid undefining NAME, CNAME etc. for pgcc as it makes it ignore the new definitions 2020-07-23 17:03:28 +02:00
Martin Kroeker
d6b6e5ccd7 Merge pull request #73 from xianyi/develop
rebase
2020-07-23 16:59:06 +02:00
Martin Kroeker
349b722d8d Merge pull request #2729 from martin-frbg/issue2728
Unify BUFFER_SIZE settings for x86_64 again to fix DYNAMIC_ARCH crashes
2020-07-22 22:45:57 +02:00
Martin Kroeker
6c33764ca4 Unify BUFFER_SIZE settings for x86_64 again to fix potentially fatal mismatch in DYNAMIC_ARCH builds 2020-07-22 17:30:55 +00:00
Martin Kroeker
d1b9613fd4 Merge pull request #2727 from wyphan/develop
Patch for building on POWERPC with PGI compilers (was Patch for building on Summit)
2020-07-21 17:06:53 +02:00
Martin Kroeker
3cfc74b1a0 Merge pull request #2726 from martin-frbg/2725-2
Add detection of stdatomic.h for cmake
2020-07-21 16:42:06 +02:00
Wileam Phan
9ae154ba89 Patch for building on Summit 2020-07-20 23:30:28 -04:00
Martin Kroeker
9e21a100e3 Add trivial check for stdatomic.h 2020-07-20 22:52:09 +00:00
Martin Kroeker
31d30312dc Merge pull request #72 from xianyi/develop
rebase
2020-07-21 00:49:12 +02:00
Martin Kroeker
fcfb7ffafb Merge pull request #2725 from martin-frbg/ccheck_c11
Have c_check probe availability of C11 atomics support and stdatomic.h
2020-07-18 23:08:08 +02:00
Martin Kroeker
bbe119ee3b Update conditional for atomics to use HAVE_C11 2020-07-18 17:19:59 +00:00
Martin Kroeker
f4f74941bd Update conditional for atomics to use HAVE_C11 2020-07-18 17:14:50 +00:00
Martin Kroeker
a36eb19ae0 Update conditional for C11 atomics to use HAVE_C11 2020-07-18 17:13:24 +00:00
Martin Kroeker
ce45af8151 Update conditional for atomics to use HAVE_C11 2020-07-18 17:09:56 +00:00
Martin Kroeker
6f38de06d2 Update conditional for atomics to use HAVE_C11 2020-07-18 17:09:01 +00:00
Martin Kroeker
09eb9d2584 Update conditional for atomics to HAVE_C11 2020-07-18 17:07:38 +00:00
Martin Kroeker
791e046744 Update conditional for atomics to use HAVE_C11 2020-07-18 17:05:59 +00:00
Martin Kroeker
94bab9d1f9 Update conditional for atomics to use HAVE_C11 2020-07-18 17:03:31 +00:00
Martin Kroeker
97d6eb97b1 Report availability of C11 support 2020-07-18 16:59:33 +00:00
Martin Kroeker
4afd11dae5 Add a check for C11 atomics and stdatomic.h 2020-07-18 16:57:41 +00:00
Martin Kroeker
72ec6280c7 Merge pull request #2724 from martin-frbg/loongsonreadme
Update cross-compiling example in README to reflect change in Loongson gcc
2020-07-18 18:08:40 +02:00
Martin Kroeker
26b7f24d16 Update cross-compiling example to reflect change in Loongson gcc
for #2723
2020-07-18 12:51:37 +00:00
Martin Kroeker
0db4218fed Merge pull request #2722 from martin-frbg/cmakefcheck
Handle lack of fortran compiler more gracefully in cmake
2020-07-17 10:33:03 +02:00