Commit Graph

7452 Commits

Author SHA1 Message Date
Martin Kroeker 921ec4e9e2
Adjust A53 SGEMM parameters to reflect move to 8x8 kernel 2020-07-27 19:54:46 +00:00
Rajalakshmi Srinivasaraghavan d557584b71 Fix compilation issues with clang on POWER
Remove -malign-power, as gcc already defaults to it. Also add
-fno-integrated-as so that the GNU assembler is used for the powerpc
assembly optimization files, and fix other compilation errors
reported in dgemv_t.c.
2020-07-27 14:11:07 -05:00
Martin Kroeker a4ceb1ade9
Merge pull request #2737 from ashwinyes/add_thunderx3_target
ARM64: Add THUNDERX3T110 Target
2020-07-27 15:19:47 +02:00
Ashwin Sekhar T K 4e1be0e481 ARM64: Add THUNDERX3T110 Target 2020-07-26 23:32:24 -07:00
Martin Kroeker 49b83e00b7
Merge pull request #2735 from martin-frbg/move_potrf
Move potrf_parallel.c from lapack/getrf to lapack/potrf where it belongs
2020-07-26 19:54:11 +02:00
Martin Kroeker 769ed9ffad
Merge pull request #2734 from RajalakshmiSR/p10_fix
Fix to store results in correct order for POWER10 GEMM kernels
2020-07-25 09:02:32 +02:00
Martin Kroeker f194ad59e1
Use _Atomic instead of volatile where available (file moved from ../getrf)
Must have misplaced this in ../getrf when I made that change in March 2018 (40160ff).
The only changes since then were:
"RFC : Add half precision gemm for bfloat16 in OpenBLAS" (Rajalakshmi Srinivasaraghavan, committed on 14 Apr 2020 as 7ebbb50)
"Change _STDC_VERSION__ to __STDC_VERSION__" (Zhiyong Dang, committed on 11 May 2018 as 3716267)
2020-07-25 08:52:24 +02:00
Martin Kroeker 4fda217f99
Delete potrf_parallel.c (moving it to ../potrf) 2020-07-25 06:42:39 +00:00
Rajalakshmi Srinivasaraghavan 9be2688c78 Fix to store results in correct order for POWER10 GEMM kernels
A recent compiler change in __builtin_mma_disassemble_acc() affects the
order in which results are stored on POWER10. Also remove the new LDFLAG
-mno-power10-stub, as it is handled by the linker automatically.
2020-07-24 23:08:11 -05:00
Martin Kroeker 6a2a60038c
Merge pull request #2720 from martin-frbg/issue2694
WIP Further fixes for 32bit POWER8
2020-07-24 23:19:45 +02:00
Martin Kroeker 251a09ec90
Typo fix 2020-07-24 16:04:58 +00:00
Martin Kroeker 95d37e1575
Regroup the 32 and 64bit sections and restore 64bit CAXPY 2020-07-24 10:13:46 +00:00
Martin Kroeker 3523bb778e
Merge pull request #2721 from martin-frbg/p8align
Fix alignment errors in the power8 saxpy kernel
2020-07-24 11:06:20 +02:00
Martin Kroeker a50d0e29c8
Merge pull request #2731 from martin-frbg/pgippc
Fixes for compilation on POWER with PGI compilers
2020-07-24 11:05:16 +02:00
Martin Kroeker bf1f0734ff
Use OPENBLAS_MAKE_COMPLEX_FLOAT on PPC only 2020-07-23 20:40:13 +00:00
Martin Kroeker ca3561cab9
Add ifdefs around call to altivec microkernel 2020-07-23 18:30:42 +00:00
Martin Kroeker 21072e502a
Typo fix 2020-07-23 17:34:56 +00:00
Martin Kroeker 7c6e56b5df
Rewrite assignment to complex for better portability 2020-07-23 17:10:59 +02:00
Martin Kroeker 661c6bfa5a
Exclude altivec code paths if the compiler does not support them 2020-07-23 17:08:20 +02:00
Martin Kroeker 9796e552ea
Avoid undefining NAME, CNAME etc. for pgcc as it makes it ignore the new definitions 2020-07-23 17:03:28 +02:00
Martin Kroeker d6b6e5ccd7
Merge pull request #73 from xianyi/develop
rebase
2020-07-23 16:59:06 +02:00
Martin Kroeker 349b722d8d
Merge pull request #2729 from martin-frbg/issue2728
Unify BUFFER_SIZE settings for x86_64 again to fix DYNAMIC_ARCH crashes
2020-07-22 22:45:57 +02:00
Martin Kroeker 6c33764ca4
Unify BUFFER_SIZE settings for x86_64 again to fix potentially fatal mismatch in DYNAMIC_ARCH builds 2020-07-22 17:30:55 +00:00
Martin Kroeker d1b9613fd4
Merge pull request #2727 from wyphan/develop
Patch for building on POWERPC with PGI compilers (was Patch for building on Summit)
2020-07-21 17:06:53 +02:00
Martin Kroeker 3cfc74b1a0
Merge pull request #2726 from martin-frbg/2725-2
Add detection of stdatomic.h for cmake
2020-07-21 16:42:06 +02:00
Wileam Phan 9ae154ba89 Patch for building on Summit 2020-07-20 23:30:28 -04:00
Martin Kroeker 9e21a100e3
Add trivial check for stdatomic.h 2020-07-20 22:52:09 +00:00
Martin Kroeker 31d30312dc
Merge pull request #72 from xianyi/develop
rebase
2020-07-21 00:49:12 +02:00
Martin Kroeker fcfb7ffafb
Merge pull request #2725 from martin-frbg/ccheck_c11
Have c_check probe availability of C11 atomics support and stdatomic.h
2020-07-18 23:08:08 +02:00
Martin Kroeker bbe119ee3b
Update conditional for atomics to use HAVE_C11 2020-07-18 17:19:59 +00:00
Martin Kroeker f4f74941bd
Update conditional for atomics to use HAVE_C11 2020-07-18 17:14:50 +00:00
Martin Kroeker a36eb19ae0
Update conditional for C11 atomics to use HAVE_C11 2020-07-18 17:13:24 +00:00
Martin Kroeker ce45af8151
Update conditional for atomics to use HAVE_C11 2020-07-18 17:09:56 +00:00
Martin Kroeker 6f38de06d2
Update conditional for atomics to use HAVE_C11 2020-07-18 17:09:01 +00:00
Martin Kroeker 09eb9d2584
Update conditional for atomics to HAVE_C11 2020-07-18 17:07:38 +00:00
Martin Kroeker 791e046744
Update conditional for atomics to use HAVE_C11 2020-07-18 17:05:59 +00:00
Martin Kroeker 94bab9d1f9
Update conditional for atomics to use HAVE_C11 2020-07-18 17:03:31 +00:00
Martin Kroeker 97d6eb97b1
Report availability of C11 support 2020-07-18 16:59:33 +00:00
Martin Kroeker 4afd11dae5
Add a check for C11 atomics and stdatomic.h 2020-07-18 16:57:41 +00:00
Martin Kroeker 72ec6280c7
Merge pull request #2724 from martin-frbg/loongsonreadme
Update cross-compiling example in README to reflect change in Loongson gcc
2020-07-18 18:08:40 +02:00
Martin Kroeker 26b7f24d16
Update cross-compiling example to reflect change in Loongson gcc
for #2723
2020-07-18 12:51:37 +00:00
Martin Kroeker 0db4218fed
Merge pull request #2722 from martin-frbg/cmakefcheck
Handle lack of fortran compiler more gracefully in cmake
2020-07-17 10:33:03 +02:00
Martin Kroeker 9d000ecaa2
include CheckLanguage module 2020-07-16 22:36:35 +00:00
Martin Kroeker a847d00366
handle lack of Fortran compiler more gracefully 2020-07-16 22:17:39 +00:00
Martin Kroeker 0033f8be0d
Use vec_vsx_ld/st to fix misaligned accesses flagged by asan 2020-07-16 23:32:54 +02:00
Martin Kroeker f308e741b2
remove debug output and revert changes to cdot and crot 2020-07-15 10:00:07 +02:00
Martin Kroeker 4f5d26bb02
Merge pull request #2716 from RajalakshmiSR/p10_ldflag
Add new linker option for POWER10
2020-07-15 01:20:54 +02:00
Rajalakshmi Srinivasaraghavan 417c4e8af8 Add new linker option for POWER10
When building with DYNAMIC_ARCH on POWER9 with a POWER10-aware
toolchain, a new LDFLAG is needed to avoid POWER10
instructions on PLT calls.
2020-07-14 11:54:04 -05:00
Martin Kroeker da17abec87
fix trailing whitespace 2020-07-14 18:20:03 +02:00
Martin Kroeker f8c2697701
Use POWER6 GEMM, TRMM and DTRSM on 32bit POWER8 2020-07-14 18:11:19 +02:00