Commit Graph

  • f5fcc5baec
    Add trivial gemm test for multithread consistency Martin Kroeker 2020-08-15 13:30:29 +0200
  • 597010a968
    Fix incorrect argument to SLASET Martin Kroeker 2020-08-14 00:41:56 +0200
  • d64f1ef26b
    Fix incorrect argument to SLASET Martin Kroeker 2020-08-14 00:40:24 +0200
  • c62aad62e5
    Fix incorrect calls to DLASET Martin Kroeker 2020-08-14 00:35:45 +0200
  • e740c4873d
    Enable COOPERLAKE build target Chen, Guobing 2020-08-13 06:17:34 +0800
  • efdd237a91
    Add a dedicated POWER9 build to the Travis CI (#2774) Martin Kroeker 2020-08-12 23:08:38 +0200
  • 8f1111f4c3
    Update .travis.yml Martin Kroeker 2020-08-12 22:35:29 +0200
  • b05289dd23
    Switch p9 to Ubuntu 18 container to ensure P9 hosting Martin Kroeker 2020-08-12 19:57:38 +0200
  • 7632a561df
    use autodetection for power9 in case there are still power8 boxes in the mix Martin Kroeker 2020-08-12 18:05:14 +0200
  • 9413398243
    Update .travis.yml Martin Kroeker 2020-08-12 16:54:06 +0200
  • ef2db95f57
    add the script back... Martin Kroeker 2020-08-12 13:57:39 +0200
  • 5137146d5d
    use plain apt commands rather than addon on ppc64le Martin Kroeker 2020-08-12 12:50:55 +0200
  • 072f68dbcb
    Update .travis.yml Martin Kroeker 2020-08-12 10:54:10 +0200
  • f7bd46483a
    Update .travis.yml Martin Kroeker 2020-08-11 21:13:48 +0200
  • 93e748d67a
    Change BFLOAT16 data type/API support naming Chen, Guobing 2020-08-11 09:27:29 +0800
  • 4573cb2f43
    Merge pull request #2765 from martin-frbg/issue2760 Martin Kroeker 2020-08-11 22:40:17 +0200
  • 2a4bb797db
    Merge pull request #2773 from martin-frbg/issue2770 Martin Kroeker 2020-08-11 21:02:55 +0200
  • 72f8d8f44d
    Update .travis.yml Martin Kroeker 2020-08-11 18:34:22 +0200
  • cbbe38bb88
    Merge pull request #2772 from mhillenibm/s390x_gemm_tuning Martin Kroeker 2020-08-11 18:14:09 +0200
  • 4f9fb930ec
    Update .travis.yml Martin Kroeker 2020-08-11 18:06:18 +0200
  • 22f746786b
    Update .travis.yml Martin Kroeker 2020-08-11 17:57:16 +0200
  • 780bd896b2
    Update .travis.yml Martin Kroeker 2020-08-11 17:49:59 +0200
  • 7dd3ccf798
    Bump gcc version for POWER9 build Martin Kroeker 2020-08-11 17:37:36 +0200
  • 8ccd6831d2
    Add dedicated POWER9 build Martin Kroeker 2020-08-11 16:12:49 +0200
  • 619343278d
    Fix mishandling of NO_CBLAS=0 and NO_LAPACKE=0 Martin Kroeker 2020-08-11 13:40:40 +0200
  • fee361ae64
    fix another source of NO_CBLAS=0 surprise Martin Kroeker 2020-08-11 13:27:19 +0200
  • 62f4c84f27
    Merge pull request #76 from xianyi/develop Martin Kroeker 2020-08-11 13:25:12 +0200
  • e115c97e05
    s390x/SGEMM: adjust default P and Q to multiples of M Marius Hillenbrand 2020-08-11 12:55:59 +0200
  • 07c334e7be
    s390x: Factor out small block sizes for SGEMM/DGEMM on z14 Marius Hillenbrand 2020-08-11 12:55:53 +0200
  • e2828e30aa
    s390x: Optimize SGEMM/DGEMM blocks for z14 with explicit loop unrolling/interleaving Marius Hillenbrand 2020-08-11 12:55:42 +0200
  • 7219c9cb87
    Merge pull request #2764 from martin-frbg/lapacktests Martin Kroeker 2020-08-10 13:27:51 +0200
  • c9d32674ea
    Add memory barrier to the blas_lock implementation for Linux Martin Kroeker 2020-08-09 19:17:04 +0200
  • 64259d521a
    Fix use of unallocated array in workspace query and wrong type of argument to xSCAL Martin Kroeker 2020-08-09 13:02:27 +0200
  • 6f5ca44c1a
    Expand TAU array as SGEMQR/DGEMQR read elements 2 and 3 Martin Kroeker 2020-08-09 12:59:20 +0200
  • d28b3f2776
    Create Jenkinsfile for OSUOSL PowerCI Martin Kroeker 2020-08-08 18:05:20 +0200
  • ba3f7b3acf
    Merge pull request #2761 from RajalakshmiSR/Makefile_err Martin Kroeker 2020-08-08 12:20:04 +0200
  • 475b5c95b9
    Remove extra symbol in Makefile Rajalakshmi Srinivasaraghavan 2020-08-07 15:27:44 -0500
  • cd60080d4a
    Merge pull request #2758 from martin-frbg/undef_shift Martin Kroeker 2020-08-03 23:30:26 +0200
  • 4847bfdddd
    Merge pull request #2757 from martin-frbg/cmake64 Martin Kroeker 2020-08-02 23:05:21 +0200
  • 81dcfdcf39
    Multiply by 2 instead of left-shifting a potentially negative number Martin Kroeker 2020-08-02 18:29:56 +0200
  • 0ef4b3f1f2
    Multiply instead of doing a left shift of a potentially negative number Martin Kroeker 2020-08-02 18:27:40 +0200
  • aa53a8a5cb
    Multiply by two instead of left-shifting one place Martin Kroeker 2020-08-02 18:25:09 +0200
  • aa3a1e7d8c
    Multiply by two rather than left shift by one place Martin Kroeker 2020-08-02 18:22:31 +0200
  • aaf1a17168
    Apply current library name suffix Martin Kroeker 2020-08-02 17:58:33 +0200
  • 53add6a80d
    Apply library name suffix to openblas if any Martin Kroeker 2020-08-02 17:57:12 +0200
  • 9eb897cc01
    Merge pull request #75 from xianyi/develop Martin Kroeker 2020-08-02 17:50:06 +0200
  • 7cead56258
    Merge pull request #2753 from martin-frbg/issue2751 Martin Kroeker 2020-08-02 15:32:46 +0200
  • 6794ac3415
    Add SYMBOLPREFIX and/or -SUFFIX to cblas.h if needed Martin Kroeker 2020-08-02 11:20:08 +0200
  • ecf4b9e0fc
    Improve substitution rules for SYMBOLPREFIX and -SUFFIX addition Martin Kroeker 2020-08-01 17:06:03 +0200
  • dfe5d09641
    Merge pull request #2756 from martin-frbg/issue2755 Martin Kroeker 2020-08-01 15:19:02 +0200
  • 60cd5e55fc
    Protect against inadvertent activation of USE_CUDA Martin Kroeker 2020-08-01 12:31:39 +0200
  • da9e2a7ada
    Add SYMBOLPREFIX and/or SYMBOLSUFFIX to cblas prototypes Martin Kroeker 2020-07-31 16:03:33 +0200
  • c88cbc5e0d
    Merge pull request #2752 from kadler/cpuid_aix Martin Kroeker 2020-07-31 12:52:24 +0200
  • 589c74aed3
    Use systemcfg APIs for CPU detection on AIX Kevin Adler 2020-07-30 20:52:16 -0500
  • 104aa678b0
    Fix inadvertent version number reversal to 0.3.9.dev caused by #2710 Martin Kroeker 2020-07-30 11:40:52 +0200
  • c6b48e0394
    Merge pull request #2749 from martin-frbg/make_ppc Martin Kroeker 2020-07-30 11:35:53 +0200
  • 4927251298
    Merge pull request #2750 from RajalakshmiSR/dgemv_p10 Martin Kroeker 2020-07-30 10:13:19 +0200
  • f77b6a83f4
    dgemv optimization for POWER10 Rajalakshmi Srinivasaraghavan 2020-07-29 18:59:32 -0500
  • 39724e8128
    Separate OpenMP handling and allow compilation of Power9 code with older gcc Martin Kroeker 2020-07-30 01:14:08 +0200
  • 525db5401c
    Merge pull request #74 from xianyi/develop Martin Kroeker 2020-07-30 01:04:09 +0200
  • cb097beba2
    Merge pull request #2741 from martin-frbg/issue2739 Martin Kroeker 2020-07-29 10:01:14 +0200
  • 7c02f4b1f7
    Merge pull request #2744 from martin-frbg/issue2738 Martin Kroeker 2020-07-28 19:32:04 +0200
  • 383262035d
    Merge pull request #2740 from RajalakshmiSR/clang-power Martin Kroeker 2020-07-28 18:15:25 +0200
  • 5fa581c87e
    Put hint to use git develop rather than master branch in README Martin Kroeker 2020-07-28 14:22:41 +0000
  • 12918358aa
    Add AMD Renoir/Matisse and preliminary support for Zen3 as Zen2 Martin Kroeker 2020-07-28 13:53:17 +0000
  • 200f5c44cc
    Add AMD Renoir models and preliminary support for ZEN3 as ZEN2 Martin Kroeker 2020-07-28 13:45:23 +0000
  • c4176105d1
    Fix accidental deletion Martin Kroeker 2020-07-28 10:08:41 +0000
  • ba27936ceb
    Add cpuid detection of AMD Zen2 Matisse and Renoir Martin Kroeker 2020-07-28 09:03:52 +0000
  • afdca268ab
    Add AMD Matisse and Renoir Zen2 variants Martin Kroeker 2020-07-28 09:00:12 +0000
  • 64e2e4aaf3
    missing braces Martin Kroeker 2020-07-27 20:19:22 +0000
  • 921ec4e9e2
    Adjust A53 SGEMM parameters to reflect move to 8x8 kernel Martin Kroeker 2020-07-27 19:54:46 +0000
  • d557584b71
    Fix compilation issues with clang on POWER Rajalakshmi Srinivasaraghavan 2020-07-27 14:11:07 -0500
  • a4ceb1ade9
    Merge pull request #2737 from ashwinyes/add_thunderx3_target Martin Kroeker 2020-07-27 15:19:47 +0200
  • 4e1be0e481
    ARM64: Add THUNDERX3T110 Target Ashwin Sekhar T K 2020-06-11 04:12:49 -0700
  • 49b83e00b7
    Merge pull request #2735 from martin-frbg/move_potrf Martin Kroeker 2020-07-26 19:54:11 +0200
  • 769ed9ffad
    Merge pull request #2734 from RajalakshmiSR/p10_fix Martin Kroeker 2020-07-25 09:02:32 +0200
  • f194ad59e1
    Use _Atomic instead of volatile where available (file moved from ../getrf) Martin Kroeker 2020-07-25 08:52:24 +0200
  • 4fda217f99
    Delete potrf_parallel.c (moving it to ../potrf) Martin Kroeker 2020-07-25 06:42:39 +0000
  • 9be2688c78
    Fix to store results in correct order for POWER10 GEMM kernels Rajalakshmi Srinivasaraghavan 2020-07-24 23:08:11 -0500
  • 6a2a60038c
    Merge pull request #2720 from martin-frbg/issue2694 Martin Kroeker 2020-07-24 23:19:45 +0200
  • 251a09ec90
    Typo fix Martin Kroeker 2020-07-24 16:04:58 +0000
  • 95d37e1575
    Regroup the 32 and 64bit sections and restore 64bit CAXPY Martin Kroeker 2020-07-24 10:13:46 +0000
  • 3523bb778e
    Merge pull request #2721 from martin-frbg/p8align Martin Kroeker 2020-07-24 11:06:20 +0200
  • a50d0e29c8
    Merge pull request #2731 from martin-frbg/pgippc Martin Kroeker 2020-07-24 11:05:16 +0200
  • bf1f0734ff
    Use OPENBLAS_MAKE_COMPLEX_FLOAT on PPC only Martin Kroeker 2020-07-23 20:40:13 +0000
  • ca3561cab9
    Add ifdefs around call to altivec microkernel Martin Kroeker 2020-07-23 18:30:42 +0000
  • 21072e502a
    Typo fix Martin Kroeker 2020-07-23 17:34:56 +0000
  • 7c6e56b5df
    Rewrite assignment to complex for better portability Martin Kroeker 2020-07-23 17:10:59 +0200
  • 661c6bfa5a
    Exclude altivec code paths if the compiler does not support them Martin Kroeker 2020-07-23 17:08:20 +0200
  • 9796e552ea
    Avoid undefining NAME,CNAME etc for pgcc as it makes it ignore the new definitions Martin Kroeker 2020-07-23 17:03:28 +0200
  • d6b6e5ccd7
    Merge pull request #73 from xianyi/develop Martin Kroeker 2020-07-23 16:59:06 +0200
  • 349b722d8d
    Merge pull request #2729 from martin-frbg/issue2728 Martin Kroeker 2020-07-22 22:45:57 +0200
  • 6c33764ca4
    Unify BUFFER_SIZE settings for x86_64 again to fix potentially fatal mismatch in DYNAMIC_ARCH builds Martin Kroeker 2020-07-22 17:30:55 +0000
  • d1b9613fd4
    Merge pull request #2727 from wyphan/develop Martin Kroeker 2020-07-21 17:06:53 +0200
  • 3cfc74b1a0
    Merge pull request #2726 from martin-frbg/2725-2 Martin Kroeker 2020-07-21 16:42:06 +0200
  • 9ae154ba89
    Patch for building on Summit Wileam Phan 2020-07-20 23:30:28 -0400
  • 9e21a100e3
    Add trivial check for stdatomic.h Martin Kroeker 2020-07-20 22:52:09 +0000
  • 31d30312dc
    Merge pull request #72 from xianyi/develop Martin Kroeker 2020-07-21 00:49:12 +0200
  • fcfb7ffafb
    Merge pull request #2725 from martin-frbg/ccheck_c11 Martin Kroeker 2020-07-18 23:08:08 +0200
  • bbe119ee3b
    Update conditional for atomics to use HAVE_C11 Martin Kroeker 2020-07-18 17:19:59 +0000