Commit Graph

  • a6f45ab7fe
    Reduce thread count if necessary Martin Kroeker 2020-04-10 12:31:12 +0200
  • d8bdd4f236
    revert previous, num_buffers is not a makefile variable Martin Kroeker 2020-04-09 23:25:33 +0200
  • ff23bd09f4
    Update gemm.c Martin Kroeker 2020-04-09 23:24:21 +0200
  • 1d12a33a9d
    print num_buffers at end of build just to be sure Martin Kroeker 2020-04-09 23:09:34 +0200
  • c00b960009
    Update .drone.yml Martin Kroeker 2020-04-09 21:38:56 +0200
  • 417eb28517
    Update .drone.yml Martin Kroeker 2020-04-09 21:38:17 +0200
  • 54973cca1b
    Update .drone.yml Martin Kroeker 2020-04-09 20:35:27 +0200
  • 5d2cf4ec19
    Update gemm.c Martin Kroeker 2020-04-09 19:13:56 +0200
  • 4ffe9d788f
    Update .drone.yml Martin Kroeker 2020-04-09 18:04:12 +0200
  • f10c9a99a3
    Delete azure-pipelines.yml Martin Kroeker 2020-04-09 16:52:00 +0200
  • b7fa8fe694
    Delete appveyor.yml Martin Kroeker 2020-04-09 16:51:37 +0200
  • 71b8e284e6
    Delete .travis.yml Martin Kroeker 2020-04-09 16:51:20 +0200
  • 8290b6787f
    Update .drone.yml Martin Kroeker 2020-04-09 16:50:49 +0200
  • 67de70813c
    remove thread count from pragma as drone.io HW varies Martin Kroeker 2020-04-09 15:47:52 +0200
  • 35036d9b61
    reduce NUM_PARALLEL to 1 Martin Kroeker 2020-04-09 14:43:59 +0200
  • ce95853101
    limit dgemm benchmark to just 10,10,0 Martin Kroeker 2020-04-09 14:39:04 +0200
  • 8d07cf9b67
    Fix compilation problem on loongson platform gxw 2020-04-09 19:25:13 +0800
  • 11528f3afe
    Update gemm.c Martin Kroeker 2020-04-08 22:19:18 +0200
  • 7b4773b24d
    Add API to set thread affinity on Linux. Sharvil Nanavati 2020-04-08 12:47:41 -0700
  • 9ed53824d9
    Update gemm.c Martin Kroeker 2020-04-08 20:26:58 +0200
  • 3778b91657
    Update gemm.c Martin Kroeker 2020-04-08 17:25:28 +0200
  • 626e98028d
    Update gemm.c Martin Kroeker 2020-04-08 15:24:22 +0200
  • aa170123e6
    fix accidental deletion Martin Kroeker 2020-04-08 14:58:37 +0200
  • 353e996d1d
    Merge branch 'develop' into dronethunder2 Martin Kroeker 2020-04-08 14:45:32 +0200
  • bc792904ea
    use modified gemm benchmark to trigger race condition Martin Kroeker 2020-04-08 14:43:19 +0200
  • d8735bb66a
    parallelize gemm benchmark to trigger races Martin Kroeker 2020-04-08 14:41:21 +0200
  • 69f277f8ee
    Add another memory barrier for ARM and a multicore test run on ThunderX to help detect such issues (#2544) Martin Kroeker 2020-04-08 11:04:51 +0200
  • 0e0681f535
    Experimental barrier Martin Kroeker 2020-04-08 09:24:03 +0200
  • 29a50dd048
    increase nthreads to 96 Martin Kroeker 2020-04-08 01:04:40 +0200
  • aa8269d472
    Add g++ as dependency for dgemm_tester Martin Kroeker 2020-04-08 00:00:15 +0200
  • e1ec040b95
    Try dgemm_tester instead of lapack-test Martin Kroeker 2020-04-07 23:50:41 +0200
  • 9a4959997d
    Add python dependency for lapack test Martin Kroeker 2020-04-07 22:36:16 +0200
  • 8639c8a683
    Try to get an all-core lapack test to identify barrier issues Martin Kroeker 2020-04-07 21:48:38 +0200
  • 330d6b1ee4
    Update common_param.h Martin Kroeker 2020-04-07 00:10:14 +0200
  • fd99b3e057
    workaround for sign change warning Martin Kroeker 2020-04-06 23:15:13 +0200
  • aab5380aa8
    typo fix Martin Kroeker 2020-04-06 22:14:44 +0200
  • 6f2e18d5e5
    Comment out SGEMM_R for POWER8 again, try if declaring P and Q as UL is sufficient to avoid int overflow Martin Kroeker 2020-04-06 20:51:14 +0200
  • 3a6d51c2fd
    Merge pull request #44 from xianyi/develop Martin Kroeker 2020-04-04 22:48:53 +0200
  • 1c7771df96
    Merge pull request #43 from martin-frbg/revert-42-z12ci Martin Kroeker 2020-04-04 22:46:58 +0200
  • a56c9ec52a
    Revert "Add IBM Z to Travis configuration (#42)" Martin Kroeker 2020-04-04 22:45:01 +0200
  • 66caf61a2c
    Try predefining GEMM_R for POWER8 Martin Kroeker 2020-04-04 19:31:38 +0200
  • 188e9239a4
    Increase BUFFER_SIZE and remove remnants of arm64 source Martin Kroeker 2020-04-04 15:27:32 +0200
  • 0b8d69f7ae
    Restore correct version Martin Kroeker 2020-04-04 00:00:10 +0200
  • 4ae6d1a01b
    Add a Z13 build to the Travis configuration (#2542) Martin Kroeker 2020-04-03 16:02:11 +0200
  • 7972beb375
    Add IBM Z to Travis configuration (#42) Martin Kroeker 2020-04-03 15:59:18 +0200
  • e19d106225
    Update .travis.yml Martin Kroeker 2020-04-03 14:43:30 +0200
  • 41e802443a
    libname: treat FreeBSD and DragonFly like linux and sunos Baptiste Daroussin 2020-04-03 06:20:42 +0200
  • 07d59c0455
    print the current values when buffer_size is too small Martin Kroeker 2020-04-02 23:27:10 +0200
  • fdcf50f999
    Add arch entry for s390x Martin Kroeker 2020-04-02 22:24:43 +0200
  • 4666cc4422
    Update .travis.yml Martin Kroeker 2020-04-02 21:38:14 +0200
  • b474c65db8
    Add IBM Z to Travis configuration Martin Kroeker 2020-04-02 19:54:34 +0200
  • f03b667dd2
    Increase BUFFER_SIZE for POWER8/9 Martin Kroeker 2020-04-02 18:20:27 +0200
  • 053712eb1f
    Increase BUFFER_SIZE Martin Kroeker 2020-04-02 15:12:50 +0200
  • db6db050de
    Increase BUFFER_SIZE for POWER8/9 Martin Kroeker 2020-04-02 15:11:53 +0200
  • b21ca5c96a
    Increase BUFFER_SIZE for POWER8/9 Martin Kroeker 2020-04-02 14:33:49 +0200
  • cab855d56e
    Increase default BUFFER_SIZE for Haswell, Zen and SKX Martin Kroeker 2020-04-02 14:26:53 +0200
  • df989d7a52
    Add compile-time guard for adequate buffersize Martin Kroeker 2020-04-02 10:58:05 +0200
  • 5e3e657caa
    Make BUFFER_SIZE configurable and increase its default value for TSV110 and EMAG8180 Martin Kroeker 2020-04-02 10:38:35 +0200
  • 7bd8624b79
    Merge pull request #41 from xianyi/develop Martin Kroeker 2020-04-02 10:32:19 +0200
  • 806f89166e
    Make ARMV7 compile with xcode and add a CI job for it (#2537) Martin Kroeker 2020-04-02 10:30:37 +0200
  • 41b470244e
    restore quiet_make Martin Kroeker 2020-04-02 02:04:31 +0200
  • 07cb1097ff
    Make local labels in macro compatible with the xcode assembler Martin Kroeker 2020-04-02 00:44:28 +0200
  • 62cf7a82f1
    Update .travis.yml Martin Kroeker 2020-04-01 23:08:56 +0200
  • f0889ab504
    Update .travis.yml Martin Kroeker 2020-04-01 21:49:14 +0200
  • ac1d704f57
    Add no-thumb option for ARMV7 IOS to get it to accept DMB ISH Martin Kroeker 2020-04-01 20:09:34 +0200
  • f059e614eb
    Merge pull request #2536 from martin-frbg/recurs Martin Kroeker 2020-04-01 20:00:13 +0200
  • abfc80a5e2
    thread_local appears to be unavailable on ARMV7 iOS Martin Kroeker 2020-04-01 17:53:40 +0200
  • 2d7209fdb5
    Update .travis.yml Martin Kroeker 2020-04-01 16:22:01 +0200
  • e13b6773ee
    ifort and pgfort need "recursive" for safe compilation of LAPACK as well Martin Kroeker 2020-04-01 15:39:16 +0200
  • a05243d0f2
    ifort and pgfort need "recursive" for compiling LAPACK as well Martin Kroeker 2020-04-01 15:38:07 +0200
  • 2977f652cc
    Update .travis.yml Martin Kroeker 2020-04-01 14:27:09 +0200
  • 798322bf0b
    Update .travis.yml Martin Kroeker 2020-04-01 09:47:20 +0200
  • 1becf4ef5b
    Add an ARMV7 iOS build Martin Kroeker 2020-03-31 22:52:05 +0200
  • c6af9bbb32
    Merge pull request #2534 from martin-frbg/issue2496 Martin Kroeker 2020-03-31 20:53:13 +0200
  • 144be81ca1
    fix initialization to zero in the NEON SGEMM_BETA kernel as well Martin Kroeker 2020-03-31 16:53:56 +0200
  • 07cdd5d05c
    Fix zero initialization for beta=0 case Martin Kroeker 2020-03-31 00:21:02 +0200
  • 567d2760e6
    Merge pull request #2520 from wjc404/develop Martin Kroeker 2020-03-30 20:15:59 +0200
  • 018bb3e433
    Merge pull request #2533 from martin-frbg/gemmdirect2 Martin Kroeker 2020-03-30 20:15:37 +0200
  • 79fd006c58
    Expose the support_avx512 function provided in dynamic.c Martin Kroeker 2020-03-26 21:25:39 +0100
  • 8229c163b7
    Use runtime check for AVX512 (sgemm_direct) capability when using DYNAMIC_ARCH Martin Kroeker 2020-03-26 21:12:56 +0100
  • a986d42ea6
    Merge pull request #39 from xianyi/develop Martin Kroeker 2020-03-26 21:06:51 +0100
  • 06ef74c84f
    Do not deploy import libraries on Windows when NO_STATIC=1 Harmen Stoppels 2020-03-24 16:45:52 +0100
  • b6a948fbee
    Merge pull request #2530 from martin-frbg/dynmsg Martin Kroeker 2020-03-24 15:44:46 +0100
  • 0cc352417e
    Merge pull request #2529 from shengyang-3390/dev1 Martin Kroeker 2020-03-24 15:44:27 +0100
  • fe47dc8673
    Add message highlighting minimum target choice at end of DYNAMIC_ARCH builds Martin Kroeker 2020-03-23 19:35:51 +0100
  • 9f67d03d3b
    Merge pull request #2527 from martin-frbg/gemmdirect Martin Kroeker 2020-03-23 12:47:19 +0100
  • 50f4fb2fbd
    add ctest for drotm and modified ctest for drot. make sure that test cases cover all code path when kernel uses looping unrolling. shengyang 2020-03-21 15:58:21 +0800
  • 6a14b34c20
    Avoid calling DIRECT codepath in DYNAMIC_ARCH on non-SKX Martin Kroeker 2020-03-22 14:33:16 +0100
  • 8c7c1395da
    Merge pull request #2521 from martin-frbg/cm-avx512 Martin Kroeker 2020-03-22 01:03:42 +0100
  • 5f6f6a2c7d
    Merge pull request #2525 from andreas-schwab/develop Martin Kroeker 2020-03-21 18:47:48 +0100
  • 71cf2acdef
    Fix ARCHCONFIG for Neoverse-N1 Andreas Schwab 2020-03-21 17:33:33 +0100
  • 06e1062329
    add ctest for drotm and modified ctest for drot. make sure that test cases cover all code path when kernel uses looping unrolling. shengyang 2020-03-21 15:58:21 +0800
  • 1d9773b800
    Use proper extension on the avx512 testcase filename Martin Kroeker 2020-03-20 23:05:53 +0100
  • a46a8c4956
    Merge pull request #2518 from shengyang-3390/dev Martin Kroeker 2020-03-20 23:00:06 +0100
  • 7ae737e04c
    Merge pull request #2519 from martin-frbg/issue2472 Martin Kroeker 2020-03-20 22:57:44 +0100
  • 64daad4365
    Update param.h wjc404 2020-03-20 21:46:18 +0000
  • b8307768e2
    Add files via upload wjc404 2020-03-21 05:42:10 +0800
  • 6d54c94760
    Make ifort on Windows create lowercase symbols with appended underscore Martin Kroeker 2020-03-20 01:08:10 +0100
  • c0da205412
    Merge pull request #38 from xianyi/develop Martin Kroeker 2020-03-20 01:05:22 +0100
  • a06d78556d
    add ctest for srotm and modified ctest for srot. make sure that test cases cover all code path when kernel uses looping unrolling. shengyang 2020-03-18 14:17:32 +0800