Commit Graph

  • cb9dc36dd5 Update CONTRIBUTORS.md Marius Hillenbrand 2020-05-12 16:14:00 +0200
  • 1b0b4349a1 s390x/Z14: Change register blocking for SGEMM to 16x4 Marius Hillenbrand 2020-05-12 15:06:38 +0200
  • 71b6eaf459 s390x: Use new sgemm kernel also for strmm on Z14 and newer Marius Hillenbrand 2020-05-12 14:40:30 +0200
  • 43c0d4f312 s390x: Add vectorized sgemm kernel for Z14 and newer Marius Hillenbrand 2020-05-12 14:13:54 +0200
  • d7c1677c20 Update CONTRIBUTORS.md, adding myself Marius Hillenbrand 2020-05-12 11:09:28 +0200
  • 0dbe61a612 s390x: choose SIMD kernels at run-time based on OS and compiler support Marius Hillenbrand 2020-05-11 13:00:10 +0200
  • 62cf391cbb s390x: only build kernels supported by gcc with dynamic arch support Marius Hillenbrand 2020-05-11 18:37:04 +0200
  • 8c338616f9 s390x: gate dynamic arch detection on gcc version and add generic Marius Hillenbrand 2020-05-11 12:37:21 +0200
  • f94c53ec0a Merge pull request #2612 from RajalakshmiSR/testshgemm Martin Kroeker 2020-05-12 08:34:02 +0200
  • 8efba9b7c0 Improve shgemm test Rajalakshmi Srinivasaraghavan 2020-05-11 17:15:10 -0500
  • 4fffa556d8 Merge pull request #2611 from RajalakshmiSR/bench_half Martin Kroeker 2020-05-11 21:08:41 +0200
  • ce90e2bd3f Include shgemm in benchtest Rajalakshmi Srinivasaraghavan 2020-05-11 09:57:46 -0500
  • 948b6712ba Merge pull request #2610 from martin-frbg/issue2552-3 Martin Kroeker 2020-05-10 13:10:31 +0200
  • 2271c3506b Work around excessive LAPACK test failures on Skylake-X Martin Kroeker 2020-05-09 23:49:18 +0200
  • db00b21445 Merge pull request #2609 from martin-frbg/issue2552-2 Martin Kroeker 2020-05-09 21:33:02 +0200
  • 58d26b4448 Correct ifort options Martin Kroeker 2020-05-09 17:15:36 +0200
  • 8e47d14053 Merge pull request #2608 from martin-frbg/issue2604 Martin Kroeker 2020-05-09 16:36:14 +0200
  • cd10b35fe9 Handle trailing spaces and empty condition variables Martin Kroeker 2020-05-09 13:42:33 +0200
  • 9472dd99cd Merge pull request #57 from xianyi/develop Martin Kroeker 2020-05-09 13:20:44 +0200
  • 7181665452 Merge pull request #2605 from RajalakshmiSR/cmake-power Martin Kroeker 2020-05-09 11:29:28 +0200
  • bd9ff820bc Fix cmake compilation issue - POWER9 Rajalakshmi Srinivasaraghavan 2020-05-08 20:31:56 -0500
  • 63e45def70 Merge pull request #2603 from martin-frbg/issue2552 Martin Kroeker 2020-05-08 22:08:39 +0200
  • ec0f228632 Add FFLAGS_DRV to the generated make.inc to fix lapack-test on x86_64 with icc/ifort Martin Kroeker 2020-05-08 18:06:12 +0200
  • 90e2941c61 Merge pull request #56 from xianyi/develop Martin Kroeker 2020-05-07 22:43:48 +0200
  • 10d5f3c87b Merge pull request #2602 from ashwinyes/thunderx2_develop Martin Kroeker 2020-05-07 22:06:41 +0200
  • 8353cb245a ARM64: Improve DAXPY for ThunderX2 Ashwin Sekhar T K 2020-05-07 09:14:05 -0700
  • ec2dd7b875 Merge pull request #2601 from martin-frbg/issue818 Martin Kroeker 2020-05-07 10:12:33 +0200
  • 4e82eb9f8a Undefine ASMNAME/NAME/CNAME before defining them Martin Kroeker 2020-05-07 00:31:32 +0200
  • 61300bb735 Merge pull request #55 from xianyi/develop Martin Kroeker 2020-05-07 00:27:14 +0200
  • 33e9b12464 Merge pull request #2597 from martin-frbg/appleclang Martin Kroeker 2020-05-05 13:55:08 +0200
  • 90dba9f716 Duplicate earlier Clang 9.0.0 workaround for corresponding Apple Clang version Martin Kroeker 2020-05-05 10:44:50 +0200
  • 4d0fd365a9 Update common_x86_64.h Martin Kroeker 2020-05-02 20:29:25 +0200
  • 4abb651af1 fix format specifier for unsigned Martin Kroeker 2020-05-02 16:10:49 +0200
  • b5d3e46e69 more debugging Martin Kroeker 2020-05-02 15:21:13 +0200
  • ccdf81ecc3 and back to unsigned to run another test... Martin Kroeker 2020-05-02 14:22:32 +0200
  • 20f2f6fc84 revert last change, blas_quickdivide returns a signed int again Martin Kroeker 2020-05-01 21:12:11 +0200
  • 6b96e6dfad make blas_quickdivide actually return unsigned (to placate clang) Martin Kroeker 2020-05-01 16:01:42 +0200
  • 94487c02db Delete extra semicolon after brace to make clang happy Martin Kroeker 2020-05-01 15:56:17 +0200
  • c3c00380da Delete spurious copy of common_param.h Martin Kroeker 2020-05-01 15:34:56 +0200
  • 2de3fff4f9 Move some declarations for pre-C99 compatibility Martin Kroeker 2020-05-01 15:25:32 +0200
  • 424d551e01 Merge pull request #53 from xianyi/develop Martin Kroeker 2020-05-01 15:18:46 +0200
  • 596f5df9e8 Merge pull request #2591 from RajalakshmiSR/testhalf Martin Kroeker 2020-05-01 09:59:39 +0200
  • 5dd14e3d48 Make building the bfloat16 functions conditional on option BUILD_HALF (#2590) Martin Kroeker 2020-05-01 09:58:30 +0200
  • 924cc7e588 typo fix Martin Kroeker 2020-04-29 22:11:42 +0200
  • 4297e2ed84 fix shgemm parameter references in arm64 branch Martin Kroeker 2020-04-29 22:09:23 +0200
  • a54e35e780 Merge pull request #2586 from martin-frbg/miscfixes Martin Kroeker 2020-04-29 22:01:41 +0200
  • 564b0d39ef Add test for shgemm Rajalakshmi Srinivasaraghavan 2020-04-29 13:40:34 -0500
  • 254a934b57 ifdef another group of shgemm parameters Martin Kroeker 2020-04-29 20:25:33 +0200
  • 9acf45c675 Fix overlooked shgemm parameters Martin Kroeker 2020-04-29 19:25:13 +0200
  • 8d4042d897 Make shgemm parameters conditional on BUILD_HALF Martin Kroeker 2020-04-29 18:46:16 +0200
  • 33059ad1de make bfloat16 functions conditional on BUILD_HALF Martin Kroeker 2020-04-29 18:31:24 +0200
  • 1377810961 fix endif Martin Kroeker 2020-04-29 18:30:41 +0200
  • b2f6f76a5a Pass BUILD_HALF as a compiler define for dynamic_arch builds Martin Kroeker 2020-04-29 18:30:10 +0200
  • 84e5b0c4f8 typo Martin Kroeker 2020-04-29 16:07:27 +0200
  • 75e0495a75 Make shgemm kernels conditional on BUILD_HALF Martin Kroeker 2020-04-29 15:58:59 +0200
  • fd267b58b2 make shgemm kernels conditional on BUILD_HALF Martin Kroeker 2020-04-29 14:48:37 +0200
  • f881c697fb pass the BUILD_HALF option to gensymbol Martin Kroeker 2020-04-29 14:47:09 +0200
  • 48e26bc317 make bfloat16 functions conditional on BUILD_HALF Martin Kroeker 2020-04-29 14:46:13 +0200
  • 34e64d57ab make shgemm functions conditional on BUILD_HALF Martin Kroeker 2020-04-29 14:44:53 +0200
  • 45881fab58 make shgemm functions conditional on BUILD_HALF Martin Kroeker 2020-04-29 14:44:07 +0200
  • 7bf1865656 make building the bfloat16 functions conditional on BUILD_HALF Martin Kroeker 2020-04-29 14:42:35 +0200
  • 3c37071eef make bfloat16 kernels conditional on BUILD_HALF Martin Kroeker 2020-04-29 14:40:17 +0200
  • 5d58b11101 Merge pull request #52 from xianyi/develop Martin Kroeker 2020-04-29 14:36:15 +0200
  • d394d4e677 Merge pull request #2585 from martin-frbg/mips64fix Martin Kroeker 2020-04-28 19:47:55 +0200
  • 9d3a317abc Refs #2587 Fix typos. Xianyi Zhang 2020-04-29 00:19:19 +0800
  • 92372c70fc Fix gemm interface bug for small matrix. Xianyi Zhang 2020-04-28 23:15:20 +0800
  • 43bef4aaac Add alpha=1.0 beta=0.0 for small gemm. Xianyi Zhang 2020-04-28 22:35:36 +0800
  • aae6af94bb Add small marix optimization kernel interface. Xianyi Zhang 2020-04-28 19:01:36 +0800
  • f4248af26e Fix compiler warnings Martin Kroeker 2020-04-28 10:43:12 +0200
  • 2d89603e9d Increase BUFFER_SIZE on mips64 to match SGEMM parameters Martin Kroeker 2020-04-28 10:40:40 +0200
  • 26bc15258a Merge pull request #51 from xianyi/develop Martin Kroeker 2020-04-28 10:38:50 +0200
  • 141998dce2 Merge pull request #2584 from martin-frbg/issue2583 Martin Kroeker 2020-04-28 10:35:12 +0200
  • 3bd56846bb Silence a debug message Martin Kroeker 2020-04-27 16:27:09 +0200
  • e7bbdfdf84 Have CMAKE parse conditional lines in KERNEL files Martin Kroeker 2020-04-27 15:20:03 +0200
  • b6795db731 Merge pull request #2582 from martin-frbg/mips32fix Martin Kroeker 2020-04-27 09:18:34 +0200
  • 5e0dbf8dfe Increase default BUFFER_SIZE to accomodate SGEMM parameters Martin Kroeker 2020-04-26 22:21:05 +0200
  • 955d73127f Merge pull request #50 from xianyi/develop Martin Kroeker 2020-04-26 22:17:56 +0200
  • a8c1bea7ae Merge pull request #2581 from martin-frbg/raji Martin Kroeker 2020-04-25 19:57:10 +0200
  • e43b49e064 Drop the set -e from travis scripts Martin Kroeker 2020-04-25 16:18:54 +0200
  • 3e28db7f38 Update CONTRIBUTORS.md Martin Kroeker 2020-04-25 13:51:44 +0200
  • 4b69ee31af Merge pull request #2580 from martin-frbg/issue2538-3 Martin Kroeker 2020-04-25 00:28:18 +0200
  • 03ff213c51 Increase POWER8 ZGEMM_R and use same R values for POWER9 Martin Kroeker 2020-04-24 21:46:54 +0200
  • 299d1c8de0 Merge pull request #2578 from martin-frbg/issue2576 Martin Kroeker 2020-04-24 14:32:46 +0200
  • 70869d571f Quote include paths for getarch to protect any embedded spaces Martin Kroeker 2020-04-24 10:30:44 +0200
  • b27fdd08aa Quote include paths for getarch to protect any embedded spaces Martin Kroeker 2020-04-24 10:23:31 +0200
  • cba87222b2 Merge pull request #49 from xianyi/develop Martin Kroeker 2020-04-24 10:21:48 +0200
  • f80dd2151e xcode 11.4.1 for homebrew ? Martin Kroeker 2020-04-23 14:31:09 +0200
  • 4412ee1754 Switch homebrew build env to new xcode 11.4 Martin Kroeker 2020-04-23 10:54:46 +0200
  • f6104b68c1 Merge pull request #2571 from martin-frbg/issue2299 Martin Kroeker 2020-04-22 18:27:13 +0200
  • 84f2c71e93 Merge pull request #2573 from martin-frbg/issue2572 Martin Kroeker 2020-04-22 15:04:49 +0200
  • 06208c8d01 Limit this fix to ELFv2 builds Martin Kroeker 2020-04-22 14:16:40 +0200
  • c90b28dee6 Export ELF_VERSION for use in powerpc kernel configurations Martin Kroeker 2020-04-22 14:14:20 +0200
  • 6275b43918 Avoid duplicate printout of byte order and report ELF_VERSION Martin Kroeker 2020-04-22 14:12:27 +0200
  • 2db5178e2d enable cblas interfaces to GEMM3M in CMAKE builds Martin Kroeker 2020-04-22 11:01:28 +0200
  • 57549f5c92 Merge pull request #2569 from martin-frbg/issue2472-2 Martin Kroeker 2020-04-21 20:26:53 +0200
  • f5c4c28b98 Work around POWER8BE bugs on FreeBSD (ELFv2) Martin Kroeker 2020-04-21 17:17:17 +0200
  • 239282d5e2 Use CMAKE_SHARED_LINKER_FLAGS to pass MSVC linker option Martin Kroeker 2020-04-20 22:30:51 +0200
  • 568674477c Merge pull request #48 from xianyi/develop Martin Kroeker 2020-04-20 21:51:59 +0200
  • fa42588e1f Merge pull request #2565 from martin-frbg/mips24k Martin Kroeker 2020-04-20 17:13:53 +0200
  • 8a6d26458b Merge pull request #2559 from RajalakshmiSR/shgemm Martin Kroeker 2020-04-19 22:09:55 +0200