Commit Graph

  • a35a436ff9 Update .drone.yml Martin Kroeker 2021-08-22 22:35:55 +0200
  • f0973d14ed need python3 for this test Martin Kroeker 2021-08-22 20:19:01 +0200
  • 29f13cc8b0 fix url Martin Kroeker 2021-08-22 18:20:13 +0200
  • ccccdc49b2 Update .drone.yml Martin Kroeker 2021-08-22 16:30:52 +0200
  • 90ee2302a8 fix formatting Martin Kroeker 2021-08-22 15:13:53 +0200
  • 97d802e3ed add testcase for external caller exceeding our thread limit Martin Kroeker 2021-08-22 15:05:08 +0200
  • c6c2a71fb7 Fix ctest.h to build using clang on windows Niyas Sait 2021-08-16 11:25:07 +0100
  • cdb5d2737e add support for building on windows/arm64 target Niyas Sait 2021-08-16 11:22:51 +0100
  • 13d411677f Add more OSX build jobs to Azure CI (#3338) Martin Kroeker 2021-08-15 00:17:23 +0200
  • 4c93f2e255 Update azure-pipelines.yml Martin Kroeker 2021-08-14 21:24:07 +0200
  • 1eef884fa3 Update azure-pipelines.yml Martin Kroeker 2021-08-14 19:38:57 +0200
  • 6c6651f20c Update azure-pipelines.yml Martin Kroeker 2021-08-14 19:28:09 +0200
  • f413ff46fa move IOS xbuilds from travis to azure Martin Kroeker 2021-08-14 18:48:17 +0200
  • d19af01f51 Update .drone.yml Martin Kroeker 2021-08-13 07:08:48 +0200
  • f9dba63c28 Small Matrix: skylakex: remove unnecessary b0 source files Wangyang Guo 2021-08-13 03:28:44 +0000
  • 989e6bbdd3 Small Matrix: reduce generic kernel source files Wangyang Guo 2021-08-13 03:17:38 +0000
  • 68b2b5038f Update .drone.yml Martin Kroeker 2021-08-13 01:44:45 +0200
  • a5a7892fa8 Update .drone.yml Martin Kroeker 2021-08-12 23:40:53 +0200
  • 2fb65d062b Update .drone.yml Martin Kroeker 2021-08-12 22:50:36 +0200
  • 3bd81e9b91 Update .drone.yml Martin Kroeker 2021-08-12 21:20:11 +0200
  • 0161aba5dc Update .drone.yml Martin Kroeker 2021-08-12 19:13:11 +0200
  • 3f021a1b7d try to force installation of a specific version of gcc Martin Kroeker 2021-08-12 16:03:41 +0200
  • 04255be948 Merge pull request #3344 from gxw-loongson/develop Martin Kroeker 2021-08-12 15:16:46 +0200
  • a7bc8ec1f1 Delete the macro instruction "li" and use "li.d" instead gxw 2021-08-10 16:42:57 +0800
  • 8cd2b32fef Merge pull request #3343 from cianciosa/develop Martin Kroeker 2021-08-12 01:28:18 +0200
  • 4c766cd11f Fix a small syntax error. A ( was accidently deleted. cianciosa 2021-08-11 12:08:34 -0400
  • c28560129f Check the total number of arguments passed insead of if the ARGV# is defined. This fixes a problem when compling openblas as a subproject of another code. cianciosa 2021-08-11 12:00:07 -0400
  • 6667aa5bc8 Update .drone.yml Martin Kroeker 2021-08-11 16:47:26 +0200
  • b9e4fb206d Merge pull request #3341 from RajalakshmiSR/dasump10 Martin Kroeker 2021-08-11 09:39:10 +0200
  • 3bdca029b2 Update .drone.yml Martin Kroeker 2021-08-11 09:28:53 +0200
  • b06880c2cd POWER10: Improving dasum performance Rajalakshmi Srinivasaraghavan 2021-08-10 22:06:04 -0500
  • b33002365f Update .drone.yml Martin Kroeker 2021-08-10 18:39:44 +0200
  • 3da6a5d7c3 Add mixed clang/gfortran build with cmake on OSX Martin Kroeker 2021-08-10 11:24:22 +0200
  • ea48bbac6b Update .drone.yml Martin Kroeker 2021-08-09 16:23:09 +0200
  • 3cbbb3a37f run blas-tester on ThunderX/Falkor Martin Kroeker 2021-08-09 15:11:15 +0200
  • fa71b9fea6 Check install step on OSX/gcc Martin Kroeker 2021-08-08 13:03:34 +0200
  • bb2916d1e2 Update azure-pipelines.yml Martin Kroeker 2021-08-07 22:23:10 +0200
  • 7d2cd3d80b Update azure-pipelines.yml Martin Kroeker 2021-08-07 18:45:28 +0200
  • e8e285511a set cmake build type to debug to ease register pressure for LLVM SKX build Martin Kroeker 2021-08-07 17:32:08 +0200
  • a0c6350f41 Add OSX build job with Homebrew OpenMP in a CMAKE build Martin Kroeker 2021-08-07 16:59:53 +0200
  • cbc583eb54 Merge pull request #3336 from martin-frbg/traviscom Zhang Xianyi 2021-08-05 19:13:19 +0800
  • e5ba7c3235 Disable all x86 jobs Martin Kroeker 2021-08-05 11:08:18 +0200
  • 435d84a7ce Merge pull request #3332 from martin-frbg/travisbadge Martin Kroeker 2021-08-05 09:36:59 +0200
  • 139f632ca4 Merge pull request #3334 from Guobing-Chen/BF16_gemm_full_kernel Martin Kroeker 2021-08-05 08:01:13 +0200
  • c17d6dacb2 Small Matrix: skip compile in unimplemented data type Wangyang Guo 2021-08-05 05:46:13 +0000
  • 44d0032f3b Small Matrix: skylakex: fix build error in old compiler Wangyang Guo 2021-08-05 04:43:47 +0000
  • 5d86becdae Add all SBGEMM kernels for IA AVX512-BF16 based platforms Chen, Guobing 2021-08-05 11:11:14 +0800
  • 76ea8db4da Small Matrix: enable by default for x86_64 arch Wangyang Guo 2021-08-05 02:57:58 +0000
  • aa50185647 Small Matrix: better handle with GEMM3M marco Wangyang Guo 2021-08-05 02:45:53 +0000
  • fee5abd84b Small Matrix: support cmake build Wangyang Guo 2021-08-04 08:50:15 +0000
  • 478d1086c1 Small Matrix: support DYNAMIC_ARCH build Wangyang Guo 2021-08-04 03:12:41 +0000
  • 93c8bafff5 Update Travis badge in README Martin Kroeker 2021-08-03 10:45:45 +0200
  • 6b58bca18b Small Matrix: disable low performance default kernel Wangyang Guo 2021-06-15 16:09:51 +0000
  • b5858c4472 Merge pull request #3330 from xianyi/issue3321 Martin Kroeker 2021-08-02 22:36:05 +0200
  • 898212efcd Actually add the message to the TLS section issue3321 Martin Kroeker 2021-08-02 14:50:14 +0200
  • 210a1584c5 Rebase source and edit TLS version of the message as well Martin Kroeker 2021-08-02 14:19:16 +0200
  • fa777f5517 Small Matrix: skylakex: add DGEMM_SMALL_M_PERMIT and tune for TN kernel Wangyang Guo 2021-06-02 14:55:54 +0000
  • 8592c21af4 Small Matrix: skylakex: dgemm nn: fix typo in idx load Wangyang Guo 2021-06-02 13:57:39 +0000
  • 3e79f6d89a Small Matrix: skylakex: add dgemm tn kernel Wangyang Guo 2021-06-02 13:56:40 +0000
  • 323d7da4f7 Small Matrix: skylakex: add dgemm tt kernel Wangyang Guo 2021-06-02 11:45:44 +0000
  • f57fc932ac Small Matrix: skylakex: add dgemm nt kernel Wangyang Guo 2021-06-01 14:23:56 +0000
  • 91ec21202b Small Matrix: skylakex: add dgemm nn kernel Wangyang Guo 2021-06-01 11:31:50 +0000
  • 72e070539c Small Matrix: skylakex: add sgemm tt kernel Wangyang Guo 2021-05-31 14:53:03 +0000
  • 02c6e764f2 Small Matrix: skylakex: add SGEMM_SMALL_M_PERMIT and tune for TN kernel Wangyang Guo 2021-05-27 11:26:49 +0000
  • 5dc7c3c8e5 Small Matrix: add GEMM_SMALL_MATRIX_PERMIT to tune small matrics case Wangyang Guo 2021-05-27 11:03:56 +0000
  • 642c393879 Small Matrix: skylakex: add sgemm tn kernel Wangyang Guo 2021-05-26 16:30:57 +0000
  • ae3f5c737c Small Matrix: skylakex: sgemm nt: optimize for M < 12 Wangyang Guo 2021-05-21 13:31:31 +0000
  • 0d72d75bf9 Small Matrix: skylakex: add sgemm nt kernel Wangyang Guo 2021-05-20 11:47:10 +0000
  • ca7682e3a3 Small Matrix: skylakex: sgemm nn: fix n6 conflicts with n4 Wangyang Guo 2021-05-20 11:24:31 +0000
  • 9967e61abb Small Matrix: skylakex: sgemm nn: fix error when beta not zero Wangyang Guo 2021-05-19 10:50:03 +0000
  • a87736346f Small Matrix: skylakex: sgemm nn: add n6 to improve performance Wangyang Guo 2021-05-13 10:16:54 +0000
  • 4c9d9940fd Small Matrix: skylakex: sgemm nn: reduce store 4 N at a time Wangyang Guo 2021-05-13 09:41:51 +0000
  • 13b32f69b7 Small Matrix: skylakex: sgemm nn: reduce store 4 M at a time Wangyang Guo 2021-05-12 17:08:18 +0000
  • 3d8c6d9607 Small Matrix: skylakex: sgemm nn: clean up unused code Wangyang Guo 2021-05-11 10:33:07 +0000
  • 49b61a3f30 Small Matrix: skylakex: sgemm_nn: optimize for M <= 8 Wangyang Guo 2021-05-11 10:24:10 +0000
  • f88470323b Optimize M < 16 using AVX512 mask Wangyang Guo 2021-05-08 15:59:14 +0000
  • 9186456a12 small matrix: SkylakeX: add SGEMM NN kernel Wangyang Guo 2021-05-08 10:45:10 +0000
  • 6022e5629c Refs #2587 fix small matrix c/zgemm bug. Xianyi Zhang 2020-08-28 22:36:36 +0800
  • 57ed58cefe Refs #2587 Add small matrix optimization reference kernel for c/zgemm. Xianyi Zhang 2020-08-28 21:00:54 +0800
  • 17d32a4a82 Change a1b0 gemm to b0 gemm. Xianyi Zhang 2020-08-28 07:55:27 +0800
  • 59cb5de46b Refs #2587 Fix typos. Xianyi Zhang 2020-04-29 00:19:19 +0800
  • 4271cfcc6f Fix gemm interface bug for small matrix. Xianyi Zhang 2020-04-28 23:15:20 +0800
  • be3349405d Add alpha=1.0 beta=0.0 for small gemm. Xianyi Zhang 2020-04-28 22:35:36 +0800
  • 0a2077901c Add small marix optimization kernel interface. Xianyi Zhang 2020-04-28 19:01:36 +0800
  • e6d6d3ee43 Merge pull request #3331 from gxw-loongson/develop Martin Kroeker 2021-08-02 07:21:46 +0200
  • 0b8f7c8c10 Add cmake support for LOONGARCH64 gxw 2021-08-02 10:00:41 +0800
  • f2a7a67f5a Improve the "tried to allocate too many buffers" error message Martin Kroeker 2021-07-31 17:23:40 +0200
  • e0e88f9edc Merge pull request #3329 from martin-frbg/issue3272 Martin Kroeker 2021-07-30 20:39:38 +0200
  • 5dc6aa74f0 Disable gfortran tree vectorizer to avoid gcc11+ miscompilation at O3 Martin Kroeker 2021-07-30 14:46:19 +0200
  • e78fbe4654 Disable gfortran tree vectorizer to avoid gcc11+ miscompilation at O3 Martin Kroeker 2021-07-30 14:44:54 +0200
  • b4f4ed378b Disable gfortran tree vectorizer to avoid gcc11+ miscompilation at O3 Martin Kroeker 2021-07-30 14:21:08 +0200
  • cbc41973fd Disable gfortran tree vectorizer to avoid gcc11+ miscompilation at O3 Martin Kroeker 2021-07-30 14:20:12 +0200
  • 34207bdf5b Fixed typos about LOONGARCH64 gxw 2021-07-30 18:11:12 +0800
  • 1b6db3dbba Merge pull request #3327 from h-vetinari/lapack597_redux Martin Kroeker 2021-07-28 23:04:02 +0200
  • f681553c6a Merge pull request #3326 from wattoc/develop Martin Kroeker 2021-07-28 23:03:37 +0200
  • afadeeba2a Merge pull request #3325 from gxw-loongson/develop Martin Kroeker 2021-07-28 23:03:15 +0200
  • 02d4a49761 Also make sure the `1` is INTEGER*4 for OMP_SET_NUM_THREADS Isuru Fernando 2021-07-15 04:54:33 -0500
  • 4d7dfe4845 Include Haiku in processor count checks Craig Watson 2021-07-27 09:00:30 +0000
  • af0a69f355 Add support for LOONGARCH64 gxw 2021-07-26 15:44:54 +0800
  • 5a2fe5bfb9 Merge pull request #3323 from martin-frbg/issue3322 Martin Kroeker 2021-07-23 22:46:02 +0200