Commit Graph

  • a4e56e0452 Merge pull request #4806 from Mousius/small-gemm Martin Kroeker 2024-07-25 21:50:04 +02:00
  • 90d93b38d5 deploy: 949a7f9393 martin-frbg 2024-07-25 17:13:39 +00:00
  • 949a7f9393 Merge pull request #4811 from yamazakimitsufumi/add_a64fx_to_dynamic_arch Martin Kroeker 2024-07-25 19:13:04 +02:00
  • 88caf02f62 Fix ambiguous error on Mac OS yamazaki-mitsufumi 2024-07-25 22:43:13 +09:00
  • b613754143 Update scal..c Martin Kroeker 2024-07-24 14:31:29 +02:00
  • 4140ac45d7 Merge pull request #4813 from martin-frbg/issue4812 Martin Kroeker 2024-07-23 21:35:06 +02:00
  • 0096482f03 fix incompatible definitions of MAXLOC Martin Kroeker 2024-07-23 15:01:26 +02:00
  • db47113038 deploy: ed82fd24fc martin-frbg 2024-07-23 12:55:28 +00:00
  • ed82fd24fc Merge pull request #4810 from martin-frbg/issue4805 Martin Kroeker 2024-07-23 14:54:55 +02:00
  • 821ef34635 Add A64FX to the list of CPUs supported by DYNAMIC_ARCH yamazaki-mitsufumi 2024-07-23 20:44:39 +09:00
  • 29f3e759b9 work around a gcc14.1 bug observed on Loongarch Martin Kroeker 2024-07-23 11:20:48 +02:00
  • f5d04318e3 Merge branch 'OpenMathLib:develop' into scalfixes Martin Kroeker 2024-07-21 13:43:43 +02:00
  • 73f8866ffb make NAN handling depend on DUMMY2 parameter Martin Kroeker 2024-07-21 13:42:47 +02:00
  • dfbc2348a8 fix NAN handling Martin Kroeker 2024-07-20 18:27:15 +02:00
  • c064319ecb fix alpha=NAN case Martin Kroeker 2024-07-20 17:42:31 +02:00
  • c2ffd90e8c make NAN handling depend on dummy2 parameter Martin Kroeker 2024-07-20 17:31:00 +02:00
  • ea4ab3b310 Better header guard around bridge Chris Sidebottom 2024-07-20 13:39:22 +00:00
  • 7311d93016 Unroll TT further Chris Sidebottom 2024-07-19 16:50:50 +00:00
  • a815594fd1 Merge pull request #4801 from markdryan/markdryan/riscv-dynamic-arch Martin Kroeker 2024-07-19 17:12:07 +02:00
  • dd6c33d34d make NAN handling depend on dummy2 parameter Martin Kroeker 2024-07-19 16:14:55 +02:00
  • 66622de36d Hack: Test gemv vs gemm. Chip Kerchner 2024-07-19 07:26:08 -05:00
  • 6fd59a620a deploy: 5a845ef1f4 martin-frbg 2024-07-19 11:07:15 +00:00
  • 5a845ef1f4 Merge pull request #4809 from penghongbo/reorder_gemm_gemvt Martin Kroeker 2024-07-19 13:06:42 +02:00
  • db98f8753f Try to fix LAPACK testing failures on P7. 1. Remove the FADD insn from the GEMV Transpose code. 2. Remove the FADD insn from GEMM and ZGEMM code. 3. Reorder the computation of the Imaginary part in ZGEMM code. Hong Bo Peng 2024-07-19 02:08:19 -04:00
  • a9edddb695 Unroll TN further Chris Sidebottom 2024-07-18 19:03:34 +00:00
  • 9984c5ce9d Clean up k2 removal more and unroll SGEMM more Chris Sidebottom 2024-07-18 17:34:43 +00:00
  • b1c9fafabb Remove k2 loop from DGEMM TN and use a more conservative heuristic for SGEMM Chris Sidebottom 2024-07-18 17:37:18 +01:00
  • 2020569705 fix NAN handling and make it depend on dummy2 parameter Martin Kroeker 2024-07-17 23:55:54 +02:00
  • 3870995f01 make NAN handling depend on dummy2 parameter Martin Kroeker 2024-07-17 23:54:24 +02:00
  • 7284c533b5 make NAN handling depend on dummy2 parameter Martin Kroeker 2024-07-17 23:50:40 +02:00
  • 73751218a4 make NAN handling depend on dummy2 parameter Martin Kroeker 2024-07-17 23:41:26 +02:00
  • b9bfc8ce09 make NAN handling depend on dummy2 parameter Martin Kroeker 2024-07-17 23:29:50 +02:00
  • eb4879e04c make NAN handling depend on the dummy2 parameter Martin Kroeker 2024-07-17 23:24:19 +02:00
  • e99c574f26 deploy: ee87cb90d0 martin-frbg 2024-07-17 21:14:56 +00:00
  • ee87cb90d0 Merge pull request #4803 from iha-taisei/SVESupportSDGEMV Martin Kroeker 2024-07-17 23:14:21 +02:00
  • 34b80ce03f mips64: Fixed numpy CI failure gxw 2024-07-17 09:52:14 +08:00
  • f6d6c14a96 mips: Fixed numpy CI failure gxw 2024-07-17 09:23:49 +08:00
  • ba47c7f4f3 Vectorize reduction stage of sgemv_t. Chip Kerchner 2024-07-16 15:57:24 -05:00
  • e9f6aa46a4 Merge pull request #4800 from vlad0x00/patch-2 Martin Kroeker 2024-07-16 16:32:04 +02:00
  • 5a168e02da deploy: b1aa2e1768 martin-frbg 2024-07-16 12:23:14 +00:00
  • b1aa2e1768 Merge pull request #4802 from markdryan/markdryan/rvv_axpby_incy0 Martin Kroeker 2024-07-16 14:22:38 +02:00
  • 0985fdc82b A64FX: Add support for SVE to SGEMV/DGEMV kernels. iha fujitsu 2024-07-16 17:31:33 +09:00
  • 56e1782ffb Add another missing parenthesis Vladimir Nikolić 2024-07-15 15:15:23 -07:00
  • 127ea5d0d9 Add missing parenthesis Vladimir Nikolić 2024-07-15 15:12:21 -07:00
  • 48edc2a1dd deploy: a3c10c6c25 martin-frbg 2024-07-15 18:58:34 +00:00
  • a3c10c6c25 Merge pull request #4799 from martin-frbg/issue4762 Martin Kroeker 2024-07-15 20:57:56 +02:00
  • a373d0f107 Improve the error message for thread creation failure Martin Kroeker 2024-07-15 18:32:21 +02:00
  • 67bf4b6998 Fix axpby_rvv kernels for cases where inc_y = 0 Mark Ryan 2024-07-12 11:16:48 +00:00
  • 3b715e6162 Add autodetection for riscv64 Mark Ryan 2024-07-05 10:39:07 +00:00
  • 9b3e80efe2 utest: Add test_gemv gxw 2024-07-15 16:33:09 +08:00
  • 3f39c8f94f LoongArch: Fixed numpy CI failure gxw 2024-07-12 16:56:35 +08:00
  • 9fca45d365 deploy: 6013b36b16 martin-frbg 2024-07-12 09:30:12 +00:00
  • 6013b36b16 Merge pull request #4796 from martin-frbg/ppcbuf Martin Kroeker 2024-07-12 11:06:45 +02:00
  • 9789034281 Merge branch 'OpenMathLib:develop' into ppcbuf Martin Kroeker 2024-07-12 11:05:46 +02:00
  • f3cebb3ca3 x86: Fixed numpy CI failure when the target is ZEN. gxw 2024-07-10 15:11:12 +08:00
  • 5d08ec7ff3 Merge pull request #4782 from martin-frbg/azurewincl Martin Kroeker 2024-07-11 23:55:15 +02:00
  • dfc11ef248 Merge pull request #4791 from ChipKerchner/vectorizeSBGEMMincopy Martin Kroeker 2024-07-11 21:38:57 +02:00
  • 2fefdfa2b8 Merge branch 'OpenMathLib:develop' into azurewincl Martin Kroeker 2024-07-11 21:38:21 +02:00
  • 475bd2452b Suffix BUFFERSIZEs as UL to prevent int overflow in computations Martin Kroeker 2024-07-11 20:13:57 +02:00
  • b70227ad62 Merge pull request #4795 from pkubaj/patch-1 Martin Kroeker 2024-07-11 19:00:07 +02:00
  • 3408b8f7b7 deploy: 8277828fdc martin-frbg 2024-07-11 16:49:46 +00:00
  • 8277828fdc Merge pull request #4785 from rgommers/docs-install Martin Kroeker 2024-07-11 18:49:07 +02:00
  • f0fc7249f1 Merge pull request #4792 from martin-frbg/issue4790 Martin Kroeker 2024-07-11 17:38:43 +02:00
  • 362856fece Merge pull request #4778 from JAicewizard/develop Martin Kroeker 2024-07-11 15:12:46 +02:00
  • 63ec095ebb deploy: 1d77647d1b martin-frbg 2024-07-11 13:02:25 +00:00
  • 1d77647d1b Merge pull request #4769 from drupol/fix-buffersize-value Martin Kroeker 2024-07-11 14:45:50 +02:00
  • 4c12090776 Fix build on FreeBSD/powerpc64* Piotr Kubaj 2024-07-10 22:21:48 +00:00
  • f708944fea Add all 4 variations of the SBGEMM to compare_sgemm_sbgemm Chip Kerchner 2024-07-10 13:07:48 -05:00
  • e706bc1ec0 Fix core assignment for Intel family 15 Martin Kroeker 2024-07-09 20:22:56 +02:00
  • cb154832f8 Vectorize SBGEMM incopy - 4x faster. Chip Kerchner 2024-07-09 13:10:03 -05:00
  • c92104a605 Update cross compile info Vladimir Nikolić 2024-07-05 12:32:58 -07:00
  • a5c04e326a Update scal.c Martin Kroeker 2024-07-04 22:28:01 +02:00
  • 268dcd8f45 docs: convert remaining install sections (Android, iOS, FreeBSD, Cortex-M) Ralf Gommers 2024-07-04 19:15:07 +02:00
  • 452014341e docs: rework building from source on Windows section Ralf Gommers 2024-07-04 11:05:01 +02:00
  • 4547908901 docs: rewrite "Install OpenBLAS" page (part 1: binaries, basic from source) Ralf Gommers 2024-07-04 09:56:42 +02:00
  • bde7d7c576 deploy: e1eef56e05 martin-frbg 2024-07-04 16:10:02 +00:00
  • e1eef56e05 Merge pull request #4783 from martin-frbg/cpuid_meteor Martin Kroeker 2024-07-04 18:09:27 +02:00
  • 536200bc9e fix handling of INF or NAN Martin Kroeker 2024-07-04 17:47:19 +02:00
  • 3063d03021 Add another CPUID for Meteor Lake Martin Kroeker 2024-07-04 16:05:05 +02:00
  • b422742899 collect error output from ctest, if any Martin Kroeker 2024-07-04 15:42:34 +02:00
  • cea4abcac0 Fix compiling on mingw Jaap Aarts 2024-07-04 14:56:16 +02:00
  • 89885ee381 deploy: f729013d2e martin-frbg 2024-07-03 19:00:59 +00:00
  • f729013d2e Merge pull request #4781 from rgommers/fix-docs-deployment Martin Kroeker 2024-07-03 21:00:18 +02:00
  • 6ede8b14c6 ci: fix CI job to deploy docs, and make it run on pull requests too Ralf Gommers 2024-07-03 20:01:21 +02:00
  • 9836883ee9 Merge pull request #4780 from martin-frbg/azureosx12 Martin Kroeker 2024-07-03 19:53:05 +02:00
  • df81b159e8 Merge pull request #4774 from rgommers/improve-docs Martin Kroeker 2024-07-03 17:10:44 +02:00
  • 2df4007425 Update compiler and sdk versions for osx Martin Kroeker 2024-07-03 16:48:43 +02:00
  • ce5769315d deploy: acf0c3ccaf martin-frbg 2024-07-03 13:22:06 +00:00
  • acf0c3ccaf Merge pull request #4777 from ev-br/sgesdd_ci_err Martin Kroeker 2024-07-03 15:21:33 +02:00
  • 74f059a3ce Update OSX jobs to use the macos-12 image Martin Kroeker 2024-07-03 13:24:02 +02:00
  • cd3c167c28 ignore sgesdd failure on codspeed Evgeni Burovski 2024-07-03 10:58:30 +03:00
  • 9d0abe2d26 Add support for RISCV64_GENERIC in cmake Jaap Aarts 2024-07-03 00:16:12 +02:00
  • 5b385fd453 WIP: fish out the gesdd failure? Evgeni Burovski 2024-07-02 19:23:42 +03:00
  • c1c0dbfd60 docs: address review comments on PR 4774 Ralf Gommers 2024-07-02 14:05:47 +02:00
  • bdb6069051 Merge pull request #4775 from martin-frbg/issue4770 Martin Kroeker 2024-07-01 00:35:30 +02:00
  • 4052b312b2 Merge pull request #4763 from ev-br/sync-codspeed Martin Kroeker 2024-07-01 00:18:08 +02:00
  • 9f1db66881 deploy: 3677b3886c martin-frbg 2024-06-30 20:49:15 +00:00
  • 3677b3886c Merge pull request #4702 from bashimao/detect-nv-grace Martin Kroeker 2024-06-30 22:48:48 +02:00
  • d0b9948b23 Guard against invalid thread_status.queue Martin Kroeker 2024-06-30 19:31:15 +02:00
  • ca9a0c28e8 docs: improve extensions page Ralf Gommers 2024-06-30 17:58:36 +02:00