Commit Graph

  • 995a990e24 Make AVX512 BFLOAT16 kernels conditional on compiler capability Martin Kroeker 2024-01-12 00:12:46 +01:00
  • 1dada6d65d Add compiler test and flag for AVX512BF16 capability Martin Kroeker 2024-01-12 00:10:56 +01:00
  • 7df363e1e2 temporarily disable the MSA C/ZSCAL kernels Martin Kroeker 2024-01-12 00:08:52 +01:00
  • 3599f2de8b Merge pull request #4421 from ChipKerchner/power10Copies_DGEMM Martin Kroeker 2024-01-10 07:49:00 +01:00
  • 5c5c1a1220 Merge remote-tracking branch 'origin/develop' into power10Copies_DGEMM Chip-Kerchner 2024-01-09 07:18:41 -06:00
  • 058dd2a4cb Replace two vector loads with one vector pair load and fix endianness of stores - DGEMM versions. Chip-Kerchner 2024-01-08 14:16:09 -06:00
  • 1c31f56e5a Handle NAN Martin Kroeker 2024-01-08 16:11:25 +01:00
  • 7ee1ee38e2 Handle NaN in input Martin Kroeker 2024-01-08 14:20:07 +01:00
  • f637e12713 Handle INF and NAN Martin Kroeker 2024-01-08 09:52:38 +01:00
  • 25b0c48082 Update zscal.c Martin Kroeker 2024-01-08 09:49:18 +01:00
  • 5e7f714e93 Update zscal.c Martin Kroeker 2024-01-08 08:17:40 +01:00
  • cf8b03ae8b Use NAN rather than SNAN for portability Martin Kroeker 2024-01-07 23:09:57 +01:00
  • 7a6a24647d Merge pull request #4420 from martin-frbg/revertstuff Martin Kroeker 2024-01-07 23:04:22 +01:00
  • f0808d856b Handle NAN in input Martin Kroeker 2024-01-07 20:27:29 +01:00
  • acf17a825d Handle NAN in input Martin Kroeker 2024-01-07 20:26:16 +01:00
  • f052bd4705 revert accidental direct commit to develop Martin Kroeker 2024-01-07 20:21:48 +01:00
  • 91bbde7f64 revert accidental direct commit to develop Martin Kroeker 2024-01-07 20:21:05 +01:00
  • 2173356d5b Update zscal_vector.c Martin Kroeker 2024-01-07 19:45:28 +01:00
  • b08a208365 Update zscal_vector.c Martin Kroeker 2024-01-07 19:14:41 +01:00
  • 0c33b57f5f Handle NAN in input Martin Kroeker 2024-01-07 18:40:19 +01:00
  • 903589f84b Update zscal.c Martin Kroeker 2024-01-07 18:37:00 +01:00
  • 711433fcf0 Update zscal.c Martin Kroeker 2024-01-07 18:01:58 +01:00
  • d3d99c34f2 Fix handling of NAN and INF Martin Kroeker 2024-01-07 17:56:51 +01:00
  • c9df62e883 Fix handling of NAN Martin Kroeker 2024-01-07 17:49:40 +01:00
  • def4996170 Fix handling of NAN and INF arguments Martin Kroeker 2024-01-07 15:29:42 +01:00
  • e48627c999 Add tests for ZSCAL with NaN and Inf arguments Martin Kroeker 2024-01-06 23:55:52 +01:00
  • 1412d2deeb Update version to 0.3.26.dev Martin Kroeker 2024-01-02 22:33:01 +01:00
  • 4f5da84e2f Update version to 0.3.26.dev Martin Kroeker 2024-01-02 22:32:27 +01:00
  • 1ad742844b Merge pull request #4409 from OpenMathLib/release-0.3.0 Martin Kroeker 2024-01-02 22:31:38 +01:00
  • 6c77e5e314 Update Makefile.rule v0.3.26 Martin Kroeker 2024-01-02 22:25:05 +01:00
  • fde8bb9903 Update version to 0.3.26 Martin Kroeker 2024-01-02 22:24:33 +01:00
  • 8fe7f80271 Merge pull request #4408 from OpenMathLib/develop Martin Kroeker 2024-01-02 22:23:31 +01:00
  • cddd35fae1 Merge pull request #4407 from martin-frbg/changelog0326 Martin Kroeker 2024-01-02 22:21:16 +01:00
  • 03713bc464 Update Changelog for 0.3.26 Martin Kroeker 2024-01-02 22:08:49 +01:00
  • cdff44e4d3 Merge pull request #4406 from martin-frbg/issue3291 Martin Kroeker 2024-01-02 22:02:56 +01:00
  • 8278d0d093 Merge pull request #4353 from erikbs/feature/fix-xerbla-linking-on-older-mac-versions Martin Kroeker 2024-01-02 19:55:05 +01:00
  • 504f9b0c5e Increase S/D GEMM PQ to match typical L2 size as for NeoverseV1 Martin Kroeker 2024-01-02 18:46:21 +01:00
  • 534de14a02 Merge pull request #4402 from martin-frbg/lapack967 Martin Kroeker 2023-12-31 16:31:28 +01:00
  • 4a15d72420 AzureCI: Update alpine-chroot-install (#4403) Martin Kroeker 2023-12-31 16:30:57 +01:00
  • 0c43c6fa99 Merge pull request #4341 from catap/openblas.pc.in Martin Kroeker 2023-12-31 13:25:06 +01:00
  • 00d7476b4b Fix uninitialized read/wrong variable (Reference-LAPACK PR 967) Martin Kroeker 2023-12-31 12:39:21 +01:00
  • 1b668479de Fix uninitialized read/wrong variable (Reference-LAPACK PR 967) Martin Kroeker 2023-12-31 12:37:52 +01:00
  • bd787c8a1a Fix uninitialized read/wrong variable (Reference-LAPACK PR 967) Martin Kroeker 2023-12-31 12:36:47 +01:00
  • d3451af03f Fix uninitialized read/wrong variable (Reference-LAPACK PR 967) Martin Kroeker 2023-12-31 12:35:37 +01:00
  • 5a20bc5e02 Merge pull request #4401 from martin-frbg/fix4398 Martin Kroeker 2023-12-31 10:15:59 +01:00
  • 2802478449 revert change to Loongson2k1000 zgemm Martin Kroeker 2023-12-30 23:35:51 +01:00
  • 910ab7f698 Merge branch 'OpenMathLib:develop' into fix4398 Martin Kroeker 2023-12-30 22:51:31 +01:00
  • 44b5b9e39f Update C/ZGEMM MN for Loongson2k1000 Martin Kroeker 2023-12-30 22:50:40 +01:00
  • 9d89bcfbf0 Merge pull request #4399 from martin-frbg/fixloongsonci Martin Kroeker 2023-12-30 20:50:55 +01:00
  • 0f648ebcd1 use alternate download for the CLFS cross-compiler package Martin Kroeker 2023-12-30 20:31:32 +01:00
  • 519b40fad9 Merge pull request #4398 from yinshiyou/la-dev Martin Kroeker 2023-12-30 19:51:08 +01:00
  • a5d0d21378 loongarch64: Add zgemm and cgemm optimization pengxu 2023-12-29 15:10:01 +08:00
  • 546f13558c loongarch64: Add {c/z}swap and {c/z}sum optimization gxw 2023-12-29 11:03:53 +08:00
  • edabb93668 loongarch64: Refine axpby optimization functions. Hao Chen 2023-12-29 15:08:10 +08:00
  • 1ec5dded43 loongarch64: Add c/zrot optimization functions. Hao Chen 2023-12-28 21:23:59 +08:00
  • 3c53ded315 loongarch64: Add c/znrm2 optimization functions. Hao Chen 2023-12-28 20:26:01 +08:00
  • fbd612f8c4 loongarch64: Add ic/zamin optimization functions. Hao Chen 2023-12-28 20:07:58 +08:00
  • d97272cb35 loongarch64: Add c/zdot optimization functions. Hao Chen 2023-12-28 19:09:18 +08:00
  • 65a0aeb128 loongarch64: Add c/zcopy optimization functions. Hao Chen 2023-12-28 17:45:17 +08:00
  • 2a34fb4b80 loongarch64: Add and refine scal optimization functions. Hao Chen 2023-12-27 18:17:51 +08:00
  • 8785e948b5 loongarch64: Add camin optimization function. Hao Chen 2023-12-27 17:04:46 +08:00
  • 0753848e03 loongarch64: Refine and add axpy optimization functions. Hao Chen 2023-12-27 16:54:01 +08:00
  • 06fd5b5995 loongarch64: Add and Refine asum optimization functions. Hao Chen 2023-12-27 10:44:02 +08:00
  • e771be185e Optimize copy functions with lsx. guxiwei 2023-12-21 14:28:06 +08:00
  • 179ed51d3b Add dgemm_kernel_8x4.S file. Hao Chen 2023-12-21 14:18:39 +08:00
  • 173a65d4e6 loongarch64: Add and refine iamax optimization functions. Hao Chen 2023-12-25 15:11:04 +08:00
  • ea70e165c7 loongarch64: Refine rot optimization. zhoupeng 2023-12-28 20:07:59 +08:00
  • 116aee7527 loongarch64: Refine imin optimization. zhoupeng 2023-12-28 15:17:28 +08:00
  • 8be2654193 loongarch64: Refine imax optimization. zhoupeng 2023-12-28 10:24:24 +08:00
  • 154baad454 loongarch64: Refine iamin optimization. zhoupeng 2023-12-27 16:04:33 +08:00
  • 36c12c4971 loongarch64: Refine copy,swap,nrm2,sum optimization. Shiyou Yin 2023-12-27 11:30:17 +08:00
  • c6996a80e9 loongarch64: Refine amax,amin,max,min optimization. Shiyou Yin 2023-12-08 16:06:17 +08:00
  • 21564bde2c Merge pull request #4394 from martin-frbg/dyn_vortex Martin Kroeker 2023-12-28 13:35:55 +01:00
  • e9c32ed165 Merge pull request #4384 from yetist/develop Martin Kroeker 2023-12-27 14:05:01 +01:00
  • e7a895e714 Add Apple M as NeoverseN1 Martin Kroeker 2023-12-25 12:36:05 +01:00
  • 474ce0ace9 Merge pull request #4393 from martin-frbg/pr4389-2 Martin Kroeker 2023-12-25 12:30:56 +01:00
  • 1106460bb3 remove redundant targets from the default ARM64 DYNAMIC_ARCH list Martin Kroeker 2023-12-25 12:29:56 +01:00
  • 236acee706 Merge pull request #4389 from Mousius/reduce-dynamic-targets Martin Kroeker 2023-12-25 12:27:42 +01:00
  • d2f4f1b28a CI: update toolchains for LoongArch64 Xiaotian Wu 2023-12-20 14:13:04 +08:00
  • 0baf462dbc Fix: build failed on LoongArch Wu Xiaotian 2023-12-20 10:34:47 +08:00
  • 63a83939a1 Merge pull request #4390 from Mousius/reduce-kernel-duplication Martin Kroeker 2023-12-24 18:04:26 +01:00
  • dba404055d Merge pull request #4392 from martin-frbg/lapack959 Martin Kroeker 2023-12-24 10:44:15 +01:00
  • c6fa921027 Add tests for ?GEDMD (Reference-LAPACK PR 959) Martin Kroeker 2023-12-23 23:39:53 +01:00
  • 283713e4c5 Add tests for ?GEDMD (Reference-LAPACK PR 959) Martin Kroeker 2023-12-23 23:32:45 +01:00
  • 201f22f49a Fix issues related to ?GEDMD (Reference-LAPACK PR 959) Martin Kroeker 2023-12-23 23:27:38 +01:00
  • 05dde8ef04 Merge pull request #4391 from martin-frbg/lapack942 Martin Kroeker 2023-12-23 23:11:46 +01:00
  • 45ef0d7361 Handle corner cases of LWORK (Reference-LAPACK PR 942) Martin Kroeker 2023-12-23 20:16:33 +01:00
  • c082669ad4 Handle corner cases of LWORK (Reference-LAPACK PR 942) Martin Kroeker 2023-12-23 20:05:03 +01:00
  • 29d6024ec5 Handle corner cases of LWORK (Reference-LAPACK PR 942) Martin Kroeker 2023-12-23 19:44:11 +01:00
  • 0814491d96 Handle corner cases of LWORK (Reference-LAPACK PR 942) Martin Kroeker 2023-12-23 19:37:03 +01:00
  • 5c11b2ff41 Handle corner cases of LWORK (Reference-LAPACK PR 942) Martin Kroeker 2023-12-23 19:27:20 +01:00
  • 8ce44c18a0 Handle corner cases of LWORK (Reference-LAPACK PR 942) Martin Kroeker 2023-12-23 19:24:10 +01:00
  • dc20a78188 Use functionally equivalent dynamic targets Chris Sidebottom 2023-12-23 12:19:33 +00:00
  • ecae1389df Reduce duplication in kernel definitions Chris Sidebottom 2023-12-23 12:21:48 +00:00
  • 68ef2328eb Merge pull request #4388 from martin-frbg/issue4387 Martin Kroeker 2023-12-21 22:21:44 +01:00
  • a7ed60bfe9 Add lower limit for multithreading Martin Kroeker 2023-12-21 20:05:23 +01:00
  • 67779177b9 Merge pull request #4383 from martin-frbg/fixlapatest Martin Kroeker 2023-12-20 14:01:59 +01:00
  • e67a0eaaf9 Restore OpenBLAS-specific build rule changes Martin Kroeker 2023-12-19 23:15:11 +01:00
  • bb8b91e9f2 restore OpenBLAS-specific test paths Martin Kroeker 2023-12-19 23:13:02 +01:00
  • fa220b2969 Merge pull request #4382 from Mousius/sve-dot-again Martin Kroeker 2023-12-19 18:46:18 +01:00