Commit Graph

  • f860e82166 Merge pull request #4522 from martin-frbg/arm64scsum Martin Kroeker 2024-02-25 19:20:11 +01:00
  • 7d506984fa fix assignment of default CSUM kernel Martin Kroeker 2024-02-25 17:57:11 +01:00
  • 12787775d9 add csum/zsum kernels (trivially derived from the asum ones) Martin Kroeker 2024-02-25 17:55:36 +01:00
  • 1c93e6a5e4 Merge pull request #4521 from martin-frbg/fixczsum Martin Kroeker 2024-02-25 10:46:51 +01:00
  • 8f8ef3492a Add CSUM and ZSUM kernels (trivially derived from their existing ASUM counterparts) Martin Kroeker 2024-02-24 23:57:50 +01:00
  • be5e18c6f9 Add kernel definitions for CSUM and ZSUM Martin Kroeker 2024-02-24 23:55:43 +01:00
  • 5403900539 Merge pull request #4520 from frjohnst/new_branch Martin Kroeker 2024-02-23 20:58:27 +01:00
  • bdaa6705ca fix conflict between PR 4515 and AIX shared obj support frjohnst 2024-02-23 10:20:48 -05:00
  • 0d976acdd7 Merge pull request #4485 from martin-frbg/issue4468 Martin Kroeker 2024-02-23 14:54:12 +01:00
  • 2e86faa657 Merge branch 'develop' into issue4468 Martin Kroeker 2024-02-23 11:39:49 +01:00
  • 0ff854921c Merge pull request #4519 from martin-frbg/gh-applem1 Martin Kroeker 2024-02-23 08:03:59 +01:00
  • 00ae343db0 Merge pull request #4518 from martin-frbg/cmakefixes Martin Kroeker 2024-02-22 23:15:05 +01:00
  • 5b953f2f8d Disable most AppleM1 builds (replaced by gh workflows) Martin Kroeker 2024-02-22 22:41:08 +01:00
  • 16b488cabe CI: Add various Apple M1 build configurations to gh workflow Martin Kroeker 2024-02-22 22:38:05 +01:00
  • be20588a3c Avoid linking both libgomp and libomp in mixed clang/gfortran builds Martin Kroeker 2024-02-22 22:17:48 +01:00
  • ca121eb5ed Avoid linking both libgomp and libomp in mixed clang/gfortran builds Martin Kroeker 2024-02-22 22:17:05 +01:00
  • 4adfe4d531 Avoid linking both libgomp and libomp in mixed clang/gfortran builds Martin Kroeker 2024-02-22 22:16:01 +01:00
  • 3516fff378 Avoid linking both libgomp and libomp in mixed clang/gfortran builds Martin Kroeker 2024-02-22 22:15:28 +01:00
  • 8fc2c2db04 Fix missing support for INTERFACE64 on ARM64 and MIPS64 Martin Kroeker 2024-02-22 22:14:13 +01:00
  • 82b81c0bbe Don't fail if there is no Fortran compiler Martin Kroeker 2024-02-22 22:11:50 +01:00
  • 5e8722a963 Merge pull request #4517 from ayappanec/SharedLibforAIX Martin Kroeker 2024-02-22 19:08:52 +01:00
  • e5c93d1f37 Merge pull request #4516 from XiWeiGu/loongarch64-cgemv-zgemv-opt Martin Kroeker 2024-02-22 17:34:27 +01:00
  • 78a9ef35b4 Merge pull request #4515 from frjohnst/second_conflict Martin Kroeker 2024-02-22 16:23:12 +01:00
  • 892f8ff3e5 Shared library support for AIX Ayappan Perumal 2024-02-22 07:05:37 -06:00
  • 9d6eeea867 Merge pull request #4513 from ChipKerchner/fixNumCoresAIX Martin Kroeker 2024-02-22 12:42:15 +01:00
  • 990507e3b8 LoongArch64: Opt zgemv with LASX gxw 2024-02-22 11:41:15 +08:00
  • d51ffec3a2 LoongArch64: Opt cgemv with LASX gxw 2024-02-22 10:46:45 +08:00
  • 9b24b31419 resolve second_ conflict which breaks xlf timef frjohnst 2024-02-21 15:52:29 -05:00
  • bf2310442b Fix get_num_cores for AIX. Chip-Kerchner 2024-02-21 13:26:28 -06:00
  • a69adbbd11 Merge branch 'develop' of https://github.com/openmathlib/openblas into develop Chip-Kerchner 2024-02-21 12:18:18 -06:00
  • 99ef76f9bb Merge pull request #4511 from ErnstPeng/feature-branch Martin Kroeker 2024-02-21 14:25:57 +01:00
  • 4787a55c64 Optimized cgemm kernel 16x4 LASX for LoongArch pengxu 2024-02-20 20:41:45 +08:00
  • ebbf5b3ea0 Merge pull request #4504 from sergei-lewis/dev/slewis/ci Martin Kroeker 2024-02-16 22:48:28 +01:00
  • 461ecabb22 add RISCV64_ZVL128B and RISCV64_ZVL256B targets to CI flows and to README.md Sergei Lewis 2024-02-16 11:33:28 +00:00
  • ba17758c02 fix axpy implementations where y has a stride of 0 Sergei Lewis 2024-02-16 15:58:02 +00:00
  • 5266998b9f Merge pull request #4498 from mseminatore/win_tidy Martin Kroeker 2024-02-15 14:37:37 +01:00
  • ca6b4961e4 updates to fix option conflicts and config file generation Martin Kroeker 2024-02-15 14:31:11 +01:00
  • c90979d8ef allow for more pre- and suffixes in the name of the openblas library Martin Kroeker 2024-02-15 14:17:11 +01:00
  • 3120f12e76 allow for more pre- and suffixes in the name of the openblas library Martin Kroeker 2024-02-15 14:16:20 +01:00
  • a0e3f77e0b add FIXED_LIBNAME, PREFIX and SUFFIX Martin Kroeker 2024-02-15 12:17:38 +01:00
  • ffbfc3c692 Add libname prefix and suffix Martin Kroeker 2024-02-15 12:16:34 +01:00
  • 179527f622 Merge branch 'OpenMathLib:develop' into issue4468 Martin Kroeker 2024-02-15 12:15:39 +01:00
  • a28afac791 Add FIXED_LIBNAME, LIBNAMEPREFIX and LIBNAMESUFFIX Martin Kroeker 2024-02-15 11:48:33 +01:00
  • 57dd894af0 Merge pull request #4502 from dmikushin/add-missing-use_gemm3m-macro Martin Kroeker 2024-02-15 11:13:36 +01:00
  • b29fd48998 Merge branch 'develop' into win_tidy Mark Seminatore 2024-02-12 10:23:17 -08:00
  • 0a7ae326d2 Merge branch 'win_tidy' of https://github.com/mseminatore/OpenBLAS into win_tidy Mark Seminatore 2024-02-12 10:22:26 -08:00
  • 10548a0460 update contributors Mark Seminatore 2024-02-12 10:22:12 -08:00
  • d0f5dc763b Adding USE_GEMM3M macro to kernel targets, so that the *gemm3m functions and parameters can be included into the gotoblas structure. Fixes #4500 Dmitry Mikushin 2024-02-12 02:18:03 +01:00
  • 8698f9e37f Adding basic support of benchmarks into CMake for single, double, single complex and double complex cases. Each benchmarking target has a suffix to identify the data type, for example ./benchmark_gemm3m_COMPLEX_DOUBLE is a gemm3m.c source compiled with COMPLEX and DOUBLE macros defined Dmitry Mikushin 2024-02-10 19:12:16 +01:00
  • 7e9b1c0807 fix uninitialized data usage kseniyazaytseva 2024-02-10 00:49:42 +03:00
  • c6f30fd414 check for zero inc kseniyazaytseva 2024-02-10 00:48:07 +03:00
  • 5e9ead09ac fix info return kseniyazaytseva 2024-02-10 00:47:25 +03:00
  • 4c554bd527 check abs zero inc kseniyazaytseva 2024-02-10 00:46:52 +03:00
  • 46de7c8a2b Merge remote-tracking branch 'origin/risc-v-new-tests' into new-tests kseniyazaytseva 2024-02-09 23:52:51 +03:00
  • 10ea3fb742 fix duplication of name parts Martin Kroeker 2024-02-09 17:09:55 +01:00
  • b1ae777afb Merge pull request #4497 from sergei-lewis/dev/slewis/zaxpy Martin Kroeker 2024-02-09 16:22:00 +01:00
  • bb96e466ae Introduce LIBNAMEPREFIX to avoid messing with the internal LIBPREFIX Martin Kroeker 2024-02-09 15:50:11 +01:00
  • 32ed6e391a Merge branch 'develop' of https://github.com/openmathlib/openblas into develop Chip-Kerchner 2024-02-09 07:25:04 -06:00
  • ff1523163f Fix axpy test hangs when n==0. Reenable zaxpy_vector kernel for C910V. Sergei Lewis 2024-02-09 12:59:14 +00:00
  • ba3bfe85ee Merge pull request #4495 from martin-frbg/update-gensymbol Martin Kroeker 2024-02-09 08:55:22 +01:00
  • 93872f4681 drop the ?laqz? symbols for now (not translatable by f2c) Martin Kroeker 2024-02-08 23:02:09 +01:00
  • 98c56a7314 more cleanup Mark Seminatore 2024-02-08 13:50:15 -08:00
  • 83bec51355 Update with recently added CBLAS interfaces and LAPACK/LAPACKE functions Martin Kroeker 2024-02-08 21:23:48 +01:00
  • 974f29c4e9 Merge pull request #4494 from ChipKerchner/fixPower10CPUID Martin Kroeker 2024-02-08 21:21:32 +01:00
  • d408ecedba Add environment variable to display coretype for dynamic arch. Chip Kerchner 2024-02-08 12:17:18 -06:00
  • a96a04ee61 Merge pull request #4493 from martin-frbg/issue4475-3 Martin Kroeker 2024-02-08 16:50:06 +01:00
  • ac6b4b7aa4 Make sure CPU ID works for all POWER_10 conditions Chip Kerchner 2024-02-08 08:56:30 -06:00
  • 500ac4de5e fix incompatible pointer types Martin Kroeker 2024-02-08 13:18:34 +01:00
  • b3fa16345d fix prototype for c/zaxpby Martin Kroeker 2024-02-08 13:15:34 +01:00
  • cfabc48190 Update rotg tests kseniyazaytseva 2024-02-08 00:22:15 +03:00
  • ec5cfe3bc8 Fix invalid tests kseniyazaytseva 2024-02-08 00:21:38 +03:00
  • ff10e6b6dc Fix zero step tests kseniyazaytseva 2024-02-08 00:19:54 +03:00
  • e9cfb7fd30 Merge pull request #4491 from martin-frbg/fixup-4488 Martin Kroeker 2024-02-07 21:34:40 +01:00
  • cb9aa2a587 Merge branch 'develop' of https://github.com/openmathlib/openblas into develop Chip-Kerchner 2024-02-07 13:09:58 -06:00
  • e9f480111e fix sbgemm bfloat16 conversion errors introduced in PR 4488 Martin Kroeker 2024-02-07 19:57:18 +01:00
  • 22b487b622 Merge pull request #4488 from martin-frbg/issue4475-2 Martin Kroeker 2024-02-07 18:40:35 +01:00
  • 818bf30628 Merge pull request #4490 from ChipKerchner/missingCPUIDsForAIX Martin Kroeker 2024-02-07 17:31:26 +01:00
  • 344763331a Merge pull request #4484 from martin-frbg/lapack981 Martin Kroeker 2024-02-07 15:22:48 +01:00
  • 574912f534 Add missing CPU ID definitions for old versions of AIX. Chip Kerchner 2024-02-07 07:54:06 -06:00
  • 08ce6b1c1c Add missing CPU ID definitions for old versions of AIX. Chip Kerchner 2024-02-07 07:54:06 -06:00
  • fb99fc2e6e fix type conversion warnings Martin Kroeker 2024-02-07 13:42:08 +01:00
  • 08e479f956 Merge pull request #4487 from ErnstPeng/feature-branch Martin Kroeker 2024-02-07 13:19:04 +01:00
  • 25b300bbee improve internal names Martin Kroeker 2024-02-06 23:40:01 +01:00
  • 9ef10ffa49 Handle prefixed and suffixed libnames, optionally suppress softlinking Martin Kroeker 2024-02-06 23:38:19 +01:00
  • 1ed69ea1c0 improve naming Martin Kroeker 2024-02-06 23:35:12 +01:00
  • d4db6a9f16 Separate the interface for SBGEMMT from GEMMT due to differences in GEMV arguments Martin Kroeker 2024-02-06 22:23:47 +01:00
  • fe3da43b7d Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch pengxu 2024-02-06 11:49:01 +08:00
  • 440edfd997 Add option to suppress versioning of the internal name Martin Kroeker 2024-02-05 21:44:50 +01:00
  • 63fbffddf8 Add option FIXED_LIBNAME to suppress versioning and softlinking Martin Kroeker 2024-02-05 21:44:03 +01:00
  • e5d2725e5a Merge pull request #4185 from XiWeiGu/mips_enable_msa Martin Kroeker 2024-02-05 15:50:16 +01:00
  • 479e4af089 Rescale input vector more often to minimize relative error (Reference-LAPACK PR 981) Martin Kroeker 2024-02-05 15:35:24 +01:00
  • a4fde2c5ac Merge pull request #4451 from martin-frbg/overflow_reset Martin Kroeker 2024-02-05 07:27:04 +01:00
  • b537528feb Merge pull request #4480 from XiWeiGu/loongarch64-fixed-{s/d}amin-lsx Martin Kroeker 2024-02-05 06:24:50 +01:00
  • bc7154a80d Merge pull request #4482 from martin-frbg/issue4476 Martin Kroeker 2024-02-04 23:13:10 +01:00
  • 6d8a273cca Handle zero increment(s) in C910V ?AXPBY (#4483) Martin Kroeker 2024-02-04 22:07:51 +01:00
  • dbcf4f8b7d Merge pull request #4479 from XiWeiGu/loongarch-opt-axpby Martin Kroeker 2024-02-04 19:50:28 +01:00
  • dc802dd637 Merge pull request #4474 from ChipKerchner/sgemmIncopy_PR Martin Kroeker 2024-02-04 18:51:09 +01:00
  • e307675222 Merge pull request #4478 from martin-frbg/issue4475 Martin Kroeker 2024-02-04 16:36:40 +01:00
  • 033168cdf0 Merge pull request #4481 from martin-frbg/cpuid_riscv Martin Kroeker 2024-02-04 14:09:44 +01:00
  • a29f91ae9a Merge pull request #4471 from ChipKerchner/fixMakefileAIXOpenMP Martin Kroeker 2024-02-04 12:13:26 +01:00