8260 Commits

Author SHA1 Message Date
Chip Kerchner ac6b4b7aa4 Make sure CPU ID works for all POWER_10 conditions 2024-02-08 08:56:30 -06:00
Martin Kroeker 500ac4de5e fix incompatible pointer types 2024-02-08 13:18:34 +01:00
Martin Kroeker b3fa16345d fix prototype for c/zaxpby 2024-02-08 13:15:34 +01:00
kseniyazaytseva cfabc48190 Update rotg tests 2024-02-08 00:22:15 +03:00
kseniyazaytseva ec5cfe3bc8 Fix invalid tests 2024-02-08 00:21:38 +03:00
kseniyazaytseva ff10e6b6dc Fix zero step tests 2024-02-08 00:19:54 +03:00
Martin Kroeker e9cfb7fd30 Merge pull request #4491 from martin-frbg/fixup-4488: fix sbgemm bfloat16 conversion errors introduced in PR 4488 2024-02-07 21:34:40 +01:00
Chip-Kerchner cb9aa2a587 Merge branch 'develop' of https://github.com/openmathlib/openblas into develop 2024-02-07 13:09:58 -06:00
Martin Kroeker e9f480111e fix sbgemm bfloat16 conversion errors introduced in PR 4488 2024-02-07 19:57:18 +01:00
Martin Kroeker 22b487b622 Merge pull request #4488 from martin-frbg/issue4475-2: Separate the interface for SBGEMMT from GEMMT 2024-02-07 18:40:35 +01:00
Martin Kroeker 818bf30628 Merge pull request #4490 from ChipKerchner/missingCPUIDsForAIX: Add missing CPU ID definitions for old versions of AIX. 2024-02-07 17:31:26 +01:00
Martin Kroeker 344763331a Merge pull request #4484 from martin-frbg/lapack981: Rescale input vector more often in C/ZLARFGP (Reference-LAPACK PR 981) 2024-02-07 15:22:48 +01:00
Chip Kerchner 574912f534 Add missing CPU ID definitions for old versions of AIX. 2024-02-07 08:21:34 -06:00
Chip Kerchner 08ce6b1c1c Add missing CPU ID definitions for old versions of AIX. 2024-02-07 07:54:06 -06:00
Martin Kroeker fb99fc2e6e fix type conversion warnings 2024-02-07 13:42:08 +01:00
Martin Kroeker 08e479f956 Merge pull request #4487 from ErnstPeng/feature-branch: Optimized zgemm kernel 8x4 LASX, 4x4 LSX and cgemm kernel 8x4 LSX for LoongArch 2024-02-07 13:19:04 +01:00
Martin Kroeker 25b300bbee improve internal names 2024-02-06 23:40:01 +01:00
Martin Kroeker 9ef10ffa49 Handle prefixed and suffixed libnames, optionally suppress softlinking 2024-02-06 23:38:19 +01:00
Martin Kroeker 1ed69ea1c0 improve naming 2024-02-06 23:35:12 +01:00
Martin Kroeker d4db6a9f16 Separate the interface for SBGEMMT from GEMMT due to differences in GEMV arguments 2024-02-06 22:23:47 +01:00
pengxu fe3da43b7d Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 2024-02-06 11:49:01 +08:00
Martin Kroeker 440edfd997 Add option to suppress versioning of the internal name 2024-02-05 21:44:50 +01:00
Martin Kroeker 63fbffddf8 Add option FIXED_LIBNAME to suppress versioning and softlinking 2024-02-05 21:44:03 +01:00
Martin Kroeker e5d2725e5a Merge pull request #4185 from XiWeiGu/mips_enable_msa: MIPS: Enable MSA 2024-02-05 15:50:16 +01:00
Martin Kroeker 479e4af089 Rescale input vector more often to minimize relative error (Reference-LAPACK PR 981) 2024-02-05 15:35:24 +01:00
Martin Kroeker a4fde2c5ac Merge pull request #4451 from martin-frbg/overflow_reset: Reset "buffer management structure overflowed" state and free auxiliary struct on blas_shutdown 2024-02-05 07:27:04 +01:00
Martin Kroeker b537528feb Merge pull request #4480 from XiWeiGu/loongarch64-fixed-{s/d}amin-lsx: LoongArch64: Fixed {s/d}amin LSX optimization 2024-02-05 06:24:50 +01:00
Martin Kroeker bc7154a80d Merge pull request #4482 from martin-frbg/issue4476: Fix missing NO_AVX2 fallback for SapphireRapids in DYNAMIC_ARCH 2024-02-04 23:13:10 +01:00
Martin Kroeker 6d8a273cca Handle zero increment(s) in C910V ?AXPBY (#4483): Handle zero increment(s) 2024-02-04 22:07:51 +01:00
Martin Kroeker dbcf4f8b7d Merge pull request #4479 from XiWeiGu/loongarch-opt-axpby: Loongarch opt axpby 2024-02-04 19:50:28 +01:00
Martin Kroeker dc802dd637 Merge pull request #4474 from ChipKerchner/sgemmIncopy_PR: Vectorize in-copy packing/copying for SGEMM - up to 4X faster. 2024-02-04 18:51:09 +01:00
Martin Kroeker e307675222 Merge pull request #4478 from martin-frbg/issue4475: Fix incompatible pointer type in BFLOAT16 GEMMT 2024-02-04 16:36:40 +01:00
Martin Kroeker 033168cdf0 Merge pull request #4481 from martin-frbg/cpuid_riscv: Update lowercase cpunames for RISC-V 2024-02-04 14:09:44 +01:00
Martin Kroeker a29f91ae9a Merge pull request #4471 from ChipKerchner/fixMakefileAIXOpenMP: Fix Makefiles to support OpenMP on AIX for xlc (clang) with xlf. 2024-02-04 12:13:26 +01:00
Martin Kroeker e61d96303d Fix missing NO_AVX2 fallback for SapphireRapids 2024-02-04 10:05:20 +01:00
Martin Kroeker d02c61e82e Update lowercase cpunames for RISC-V 2024-02-04 10:01:27 +01:00
Martin Kroeker 7228c708d7 Merge pull request #4461 from markdryan/cpuid_riscv64_crash: Fix two issues with cpuid_riscv64.c 2024-02-04 09:57:00 +01:00
gxw adde725321 LoongArch64: Fixed {s/d}amin LSX optimization 2024-02-04 14:44:47 +08:00
gxw 7bc93d95a1 LoongArch64: Opt {c/z}axpby 2024-02-04 11:23:31 +08:00
gxw 1e1f487dc7 LoongArch64: Fixed {s/d}axpby 2024-02-04 09:41:37 +08:00
gxw 3597827c93 utest: add axpby 2024-02-04 09:41:30 +08:00
Martin Kroeker 68d354814f Fix incompatible pointer type in BFLOAT16 mode 2024-02-04 01:14:22 +01:00
Martin Kroeker 3848d4e9f4 Merge pull request #4477 from martin-frbg/c910caxpy: Temporarily disable the CAXPY/ZAXPY kernels for C910V to workaround a CI hang 2024-02-04 01:10:57 +01:00
Martin Kroeker 4d8dee508c temporarily disable the CAXPY/ZAXPY kernels 2024-02-04 01:05:03 +01:00
Martin Kroeker 27816fa929 Merge pull request #4472 from sergei-lewis/dev/slewis/merge-from-riscv: Merge risc-v branch to develop 2024-02-03 20:56:11 +01:00
kseniyazaytseva b6949ce74c add axpyc to cmake build 2024-02-02 14:42:27 +03:00
kseniyazaytseva 441339104f fix test ext cmake build 2024-02-02 13:49:39 +03:00
kseniyazaytseva f68e9989c4 Remove zero rows/columns matcopy tests 2024-02-02 12:26:23 +03:00
austinpagan 87ba528d8b Changed C files to straighten out indentation. Removed commented lines from other file. 2024-02-01 18:46:07 -06:00
austinpagan 461cf9083c Merge remote-tracking branch 'origin/develop' into cgemm_zgemm_c_code 2024-02-01 12:40:04 -06:00