Commit Graph

7933 Commits

Author SHA1 Message Date
Martin Kroeker a28afac791
Add FIXED_LIBNAME, LIBNAMEPREFIX and LIBNAMESUFFIX 2024-02-15 11:48:33 +01:00
Martin Kroeker 10ea3fb742
fix duplication of name parts 2024-02-09 17:09:55 +01:00
Martin Kroeker bb96e466ae
Introduce LIBNAMEPREFIX to avoid messing with the internal LIBPREFIX 2024-02-09 15:50:11 +01:00
Martin Kroeker 25b300bbee
improve internal names 2024-02-06 23:40:01 +01:00
Martin Kroeker 9ef10ffa49
Handle prefixed and suffixed libnames, optionally suppress softlinking 2024-02-06 23:38:19 +01:00
Martin Kroeker 1ed69ea1c0
improve naming 2024-02-06 23:35:12 +01:00
Martin Kroeker 440edfd997
Add option to suppress versioning of the internal name 2024-02-05 21:44:50 +01:00
Martin Kroeker 63fbffddf8
Add option FIXED_LIBNAME to suppress versioning and softlinking 2024-02-05 21:44:03 +01:00
Martin Kroeker e5d2725e5a
Merge pull request #4185 from XiWeiGu/mips_enable_msa
MIPS: Enable MSA
2024-02-05 15:50:16 +01:00
Martin Kroeker a4fde2c5ac
Merge pull request #4451 from martin-frbg/overflow_reset
Reset "buffer management structure overflowed" state and free auxiliary struct on blas_shutdown
2024-02-05 07:27:04 +01:00
Martin Kroeker b537528feb
Merge pull request #4480 from XiWeiGu/loongarch64-fixed-{s/d}amin-lsx
LoongArch64: Fixed {s/d}amin LSX optimization
2024-02-05 06:24:50 +01:00
Martin Kroeker bc7154a80d
Merge pull request #4482 from martin-frbg/issue4476
Fix missing NO_AVX2 fallback for SapphireRapids in DYNAMIC_ARCH
2024-02-04 23:13:10 +01:00
Martin Kroeker 6d8a273cca
Handle zero increment(s) in C910V ?AXPBY (#4483)
* Handle zero increment(s)
2024-02-04 22:07:51 +01:00
Martin Kroeker dbcf4f8b7d
Merge pull request #4479 from XiWeiGu/loongarch-opt-axpby
Loongarch opt axpby
2024-02-04 19:50:28 +01:00
Martin Kroeker dc802dd637
Merge pull request #4474 from ChipKerchner/sgemmIncopy_PR
Vectorize in-copy packing/copying for SGEMM - up to 4X faster.
2024-02-04 18:51:09 +01:00
Martin Kroeker e307675222
Merge pull request #4478 from martin-frbg/issue4475
Fix incompatible pointer type in BFLOAT16 GEMMT
2024-02-04 16:36:40 +01:00
Martin Kroeker 033168cdf0
Merge pull request #4481 from martin-frbg/cpuid_riscv
Update lowercase cpunames for RISC-V
2024-02-04 14:09:44 +01:00
Martin Kroeker a29f91ae9a
Merge pull request #4471 from ChipKerchner/fixMakefileAIXOpenMP
Fix Makefiles to support OpenMP on AIX for xlc (clang) with xlf.
2024-02-04 12:13:26 +01:00
Martin Kroeker e61d96303d
Fix missing NO_AVX2 fallback for SapphireRapids 2024-02-04 10:05:20 +01:00
Martin Kroeker d02c61e82e
Update lowercase cpunames for RISC-V 2024-02-04 10:01:27 +01:00
Martin Kroeker 7228c708d7
Merge pull request #4461 from markdryan/cpuid_riscv64_crash
Fix two issues with cpuid_riscv64.c
2024-02-04 09:57:00 +01:00
gxw adde725321 LoongArch64: Fixed {s/d}amin LSX optimization 2024-02-04 14:44:47 +08:00
gxw 7bc93d95a1 LoongArch64: Opt {c/z}axpby 2024-02-04 11:23:31 +08:00
gxw 1e1f487dc7 LoongArch64: Fixed {s/d}axpby 2024-02-04 09:41:37 +08:00
gxw 3597827c93 utest: add axpby 2024-02-04 09:41:30 +08:00
Martin Kroeker 68d354814f
Fix incompatible pointer type in BFLOAT16 mode 2024-02-04 01:14:22 +01:00
Martin Kroeker 3848d4e9f4
Merge pull request #4477 from martin-frbg/c910caxpy
Temporarily disable the CAXPY/ZAXPY kernels for C910V to workaround a CI hang
2024-02-04 01:10:57 +01:00
Martin Kroeker 4d8dee508c
temporarily disable the CAXPY/ZAXPY kernels 2024-02-04 01:05:03 +01:00
Martin Kroeker 27816fa929
Merge pull request #4472 from sergei-lewis/dev/slewis/merge-from-riscv
Merge risc-v branch to develop
2024-02-03 20:56:11 +01:00
Chip Kerchner 2bb7ea64a1 Only vectorize 64-bit version for Power8. 2024-02-01 08:11:43 -06:00
Sergei Lewis 3ffd6868d7 Merge branch 'develop' into dev/slewis/merge-from-riscv 2024-02-01 11:29:41 +00:00
Sergei Lewis a3b0ef6596 Restore riscv64 fixes from develop branch: dot product double precision accumulation, zscal NaN handling 2024-02-01 10:32:00 +00:00
Martin Kroeker ec74dcd213
Merge pull request #4470 from martin-frbg/issue4455
Add CBLAS interfaces for BLAS extensions ?AMIN/?AMAX and C/ZAXPYC
2024-01-31 23:51:01 +01:00
Chip Kerchner 61c8e19f95 Fix Makefile to support OpenMP on AIX for xlc (clang) with xlf. 2024-01-31 15:27:50 -06:00
Martin Kroeker 47bd064763
Fix names in build rules 2024-01-31 20:49:43 +01:00
Martin Kroeker a7d004e820
Fix CBLAS prototype 2024-01-31 17:55:42 +01:00
Martin Kroeker b54cda8490
Unify creation of CBLAS interfaces for ?AMIN/?AMAX and C/ZAXPYC between gmake and cmake builds 2024-01-31 16:00:52 +01:00
Martin Kroeker 1a6fdb0353
Add prototypes for extensions ?AMIN/?AMAX and CAXPYC/ZAXPYC 2024-01-31 15:57:57 +01:00
Martin Kroeker d1343302bd
Merge pull request #4465 from XiWeiGu/utest-zscal
utest: Add tests for zscal
2024-01-31 14:19:19 +01:00
gxw 969601a1dc X86_64: Fixed bug in zscal
Fixed handling of NAN and INF arguments when
inc is greater than 1.
2024-01-31 11:23:59 +08:00
Martin Kroeker 98c9ff3194
Merge pull request #4464 from XiWeiGu/loongarch64-zscal
LoongArch64: Handle NAN and INF
2024-01-30 22:53:29 +01:00
Martin Kroeker 9f0630187a
Merge pull request #4463 from XiWeiGu/loongarch64-zamax-zamin
Loongarch64: amax and amin
2024-01-30 18:01:30 +01:00
Chip Kerchner 09bb48d1b9 Vectorize in-copy packing/copying for SGEMM - 4X faster. 2024-01-30 09:13:16 -06:00
gxw bb043a021f utest: Add tests for zscal 2024-01-30 17:42:37 +08:00
gxw 83ce97a4ca LoongArch64: Handle NAN and INF 2024-01-30 17:17:30 +08:00
gxw 3d4dfd0085 Benchmark: Rename the executable file names for {sc/dz}a{min/max}
No interface named {c/z}a{min/max}, keeping it would
cause ambiguity
2024-01-30 11:33:01 +08:00
gxw a79d117405 LoongArch64: Fixed bug for {s/d}amin 2024-01-30 11:32:57 +08:00
gxw 519ea6e87a utest: Add utest for the {sc/dz}amax and {s/d/sc/dz}amin 2024-01-30 11:32:36 +08:00
Sergei Lewis 1093def0d1 Merge branch 'risc-v' into develop 2024-01-29 11:11:39 +00:00
Martin Kroeker 8892121130
Merge pull request #4462 from martin-frbg/issue4449
Use +sve in arch declarations of the fallback paths for SVE targets
2024-01-26 22:41:16 +01:00