Commit Graph

8428 Commits

Author SHA1 Message Date
Dmitry Mikushin 8698f9e37f Adding basic support of benchmarks into CMake for single, double, single complex and double complex cases. Each benchmarking target has a suffix to identify the data type, for example ./benchmark_gemm3m_COMPLEX_DOUBLE is a gemm3m.c source compiled with COMPLEX and DOUBLE macros defined 2024-02-10 19:12:16 +01:00
kseniyazaytseva 7e9b1c0807 fix uninitialized data usage 2024-02-10 00:49:42 +03:00
kseniyazaytseva c6f30fd414 check for zero inc 2024-02-10 00:48:07 +03:00
kseniyazaytseva 5e9ead09ac fix info return 2024-02-10 00:47:25 +03:00
kseniyazaytseva 4c554bd527 check abs zero inc 2024-02-10 00:46:52 +03:00
kseniyazaytseva 46de7c8a2b Merge remote-tracking branch 'origin/risc-v-new-tests' into new-tests 2024-02-09 23:52:51 +03:00
Martin Kroeker 10ea3fb742 fix duplication of name parts 2024-02-09 17:09:55 +01:00
Martin Kroeker b1ae777afb Merge pull request #4497 from sergei-lewis/dev/slewis/zaxpy: Fix axpy test hangs when n==0. Reenable zaxpy_vector kernel for C910V. 2024-02-09 16:22:00 +01:00
Martin Kroeker bb96e466ae Introduce LIBNAMEPREFIX to avoid messing with the internal LIBPREFIX 2024-02-09 15:50:11 +01:00
Chip-Kerchner 32ed6e391a Merge branch 'develop' of https://github.com/openmathlib/openblas into develop 2024-02-09 07:25:04 -06:00
Sergei Lewis ff1523163f Fix axpy test hangs when n==0. Reenable zaxpy_vector kernel for C910V. 2024-02-09 12:59:14 +00:00
Martin Kroeker ba3bfe85ee Merge pull request #4495 from martin-frbg/update-gensymbol: Update gensymbol with recently added CBLAS interfaces and LAPACK/LAPACKE functions 2024-02-09 08:55:22 +01:00
Martin Kroeker 93872f4681 drop the ?laqz? symbols for now (not translatable by f2c) 2024-02-08 23:02:09 +01:00
Mark Seminatore 98c56a7314 more cleanup 2024-02-08 13:50:15 -08:00
Martin Kroeker 83bec51355 Update with recently added CBLAS interfaces and LAPACK/LAPACKE functions 2024-02-08 21:23:48 +01:00
Martin Kroeker 974f29c4e9 Merge pull request #4494 from ChipKerchner/fixPower10CPUID: Make sure CPU ID works for all POWER_10 conditions 2024-02-08 21:21:32 +01:00
Chip Kerchner d408ecedba Add environment variable to display coretype for dynamic arch. 2024-02-08 12:17:18 -06:00
Martin Kroeker a96a04ee61 Merge pull request #4493 from martin-frbg/issue4475-3: Fix incompatible pointer types in the declarations of C/ZAXPBY 2024-02-08 16:50:06 +01:00
Chip Kerchner ac6b4b7aa4 Make sure CPU ID works for all POWER_10 conditions 2024-02-08 08:56:30 -06:00
Martin Kroeker 500ac4de5e fix incompatible pointer types 2024-02-08 13:18:34 +01:00
Martin Kroeker b3fa16345d fix prototype for c/zaxpby 2024-02-08 13:15:34 +01:00
kseniyazaytseva cfabc48190 Update rotg tests 2024-02-08 00:22:15 +03:00
kseniyazaytseva ec5cfe3bc8 Fix invalid tests 2024-02-08 00:21:38 +03:00
kseniyazaytseva ff10e6b6dc Fix zero step tests 2024-02-08 00:19:54 +03:00
Martin Kroeker e9cfb7fd30 Merge pull request #4491 from martin-frbg/fixup-4488: fix sbgemm bfloat16 conversion errors introduced in PR 4488 2024-02-07 21:34:40 +01:00
Chip-Kerchner cb9aa2a587 Merge branch 'develop' of https://github.com/openmathlib/openblas into develop 2024-02-07 13:09:58 -06:00
Martin Kroeker e9f480111e fix sbgemm bfloat16 conversion errors introduced in PR 4488 2024-02-07 19:57:18 +01:00
Martin Kroeker 22b487b622 Merge pull request #4488 from martin-frbg/issue4475-2: Separate the interface for SBGEMMT from GEMMT 2024-02-07 18:40:35 +01:00
Martin Kroeker 818bf30628 Merge pull request #4490 from ChipKerchner/missingCPUIDsForAIX: Add missing CPU ID definitions for old versions of AIX. 2024-02-07 17:31:26 +01:00
Martin Kroeker 344763331a Merge pull request #4484 from martin-frbg/lapack981: Rescale input vector more often in C/ZLARFGP (Reference-LAPACK PR 981) 2024-02-07 15:22:48 +01:00
Chip Kerchner 574912f534 Add missing CPU ID definitions for old versions of AIX. 2024-02-07 08:21:34 -06:00
Chip Kerchner 08ce6b1c1c Add missing CPU ID definitions for old versions of AIX. 2024-02-07 07:54:06 -06:00
Martin Kroeker fb99fc2e6e fix type conversion warnings 2024-02-07 13:42:08 +01:00
Martin Kroeker 08e479f956 Merge pull request #4487 from ErnstPeng/feature-branch: Optimized zgemm kernel 8x4 LASX, 4x4 LSX and cgemm kernel 8x4 LSX for LoongArch 2024-02-07 13:19:04 +01:00
Martin Kroeker 25b300bbee improve internal names 2024-02-06 23:40:01 +01:00
Martin Kroeker 9ef10ffa49 Handle prefixed and suffixed libnames, optionally suppress softlinking 2024-02-06 23:38:19 +01:00
Martin Kroeker 1ed69ea1c0 improve naming 2024-02-06 23:35:12 +01:00
Martin Kroeker d4db6a9f16 Separate the interface for SBGEMMT from GEMMT due to differences in GEMV arguments 2024-02-06 22:23:47 +01:00
pengxu fe3da43b7d Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 2024-02-06 11:49:01 +08:00
Martin Kroeker 440edfd997 Add option to suppress versioning of the internal name 2024-02-05 21:44:50 +01:00
Martin Kroeker 63fbffddf8 Add option FIXED_LIBNAME to suppress versioning and softlinking 2024-02-05 21:44:03 +01:00
Martin Kroeker e5d2725e5a Merge pull request #4185 from XiWeiGu/mips_enable_msa: MIPS: Enable MSA 2024-02-05 15:50:16 +01:00
Martin Kroeker 479e4af089 Rescale input vector more often to minimize relative error (Reference-LAPACK PR 981) 2024-02-05 15:35:24 +01:00
Martin Kroeker a4fde2c5ac Merge pull request #4451 from martin-frbg/overflow_reset: Reset "buffer management structure overflowed" state and free auxiliary struct on blas_shutdown 2024-02-05 07:27:04 +01:00
Martin Kroeker b537528feb Merge pull request #4480 from XiWeiGu/loongarch64-fixed-{s/d}amin-lsx: LoongArch64: Fixed {s/d}amin LSX optimization 2024-02-05 06:24:50 +01:00
Martin Kroeker bc7154a80d Merge pull request #4482 from martin-frbg/issue4476: Fix missing NO_AVX2 fallback for SapphireRapids in DYNAMIC_ARCH 2024-02-04 23:13:10 +01:00
Martin Kroeker 6d8a273cca Handle zero increment(s) in C910V ?AXPBY (#4483): Handle zero increment(s) 2024-02-04 22:07:51 +01:00
Martin Kroeker dbcf4f8b7d Merge pull request #4479 from XiWeiGu/loongarch-opt-axpby: Loongarch opt axpby 2024-02-04 19:50:28 +01:00
Martin Kroeker dc802dd637 Merge pull request #4474 from ChipKerchner/sgemmIncopy_PR: Vectorize in-copy packing/copying for SGEMM - up to 4X faster. 2024-02-04 18:51:09 +01:00
Martin Kroeker e307675222 Merge pull request #4478 from martin-frbg/issue4475: Fix incompatible pointer type in BFLOAT16 GEMMT 2024-02-04 16:36:40 +01:00