Commit Graph

7973 Commits

Author SHA1 Message Date
Martin Kroeker
8fc2c2db04 Fix missing support for INTERFACE64 on ARM64 and MIPS64 2024-02-22 22:14:13 +01:00
Martin Kroeker
82b81c0bbe Dont fail if there is no Fortran compiler 2024-02-22 22:11:50 +01:00
Martin Kroeker
5e8722a963 Merge pull request #4517 from ayappanec/SharedLibforAIX
Shared library support for AIX
2024-02-22 19:08:52 +01:00
Martin Kroeker
e5c93d1f37 Merge pull request #4516 from XiWeiGu/loongarch64-cgemv-zgemv-opt
Loongarch64 cgemv zgemv opt
2024-02-22 17:34:27 +01:00
Martin Kroeker
78a9ef35b4 Merge pull request #4515 from frjohnst/second_conflict
resolve second_ conflict which breaks xlf timef
2024-02-22 16:23:12 +01:00
Ayappan Perumal
892f8ff3e5 Shared library support for AIX 2024-02-22 07:05:37 -06:00
Martin Kroeker
9d6eeea867 Merge pull request #4513 from ChipKerchner/fixNumCoresAIX
Fix get_num_cores for AIX.
2024-02-22 12:42:15 +01:00
gxw
990507e3b8 LoongArch64: Opt zgemv with LASX 2024-02-22 11:58:02 +08:00
gxw
d51ffec3a2 LoongArch64: Opt cgemv with LASX 2024-02-22 11:56:04 +08:00
frjohnst
9b24b31419 resolve second_ conflict which breaks xlf timef 2024-02-21 15:52:29 -05:00
Chip-Kerchner
bf2310442b Fix get_num_cores for AIX. 2024-02-21 13:26:28 -06:00
Martin Kroeker
99ef76f9bb Merge pull request #4511 from ErnstPeng/feature-branch
Optimized cgemm kernel 16x4 LASX for LoongArch
2024-02-21 14:25:57 +01:00
pengxu
4787a55c64 Optimized cgemm kernel 16x4 LASX for LoongArch 2024-02-21 15:28:47 +08:00
Martin Kroeker
ebbf5b3ea0 Merge pull request #4504 from sergei-lewis/dev/slewis/ci
Add builds and unit tests for new RISCV platforms to CI
2024-02-16 22:48:28 +01:00
Sergei Lewis
461ecabb22 add RISCV64_ZVL128B and RISCV64_ZVL256B targets to CI flows and to README.md 2024-02-16 16:26:29 +00:00
Sergei Lewis
ba17758c02 fix axpy implementations where y has a stride of 0 2024-02-16 16:00:38 +00:00
Martin Kroeker
5266998b9f Merge pull request #4498 from mseminatore/win_tidy
blas_server_win32.c pass to clean up code
2024-02-15 14:37:37 +01:00
Martin Kroeker
57dd894af0 Merge pull request #4502 from dmikushin/add-missing-use_gemm3m-macro
Add missing USE_GEMM3M macro into CMake
2024-02-15 11:13:36 +01:00
Mark Seminatore
b29fd48998 Merge branch 'develop' into win_tidy 2024-02-12 10:23:17 -08:00
Mark Seminatore
0a7ae326d2 Merge branch 'win_tidy' of https://github.com/mseminatore/OpenBLAS into win_tidy 2024-02-12 10:22:26 -08:00
Mark Seminatore
10548a0460 update contributors 2024-02-12 10:22:12 -08:00
Dmitry Mikushin
d0f5dc763b Adding USE_GEMM3M macro to kernel targets, so that the *gemm3m functions and parameters can be included into the gotoblas structure. Fixes #4500 2024-02-12 02:29:58 +01:00
Martin Kroeker
b1ae777afb Merge pull request #4497 from sergei-lewis/dev/slewis/zaxpy
Fix axpy test hangs when n==0. Reenable zaxpy_vector kernel for C910V.
2024-02-09 16:22:00 +01:00
Sergei Lewis
ff1523163f Fix axpy test hangs when n==0. Reenable zaxpy_vector kernel for C910V. 2024-02-09 12:59:14 +00:00
Martin Kroeker
ba3bfe85ee Merge pull request #4495 from martin-frbg/update-gensymbol
Update gensymbol with recently added CBLAS interfaces and LAPACK/LAPACKE functions
2024-02-09 08:55:22 +01:00
Martin Kroeker
93872f4681 drop the ?laqz? symbols for now (not translatable by f2c) 2024-02-08 23:02:09 +01:00
Mark Seminatore
98c56a7314 more cleanup 2024-02-08 13:50:15 -08:00
Martin Kroeker
83bec51355 Update with recently added CBLAS interfaces and LAPACK/LAPACKE functions 2024-02-08 21:23:48 +01:00
Martin Kroeker
974f29c4e9 Merge pull request #4494 from ChipKerchner/fixPower10CPUID
Make sure CPU ID works for all POWER_10 conditions
2024-02-08 21:21:32 +01:00
Chip Kerchner
d408ecedba Add environment variable to display coretype for dynamic arch. 2024-02-08 12:17:18 -06:00
Martin Kroeker
a96a04ee61 Merge pull request #4493 from martin-frbg/issue4475-3
Fix incompatible pointer types in the declarations of C/ZAXPBY
2024-02-08 16:50:06 +01:00
Chip Kerchner
ac6b4b7aa4 Make sure CPU ID works for all POWER_10 conditions 2024-02-08 08:56:30 -06:00
Martin Kroeker
500ac4de5e fix incompatible pointer types 2024-02-08 13:18:34 +01:00
Martin Kroeker
b3fa16345d fix prototype for c/zaxpby 2024-02-08 13:15:34 +01:00
Martin Kroeker
e9cfb7fd30 Merge pull request #4491 from martin-frbg/fixup-4488
fix sbgemm bfloat16 conversion errors introduced in PR 4488
2024-02-07 21:34:40 +01:00
Martin Kroeker
e9f480111e fix sbgemm bfloat16 conversion errors introduced in PR 4488 2024-02-07 19:57:18 +01:00
Martin Kroeker
22b487b622 Merge pull request #4488 from martin-frbg/issue4475-2
Separate the interface for SBGEMMT from GEMMT
2024-02-07 18:40:35 +01:00
Martin Kroeker
818bf30628 Merge pull request #4490 from ChipKerchner/missingCPUIDsForAIX
Add missing CPU ID definitions for old versions of AIX.
2024-02-07 17:31:26 +01:00
Martin Kroeker
344763331a Merge pull request #4484 from martin-frbg/lapack981
Rescale input vector more often in C/ZLARFGP (Reference-LAPACK PR 981)
2024-02-07 15:22:48 +01:00
Chip Kerchner
08ce6b1c1c Add missing CPU ID definitions for old versions of AIX. 2024-02-07 07:54:06 -06:00
Martin Kroeker
fb99fc2e6e fix type conversion warnings 2024-02-07 13:42:08 +01:00
Martin Kroeker
08e479f956 Merge pull request #4487 from ErnstPeng/feature-branch
Optimized zgemm kernel 8x4 LASX, 4x4 LSX and cgemm kernel 8x4 LSX for LoongArch
2024-02-07 13:19:04 +01:00
Martin Kroeker
d4db6a9f16 Separate the interface for SBGEMMT from GEMMT due to differences in GEMV arguments 2024-02-06 22:23:47 +01:00
pengxu
fe3da43b7d Optimized zgemm kernel 8*4 LASX, 4*4 LSX and cgemm kernel 8*4 LSX for LoongArch 2024-02-06 11:49:01 +08:00
Martin Kroeker
e5d2725e5a Merge pull request #4185 from XiWeiGu/mips_enable_msa
MIPS: Enable MSA
2024-02-05 15:50:16 +01:00
Martin Kroeker
479e4af089 Rescale input vector more often to minimize relative error (Reference-LAPACK PR 981) 2024-02-05 15:35:24 +01:00
Martin Kroeker
a4fde2c5ac Merge pull request #4451 from martin-frbg/overflow_reset
Reset "buffer management structure overflowed" state and free auxiliary struct on blas_shutdown
2024-02-05 07:27:04 +01:00
Martin Kroeker
b537528feb Merge pull request #4480 from XiWeiGu/loongarch64-fixed-{s/d}amin-lsx
LoongArch64: Fixed {s/d}amin LSX optimization
2024-02-05 06:24:50 +01:00
Martin Kroeker
bc7154a80d Merge pull request #4482 from martin-frbg/issue4476
Fix missing NO_AVX2 fallback for SapphireRapids in DYNAMIC_ARCH
2024-02-04 23:13:10 +01:00
Martin Kroeker
6d8a273cca Handle zero increment(s) in C910V ?AXPBY (#4483)
* Handle zero increment(s)
2024-02-04 22:07:51 +01:00