Commit Graph

8796 Commits

Author SHA1 Message Date
Martin Kroeker
f66e6d32c2 Merge pull request #4953 from NickelWenzel/fix_trtrs_return_types
fix: return types of *trtrs routines
2024-10-25 23:29:24 +02:00
Martin Kroeker
a8bb105ed6 Merge pull request #4848 from haampie/fix/cmake-min-version
cmake: set `CMP0042` to `NEW`
2024-10-25 20:59:13 +02:00
Martin Kroeker
0e6a2cc93c bump the minimum_required version instead 2024-10-25 16:47:52 +02:00
Martin Kroeker
ac736820d7 Merge pull request #4955 from cdaley/optimize_gemv_forwarding
Optimize gemv forwarding on ARM64 systems
2024-10-25 13:43:54 +02:00
Chris Daley
cb48505251 optimize gemv forwarding on ARM64 systems 2024-10-24 21:05:26 -07:00
nickel
79f4bbd4cd fix: return types of *trtrs routines 2024-10-24 11:20:02 +02:00
Martin Kroeker
72461f1c8c Merge pull request #4950 from ayappanec/fix-aix-build
Fix AIX build
2024-10-23 16:40:02 +02:00
Ayappan Perumal
020cce1068 Fix build issues with gcc compiler as well 2024-10-23 04:24:06 -05:00
Ayappan Perumal
b6ec73e77c Fix AIX build 2024-10-21 07:38:03 -05:00
Martin Kroeker
8a0cd5fcef Merge pull request #4949 from martin-frbg/mingw32-14.2
work around mingw32-gfortran 14.2 miscompiling CBLAS1 tests
2024-10-20 21:52:57 +02:00
Martin Kroeker
4dba6ce6ea work around mingw32-gfortran 14.2 miscompiling CBLAS1 tests 2024-10-20 20:25:06 +02:00
Martin Kroeker
a93ec74e95 Merge pull request #4948 from martin-frbg/fixhavesve
Properly report HAVE_SVE in ARM64 autodetection where applicable
2024-10-18 20:00:42 +02:00
Martin Kroeker
c4bb4e74fc NeoverseN2 has SVE too 2024-10-18 14:50:55 +02:00
Martin Kroeker
86720778ef write HAVE_SVE to config where applicable 2024-10-18 14:14:43 +02:00
Martin Kroeker
016bdb9b0b Merge pull request #4946 from XiWeiGu/la64_omatcopy_lasx
LoongArch64: Opt somatcopy with LASX
2024-10-18 14:03:06 +02:00
gxw
ffaa5765a4 Bench: Add omatcopy 2024-10-18 11:07:52 +08:00
Martin Kroeker
a93897276b Merge pull request #4943 from martin-frbg/update_readme
Update README.md
2024-10-17 21:13:48 +02:00
Martin Kroeker
3fc1225dd6 Merge branch 'OpenMathLib:develop' into update_readme 2024-10-17 21:08:58 +02:00
Martin Kroeker
33078d11e4 stress importance of TARGET setting in DYNAMIC_ARCH builds 2024-10-17 21:07:49 +02:00
Martin Kroeker
15a57598f5 Merge pull request #4944 from ChipKerchner/vectorizeBF16GEMV
[POWER] Vectorize BF16 GEMV
2024-10-17 19:21:07 +02:00
Chip Kerchner
ab71a1edf2 Better VSX. 2024-10-17 08:25:02 -05:00
gxw
bb31bbef52 LoongArch64: Opt somatcopy_ct with LASX 2024-10-17 11:45:13 +00:00
gxw
b37129341b LoongArch64: Opt somatcopy_cn with LASX 2024-10-17 11:27:55 +00:00
gxw
acf6cab304 LoongArch64: Opt somatcopy_rn with LASX 2024-10-17 09:50:02 +00:00
gxw
15edb441bf LoongArch64: Opt somatcopy_rt with LASX 2024-10-17 09:15:42 +00:00
Martin Kroeker
457d1c6972 remove unused CI badges, wiki->docs, xianyi->OpenMathLib 2024-10-17 10:33:08 +02:00
Martin Kroeker
6a60eb1a02 Merge pull request #4924 from XiWeiGu/la64_readme
LoongArch64: Update README.md
2024-10-16 09:38:18 +02:00
Martin Kroeker
8483a71169 Merge pull request #4937 from martin-frbg/lapack1064
Fix leading dimension for B in LAPACK tests for GGEV (Reference-LAPACK PR 1064)
2024-10-14 21:52:41 +02:00
Martin Kroeker
22628f1a69 Fix leading dimension for B (Reference-LAPACK PR 1064) 2024-10-14 18:59:03 +02:00
Martin Kroeker
27ed6da331 Fix leading dimension for B (Reference-LAPACK PR 1064) 2024-10-14 18:57:50 +02:00
Martin Kroeker
7018c1b001 Fix leading dimension for B (Reference-LAPACK PR 1064) 2024-10-14 18:56:44 +02:00
Martin Kroeker
a659f40fe1 Fix leading dimension for B (Reference-LAPACK PR 1064) 2024-10-14 18:53:30 +02:00
Martin Kroeker
c979c1d948 Merge pull request #4936 from martin-frbg/fixmips64generic
Fix unroll parameter selection for MIPS64_GENERIC
2024-10-14 08:13:27 +02:00
Martin Kroeker
a47b3c8867 Fix unroll parameter selection for MIPS64_GENERIC 2024-10-13 22:54:34 +02:00
Chip Kerchner
2391dc1c0f Merge branch 'vectorizeBF16GEMV' of github.ibm.com:PowerAppLibs/OpenBLAS into vectorizeBF16GEMV 2024-10-13 13:48:33 -05:00
Chip Kerchner
36bd3eeddf Vectorize BF16 GEMV (VSX & MMA). Use GEMM_GEMV_FORWARD_BF16 (for Power). 2024-10-13 13:46:11 -05:00
Chip Kerchner
f8e113f27b Replace types with include file. 2024-10-13 10:55:03 -05:00
Chip Kerchner
a53a197934 Merge remote-tracking branch 'origin/develop' into vectorizeBF16GEMV 2024-10-12 15:15:17 -05:00
Martin Kroeker
3184b7f209 Merge pull request #4933 from ChipKerchner/thread_sbgemv
Change multi-threading logic for SBGEMV to be the same as SGEMV.
2024-10-12 17:19:41 +02:00
Chip Kerchner
0082240044 Merge branch 'thread_sbgemv' into vectorizeBF16GEMV 2024-10-11 16:13:59 -05:00
Chip Kerchner
1d51ca5798 Change multi-threading logic for SBGEMV to be the same as SGEMV. 2024-10-11 16:08:48 -05:00
Chip Kerchner
c8f53b85ce Merge remote-tracking branch 'origin/develop' into vectorizeBF16GEMV 2024-10-11 11:10:20 -05:00
Martin Kroeker
18a23c23f7 Merge pull request #4929 from martin-frbg/issue4905
Fix CBLAS_?GEMMT filling in the wrong triangle for Row-Major
2024-10-11 08:54:02 +02:00
Martin Kroeker
5a79446bdb Merge pull request #4918 from HaoZeke/testFixes
TST,BUG: Explicitly allow running tests multiple times
2024-10-10 21:53:18 +02:00
Martin Kroeker
7ba6591ff2 Merge branch 'OpenMathLib:develop' into issue4905 2024-10-10 21:50:38 +02:00
Martin Kroeker
550bc77832 Fix expectation values for CblasRowMajor order 2024-10-10 20:39:29 +02:00
Martin Kroeker
e0ad20f72b Merge pull request #4932 from martin-frbg/cirrusosxndk
Update Android NDK install path for M1/armv7 crossbuild on CirrusCI
2024-10-10 16:18:07 +02:00
Martin Kroeker
e4bc5e4718 remove stray quote 2024-10-10 11:02:56 +02:00
Martin Kroeker
b89fb9632f Update Android NDK install path for M1/armv7 crossbuild 2024-10-10 10:19:11 +02:00
Martin Kroeker
e52d9b4cf1 Merge pull request #4928 from austinpagan/czgemm_in_c
CGEMM & ZGEMM using C code, Power only, P10 only.
2024-10-09 20:26:21 +02:00