8033 Commits

Author SHA1 Message Date
Martin Kroeker f81c1d4b59
Fix portability problem 2024-02-27 07:19:52 +01:00
Martin Kroeker f7ffab870b
fix missing malloc 2024-02-26 23:03:10 +01:00
Martin Kroeker 38283f678e
Fix portability problems 2024-02-26 22:22:48 +01:00
Martin Kroeker 28f151808e
Avoid overriding the global USE_GEMM3M 2024-02-26 21:01:46 +01:00
Martin Kroeker 5d929d2706
avoid overriding the global USE_GEMM3M 2024-02-26 21:00:57 +01:00
Martin Kroeker a1ec94c258
Readd proper f2c'd sources for the GEMM3M tests 2024-02-26 17:46:07 +01:00
Martin Kroeker 175e357f5d
run apt-get update before fetching Ubuntu packages 2024-02-26 14:19:50 +01:00
Martin Kroeker ea167328f1
Add f2c-converted sources for GEMM3M tests 2024-02-26 14:14:58 +01:00
Martin Kroeker 5aaeca2896
fix name 2024-02-26 09:26:14 +01:00
Martin Kroeker 87dd1c710e
fix conditional gemm3m build 2024-02-26 07:37:30 +01:00
Martin Kroeker ba201c1939
Enable GEMM3M tests on supported platforms 2024-02-25 23:39:24 +01:00
Martin Kroeker 0ce794f0c3
Enable GEMM3M tests on supported platforms 2024-02-25 23:38:36 +01:00
Martin Kroeker cb8131cfd9
Merge pull request #4499 from kseniyazaytseva/new-tests
Tests for BLAS-like and BLAS API
2024-02-25 22:40:59 +01:00
Martin Kroeker 07e62a4619
Merge pull request #4523 from martin-frbg/gemmtstack
Fix a potential buffer overflow in GEMMT
2024-02-25 21:26:21 +01:00
Martin Kroeker baf88564bc
Fix potential buffer overflow 2024-02-25 19:23:41 +01:00
Martin Kroeker f860e82166
Merge pull request #4522 from martin-frbg/arm64scsum
Fix SCSUM on ARMV8 and add optimized CSUM/ZSUM for ARMV8SVE
2024-02-25 19:20:11 +01:00
Martin Kroeker 7d506984fa
fix assignment of default CSUM kernel 2024-02-25 17:57:11 +01:00
Martin Kroeker 12787775d9
add csum/zsum kernels (trivially derived from the asum ones) 2024-02-25 17:55:36 +01:00
Martin Kroeker 1c93e6a5e4
Merge pull request #4521 from martin-frbg/fixczsum
Fix BLAS extension kernels for SCSUM and DZSUM on x86_64 targets
2024-02-25 10:46:51 +01:00
Martin Kroeker 8f8ef3492a
Add CSUM and ZSUM kernels (trivially derived from their existing ASUM counterparts) 2024-02-24 23:57:50 +01:00
Martin Kroeker be5e18c6f9
Add kernel definitions for CSUM and ZSUM 2024-02-24 23:55:43 +01:00
Martin Kroeker 5403900539
Merge pull request #4520 from frjohnst/new_branch
fix conflict between PR 4515 and AIX shared obj support
2024-02-23 20:58:27 +01:00
frjohnst bdaa6705ca fix conflict between PR 4515 and AIX shared obj support 2024-02-23 10:20:48 -05:00
Martin Kroeker 0d976acdd7
Merge pull request #4485 from martin-frbg/issue4468
[WIP] Add a build option to suppress versioning and softlinking of the library
2024-02-23 14:54:12 +01:00
Martin Kroeker 2e86faa657
Merge branch 'develop' into issue4468 2024-02-23 11:39:49 +01:00
Martin Kroeker 0ff854921c
Merge pull request #4519 from martin-frbg/gh-applem1
CI: Move most Apple M1 jobs from Cirrus to Github workflow
2024-02-23 08:03:59 +01:00
Martin Kroeker 00ae343db0
Merge pull request #4518 from martin-frbg/cmakefixes
Prevent mixed gomp/omp linking and enable INTERFACE64 for ARM64 and MIPS in CMAKE builds
2024-02-22 23:15:05 +01:00
Martin Kroeker 5b953f2f8d
Disable most AppleM1 builds (replaced by gh workflows) 2024-02-22 22:41:08 +01:00
Martin Kroeker 16b488cabe
CI: Add various Apple M1 build configurations to gh workflow 2024-02-22 22:38:05 +01:00
Martin Kroeker be20588a3c
Avoid linking both libgomp and libomp in mixed clang/gfortran builds 2024-02-22 22:17:48 +01:00
Martin Kroeker ca121eb5ed
Avoid linking both libgomp and libomp in mixed clang/gfortran builds 2024-02-22 22:17:05 +01:00
Martin Kroeker 4adfe4d531
Avoid linking both libgomp and libomp in mixed clang/gfortran builds 2024-02-22 22:16:01 +01:00
Martin Kroeker 3516fff378
Avoid linking both libgomp and libomp in mixed clang/gfortran builds 2024-02-22 22:15:28 +01:00
Martin Kroeker 8fc2c2db04
Fix missing support for INTERFACE64 on ARM64 and MIPS64 2024-02-22 22:14:13 +01:00
Martin Kroeker 82b81c0bbe
Don't fail if there is no Fortran compiler 2024-02-22 22:11:50 +01:00
Martin Kroeker 5e8722a963
Merge pull request #4517 from ayappanec/SharedLibforAIX
Shared library support for AIX
2024-02-22 19:08:52 +01:00
Martin Kroeker e5c93d1f37
Merge pull request #4516 from XiWeiGu/loongarch64-cgemv-zgemv-opt
Loongarch64 cgemv zgemv opt
2024-02-22 17:34:27 +01:00
Martin Kroeker 78a9ef35b4
Merge pull request #4515 from frjohnst/second_conflict
resolve second_ conflict which breaks xlf timef
2024-02-22 16:23:12 +01:00
Ayappan Perumal 892f8ff3e5 Shared library support for AIX 2024-02-22 07:05:37 -06:00
Martin Kroeker 9d6eeea867
Merge pull request #4513 from ChipKerchner/fixNumCoresAIX
Fix get_num_cores for AIX.
2024-02-22 12:42:15 +01:00
gxw 990507e3b8 LoongArch64: Opt zgemv with LASX 2024-02-22 11:58:02 +08:00
gxw d51ffec3a2 LoongArch64: Opt cgemv with LASX 2024-02-22 11:56:04 +08:00
frjohnst 9b24b31419 resolve second_ conflict which breaks xlf timef 2024-02-21 15:52:29 -05:00
Chip-Kerchner bf2310442b Fix get_num_cores for AIX. 2024-02-21 13:26:28 -06:00
Martin Kroeker 99ef76f9bb
Merge pull request #4511 from ErnstPeng/feature-branch
Optimized cgemm kernel 16x4 LASX for LoongArch
2024-02-21 14:25:57 +01:00
pengxu 4787a55c64 Optimized cgemm kernel 16x4 LASX for LoongArch 2024-02-21 15:28:47 +08:00
Martin Kroeker ebbf5b3ea0
Merge pull request #4504 from sergei-lewis/dev/slewis/ci
Add builds and unit tests for new RISCV platforms to CI
2024-02-16 22:48:28 +01:00
Sergei Lewis 461ecabb22 add RISCV64_ZVL128B and RISCV64_ZVL256B targets to CI flows and to README.md 2024-02-16 16:26:29 +00:00
Sergei Lewis ba17758c02 fix axpy implementations where y has a stride of 0 2024-02-16 16:00:38 +00:00
Martin Kroeker 5266998b9f
Merge pull request #4498 from mseminatore/win_tidy
blas_server_win32.c pass to clean up code
2024-02-15 14:37:37 +01:00