Commit Graph

8147 Commits

Author SHA1 Message Date
Martin Kroeker
58659730a4 Merge pull request #4595 from martin-frbg/issue4583-2
Temporarily revert S/DNRM2 on NeoverseN1 and Apple M to the older NEON kernel
2024-04-02 17:06:03 +02:00
Martin Kroeker
9ead81bd39 Revert S/DNRM2 to the base NEON kernel to fix precision loss 2024-04-02 15:59:20 +02:00
Martin Kroeker
29995b2969 Merge pull request #4594 from mattip/openblas.pc.il
fix typo
2024-04-02 08:27:43 +02:00
Matti Picus
5b33e648b4 fix typo 2024-04-02 08:54:56 +11:00
Martin Kroeker
b1e8ba5017 Merge pull request #4587 from mseminatore/win_queue_fix
Address Windows thread server re-entrant queue bug #4582
2024-03-29 14:36:21 +01:00
Martin Kroeker
8267fcfda8 Merge pull request #4588 from XiWeiGu/loongarch_fixed_dzamax
loongarch: Fixed dzamax
2024-03-29 13:48:27 +01:00
Mark Seminatore
b0ad8a78ff code to fix lost work in case of re-entrant calls to exec_blas_async() 2024-03-28 15:24:52 -07:00
Martin Kroeker
e1638ea43a Merge pull request #4586 from martin-frbg/potrf-para
use atomic acq/rel operations in potrf_parallel as in the corresponding getrf_parallel
2024-03-28 14:51:20 +01:00
Martin Kroeker
2dda40d280 use atomic operations as in the corresponding getrf 2024-03-28 11:33:31 +01:00
gxw
96607cbb98 loongarch: Fixed dzamax
Initialize the registers to prevent sporadic errors.
2024-03-25 23:17:53 -04:00
Martin Kroeker
9af2a9dc3b Merge pull request #4579 from ChipKerchner/fixInializerPriority
Fix global (static) constructor priority so that OpenBLAS gets initialized before other libraries. Other unit test AIX fix.
2024-03-25 22:36:29 +01:00
Chip Kerchner
0e0d0bce1a Fix global (static) constructor priority so that OpenBLAS gets initialized before other libraries. Other unit test AIX fix. 2024-03-25 15:11:55 -05:00
Martin Kroeker
4059a75c9c Merge pull request #4578 from jerryz123/patch-1
Fix README formatting error
2024-03-25 17:17:01 +01:00
Jerry Zhao
0b814ab8b9 Fix README formatting error 2024-03-25 08:14:00 -07:00
Martin Kroeker
87f83ebe9c Merge pull request #4575 from martin-frbg/fixup4503
Restore outer loop of blas_buffer_inuse setup for parallel OpenMP
2024-03-24 20:00:09 +01:00
Martin Kroeker
88b5330ae7 Restore outer loop of blas_buffer_inuse setup 2024-03-24 18:33:21 +01:00
Martin Kroeker
52b71a1673 Filter out FFLAGS that flang-new from LLVM18 no longer supports (#4569)
* Filter out FFLAGS that flang-new from LLVM18 no longer supports
2024-03-22 17:02:39 +01:00
Martin Kroeker
3d2a9e4a61 Merge pull request #4567 from RajalakshmiSR/P9AIX
POWER9: Use default param values from POWER8 on AIX
2024-03-21 00:12:37 +01:00
Martin Kroeker
9ad9b52884 Merge pull request #4568 from martin-frbg/fixbenchloops
Fix bad assignment of OPENBLAS_LOOPS variable in several benchmarks
2024-03-21 00:09:34 +01:00
Martin Kroeker
3f1ec74fe7 Fix OPENBLAS_LOOPS assignment 2024-03-20 19:22:48 +01:00
Martin Kroeker
fe39c891a6 Fix OPENBLAS_LOOPS assignment 2024-03-20 19:21:37 +01:00
Martin Kroeker
ffcbaca167 Fix OPENBLAS_LOOPS assignment 2024-03-20 19:20:16 +01:00
Martin Kroeker
05d0438c25 Fix OPENBLAS_LOOPS assignment 2024-03-20 19:19:11 +01:00
Rajalakshmi Srinivasaraghavan
f5b2a877e2 POWER9: Use default param values from POWER8 on AIX
AIX uses KERNEL.POWER8 optimization on POWER9 and changing
the default GEMM parameters in param.h to use POWER8 values
on POWER9.
2024-03-20 10:17:49 -05:00
Martin Kroeker
b4a1153648 Merge pull request #4566 from XiWeiGu/fix_loongarch_lsx
LoongArch: Fixed LSX opt
2024-03-19 10:21:21 +01:00
gxw
50869f6ca8 loongarch: Fixed zrot LSX opt 2024-03-19 10:08:11 +08:00
gxw
b5eb9d6bac loongarch: Fixed {sc/dz}amax LSX opt 2024-03-19 09:56:11 +08:00
gxw
ad13e04669 loongarch: Fixed {s/d/sc/dz}amin LSX opt 2024-03-19 09:18:44 +08:00
gxw
bbf82cb624 loongarch: Fixed {s/d}axpby LSX opt 2024-03-18 17:51:42 +08:00
gxw
ac460eb42a loongarch: Fixed i{c/z}amin LSX opt 2024-03-18 17:15:58 +08:00
Martin Kroeker
56d114b245 Merge pull request #4565 from martin-frbg/issue4564
Fix argument lists of RELAPACK_?gemmt for good
2024-03-17 20:41:58 +01:00
Martin Kroeker
2e9ce9bb07 Fix argument lists of RELAPACK_?gemmt for good 2024-03-17 19:20:19 +01:00
Martin Kroeker
79cb121ab9 Merge pull request #4563 from XiWeiGu/loongarch_fix_lasx
Loongarch: Fixed LASX opt
2024-03-16 10:34:32 +01:00
gxw
60e251a1f8 loongarch: Fixed {sc/dz}amax LASX opt 2024-03-16 14:52:17 +08:00
gxw
a10dde5554 loongarch: Fixed {s/d/sc/dz}amin LASX opt 2024-03-16 14:52:14 +08:00
gxw
6534d378b7 loongarch: Fixed {s/d/c/z}sum LASX opt 2024-03-16 14:52:10 +08:00
gxw
6159cffc58 loongarch: Fixed i{s/c/z}amin LASX opt 2024-03-16 14:52:06 +08:00
gxw
7d755912b9 loongarch: Fixed {s/d/c/z}axpby LASX opt 2024-03-16 14:51:56 +08:00
Martin Kroeker
66bde6243e Merge pull request #4503 from shivammonaka/OpenMP-Locks
OpenMP locks instead of busy-waiting with NUM_PARALLEL
2024-03-14 20:56:04 +01:00
Martin Kroeker
dc0338af47 Merge pull request #4560 from martin-frbg/issue4551-3
Add support for negative increments to the ?NRM2 kernels for RISC-V RVV targets
2024-03-13 14:48:56 +01:00
Martin Kroeker
cf80bd8500 Update nrm2_rvv.c 2024-03-13 13:07:26 +01:00
Martin Kroeker
9baa757905 Update nrm2_vector.c 2024-03-13 11:40:14 +01:00
Martin Kroeker
18a6db6862 Update nrm2_vector.c 2024-03-13 11:10:26 +01:00
Martin Kroeker
855bbdda4f Merge pull request #4556 from ChipKerchner/updateREADMEAIX
Update README for build instructions on AIX and OpenXL.
2024-03-12 23:12:50 +01:00
Martin Kroeker
3752e73919 handle incx < 0 2024-03-12 20:44:01 +01:00
Martin Kroeker
db70c7f7fb handle incx < 0 2024-03-12 20:42:11 +01:00
Martin Kroeker
dee8557d58 handle incx < 0 2024-03-12 20:40:29 +01:00
Martin Kroeker
d9dff17aec handle incx < 0 2024-03-12 20:38:23 +01:00
Martin Kroeker
5802e7a62f Merge pull request #4559 from martin-frbg/issue4551-2
Remove another unwanted early exit in the ThunderX2/NeoN1/AppleM ?NRM2 kernels
2024-03-12 20:34:20 +01:00
Martin Kroeker
552c521353 remove another early exit for incx < 0 2024-03-12 18:49:27 +01:00