Martin Kroeker
2020569705
fix NAN handling and make it depend on dummy2 parameter
2024-07-17 23:55:54 +02:00
Martin Kroeker
3870995f01
make NAN handling depend on dummy2 parameter
2024-07-17 23:54:24 +02:00
Martin Kroeker
7284c533b5
make NAN handling depend on dummy2 parameter
2024-07-17 23:50:40 +02:00
Martin Kroeker
73751218a4
make NAN handling depend on dummy2 parameter
2024-07-17 23:41:26 +02:00
Martin Kroeker
b9bfc8ce09
make NAN handling depend on dummy2 parameter
2024-07-17 23:29:50 +02:00
Martin Kroeker
eb4879e04c
make NAN handling depend on the dummy2 parameter
2024-07-17 23:24:19 +02:00
Martin Kroeker
ee87cb90d0
Merge pull request #4803 from iha-taisei/SVESupportSDGEMV
...
A64FX: Add support for SVE to SGEMV/DGEMV kernels.
2024-07-17 23:14:21 +02:00
Martin Kroeker
e9f6aa46a4
Merge pull request #4800 from vlad0x00/patch-2
...
Add missing parentheses
2024-07-16 16:32:04 +02:00
Martin Kroeker
b1aa2e1768
Merge pull request #4802 from markdryan/markdryan/rvv_axpby_incy0
...
Fix axpby_rvv kernels for cases where inc_y = 0
2024-07-16 14:22:38 +02:00
iha fujitsu
0985fdc82b
A64FX: Add support for SVE to SGEMV/DGEMV kernels.
2024-07-16 17:31:33 +09:00
Vladimir Nikolić
56e1782ffb
Add another missing parenthesis
2024-07-15 15:15:23 -07:00
Vladimir Nikolić
127ea5d0d9
Add missing parenthesis
2024-07-15 15:12:21 -07:00
Martin Kroeker
a3c10c6c25
Merge pull request #4799 from martin-frbg/issue4762
...
Improve the error message for (p)thread creation failure
2024-07-15 20:57:56 +02:00
Martin Kroeker
a373d0f107
Improve the error message for thread creation failure
2024-07-15 18:32:21 +02:00
Mark Ryan
67bf4b6998
Fix axpby_rvv kernels for cases where inc_y = 0
...
The following openblas_utest tests fail when RISCV64_ZVL128B is
enabled.
TEST 89/103 axpby:zaxpby_inc_0 [FAIL]
TEST 92/103 axpby:caxpby_inc_0 [FAIL]
TEST 95/103 axpby:daxpby_inc_0 [FAIL]
TEST 98/103 axpby:saxpby_inc_0 [FAIL]
The issue is that the vectorized kernels do not work when inc_y == 0.
This patch updates the kernels to fall back to the scalar algorithms
when inc_y == 0, fixing the failing tests.
Signed-off-by: Mark Ryan <markdryan@rivosinc.com>
2024-07-15 14:24:47 +00:00
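The inc_y == 0 problem described in this commit can be illustrated with a small, self-contained sketch. This is not the RVV kernel code: `axpby_scalar`, `axpby_blocked`, `axpby_sketch` and `LANES` are hypothetical names, the "vector" path is only emulated with a load/compute/store loop, and strides are assumed to be non-negative. With inc_y == 0 every iteration reads and writes the same element, so a kernel that loads several y values before storing any of them loses the dependency between iterations.

```c
#include <stddef.h>

#define LANES 4 /* hypothetical vector width used only in this sketch */

/* Scalar reference: one element at a time, so with inc_y == 0 each
 * iteration sees the value written by the previous one. */
void axpby_scalar(size_t n, double alpha, const double *x, size_t inc_x,
                  double beta, double *y, size_t inc_y)
{
    for (size_t i = 0; i < n; i++)
        y[i * inc_y] = alpha * x[i * inc_x] + beta * y[i * inc_y];
}

/* Emulation of a vectorized kernel: load up to LANES elements of y,
 * compute on the loaded copies, then store them back. */
void axpby_blocked(size_t n, double alpha, const double *x, size_t inc_x,
                   double beta, double *y, size_t inc_y)
{
    for (size_t i = 0; i < n; i += LANES) {
        size_t len = (n - i < LANES) ? n - i : LANES;
        double tmp[LANES];
        for (size_t k = 0; k < len; k++)   /* strided load */
            tmp[k] = y[(i + k) * inc_y];
        for (size_t k = 0; k < len; k++)   /* lane-wise compute */
            tmp[k] = alpha * x[(i + k) * inc_x] + beta * tmp[k];
        for (size_t k = 0; k < len; k++)   /* strided store */
            y[(i + k) * inc_y] = tmp[k];
    }
}

/* Dispatcher in the spirit of the fix: fall back to the scalar loop
 * when inc_y == 0, otherwise use the blocked path. */
void axpby_sketch(size_t n, double alpha, const double *x, size_t inc_x,
                  double beta, double *y, size_t inc_y)
{
    if (inc_y == 0)
        axpby_scalar(n, alpha, x, inc_x, beta, y, inc_y);
    else
        axpby_blocked(n, alpha, x, inc_x, beta, y, inc_y);
}
```

For x = {1, 2, 3, 4}, alpha = beta = 1, y[0] = 0 and inc_y = 0, the scalar loop accumulates 1 + 2 + 3 + 4 into y[0], while the blocked emulation ends with only the last lane's value. For nonzero inc_y the two paths agree.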
Martin Kroeker
6013b36b16
Merge pull request #4796 from martin-frbg/ppcbuf
...
Suffix BUFFER_SIZEs on POWER as UL to prevent int overflow in computations
2024-07-12 11:06:45 +02:00
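As a rough illustration of why a UL suffix matters here: a macro written with plain integer literals has type int, so any arithmetic on it is done in int and can overflow before the result is widened. The sketch below uses a hypothetical value (it is not copied from the OpenBLAS headers) and hypothetical names (`BUFFER_SIZE_INT`, `BUFFER_SIZE_UL`, `int_product_overflows`, `total_buffer_bytes`). It assumes a 32-bit int, and a 64-bit unsigned long for the final product to be representable.

```c
#include <limits.h>

#define BUFFER_SIZE_INT (64 << 22)   /* hypothetical value; expression has type int */
#define BUFFER_SIZE_UL  (64UL << 22) /* same value; expression has type unsigned long */

/* Returns 1 if BUFFER_SIZE_INT * count would not fit in an int.
 * The check divides instead of multiplying, so it never overflows itself. */
int int_product_overflows(int count)
{
    return count > 0 && BUFFER_SIZE_INT > INT_MAX / count;
}

/* Total size of `count` buffers, computed entirely in unsigned long
 * because the macro operand already has that type. */
unsigned long total_buffer_bytes(unsigned long count)
{
    return BUFFER_SIZE_UL * count;
}
```

With these values, 16 buffers would overflow a 32-bit int product, whereas `total_buffer_bytes(16)` yields 4294967296 where unsigned long is 64 bits wide.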
Martin Kroeker
9789034281
Merge branch 'OpenMathLib:develop' into ppcbuf
2024-07-12 11:05:46 +02:00
Martin Kroeker
5d08ec7ff3
Merge pull request #4782 from martin-frbg/azurewincl
...
Fix NAN handling in ARM/generic SCAL; have AzureCI Windows show errors on failure
2024-07-11 23:55:15 +02:00
Martin Kroeker
dfc11ef248
Merge pull request #4791 from ChipKerchner/vectorizeSBGEMMincopy
...
Vectorize SBGEMM incopy for Power10 - 4x faster.
2024-07-11 21:38:57 +02:00
Martin Kroeker
2fefdfa2b8
Merge branch 'OpenMathLib:develop' into azurewincl
2024-07-11 21:38:21 +02:00
Martin Kroeker
475bd2452b
Suffix BUFFERSIZEs as UL to prevent int overflow in computations
2024-07-11 20:13:57 +02:00
Martin Kroeker
b70227ad62
Merge pull request #4795 from pkubaj/patch-1
...
Fix build on FreeBSD/powerpc64*
2024-07-11 19:00:07 +02:00
Martin Kroeker
8277828fdc
Merge pull request #4785 from rgommers/docs-install
...
Rewrite "Install OpenBLAS" docs page
2024-07-11 18:49:07 +02:00
Martin Kroeker
f0fc7249f1
Merge pull request #4792 from martin-frbg/issue4790
...
Fix core assignment in cpu detection for Intel family 15
2024-07-11 17:38:43 +02:00
Martin Kroeker
362856fece
Merge pull request #4778 from JAicewizard/develop
...
Add support for RISCV64_GENERIC in cmake
2024-07-11 15:12:46 +02:00
Martin Kroeker
1d77647d1b
Merge pull request #4769 from drupol/fix-buffersize-value
...
openblas: fix `BUFFERSIZE` value
2024-07-11 14:45:50 +02:00
Piotr Kubaj
4c12090776
Fix build on FreeBSD/powerpc64*
2024-07-10 22:21:48 +00:00
Chip Kerchner
f708944fea
Add all 4 variations of the SBGEMM to compare_sgemm_sbgemm
2024-07-10 13:07:48 -05:00
Martin Kroeker
e706bc1ec0
Fix core assignment for Intel family 15
2024-07-09 20:22:56 +02:00
Chip Kerchner
cb154832f8
Vectorize SBGEMM incopy - 4x faster.
2024-07-09 13:10:03 -05:00
Martin Kroeker
a5c04e326a
Update scal.c
2024-07-04 22:28:01 +02:00
Ralf Gommers
268dcd8f45
docs: convert remaining install sections (Android, iOS, FreeBSD, Cortex-M)
2024-07-04 19:16:30 +02:00
Ralf Gommers
452014341e
docs: rework building from source on Windows section
2024-07-04 19:16:30 +02:00
Ralf Gommers
4547908901
docs: rewrite "Install OpenBLAS" page (part 1: binaries, basic from source)
2024-07-04 19:15:51 +02:00
Martin Kroeker
e1eef56e05
Merge pull request #4783 from martin-frbg/cpuid_meteor
...
Add another CPUID for Intel Meteor Lake
2024-07-04 18:09:27 +02:00
Martin Kroeker
536200bc9e
fix handling of INF or NAN
2024-07-04 17:47:19 +02:00
Martin Kroeker
3063d03021
Add another CPUID for Meteor Lake
2024-07-04 16:05:05 +02:00
Martin Kroeker
b422742899
collect error output from ctest, if any
2024-07-04 15:42:34 +02:00
Jaap Aarts
cea4abcac0
Fix compiling on mingw
2024-07-04 14:56:16 +02:00
Martin Kroeker
f729013d2e
Merge pull request #4781 from rgommers/fix-docs-deployment
...
fix CI job to deploy docs, and make it run on pull requests too
2024-07-03 21:00:18 +02:00
Ralf Gommers
6ede8b14c6
ci: fix CI job to deploy docs, and make it run on pull requests too
2024-07-03 20:14:02 +02:00
Martin Kroeker
9836883ee9
Merge pull request #4780 from martin-frbg/azureosx12
...
AzureCI: Update OSX jobs to use the macos-12 image
2024-07-03 19:53:05 +02:00
Martin Kroeker
df81b159e8
Merge pull request #4774 from rgommers/improve-docs
...
Improve documentation content, formatting, and html theme
2024-07-03 17:10:44 +02:00
Martin Kroeker
2df4007425
Update compiler and sdk versions for osx
2024-07-03 16:48:43 +02:00
Martin Kroeker
acf0c3ccaf
Merge pull request #4777 from ev-br/sgesdd_ci_err
...
ignore the gesdd failure on codspeed
2024-07-03 15:21:33 +02:00
Martin Kroeker
74f059a3ce
Update OSX jobs to use the macos-12 image
2024-07-03 13:24:02 +02:00
Evgeni Burovski
cd3c167c28
ignore sgesdd failure on codspeed
...
In https://github.com/OpenMathLib/OpenBLAS/issues/4776
we're hitting
** On entry to SLASCL parameter number 4 had an illegal value
on codspeed, but not outside (either locally or on github runners)
2024-07-03 12:35:26 +03:00
Jaap Aarts
9d0abe2d26
Add support for RISCV64_GENERIC in cmake
2024-07-03 01:49:37 +02:00
Evgeni Burovski
5b385fd453
WIP: fish out the gesdd failure?
2024-07-02 20:10:26 +03:00
Ralf Gommers
c1c0dbfd60
docs: address review comments on PR 4774
2024-07-02 14:05:47 +02:00