Isuru Fernando
6b2651ece3
Fix building test_sbgemm
2023-11-19 02:57:13 -06:00
Martin Kroeker
22aa401656
Temporarily disable the AVX512 CASUM/ZASUM microkernels for any version of NVIDIA HPC ( #4327 )
...
* Temporarily disable the C/ZASUM microkernels for any version of NVHPC
2023-11-19 00:04:31 +01:00
Martin Kroeker
47b03fd4b4
Copy XCode15-specific workaround to Fortran flags to fix build of tests
2023-11-18 23:45:02 +01:00
Martin Kroeker
df4cd7e82c
Merge pull request #4326 from bartoldeman/fix-casum-backup-kernel
...
Fix casum fallback kernel for x86_64
2023-11-18 19:06:06 +01:00
Bart Oldeman
f8ad5344c2
Fix casum fallback kernel.
...
This kernel is only used on Skylake+ if the kernel with AVX512
intrinsics can't be used, but used the variable x1 incorrectly
in the tail end of the loop, as it is still at the initial
value instead of where x points to.
This caused 55 "other error"s in the LAPACK tests
(https://github.com/OpenMathLib/OpenBLAS/issues/4282 )
This change makes casum.c as similar as possible to zasum.c,
because zasum.c does this correctly.
2023-11-17 23:53:56 +00:00
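The tail-loop problem described in the commit above can be illustrated with a minimal sketch. This is not the OpenBLAS kernel: the function name casum_sketch, the unroll factor of 4 and the indexing scheme are assumptions made for illustration. The point it shows is that the remainder loop has to continue from the position the unrolled loop reached, rather than reading from a pointer that still holds the initial address.

```c
#include <stddef.h>

/* Hypothetical helper: absolute value without depending on libm. */
static float abs_sketch(float v) { return v < 0.0f ? -v : v; }

/* Hypothetical sketch, not the OpenBLAS code: sum of |re| + |im| over n
   single-precision complex elements stored interleaved in x (2*n floats). */
static float casum_sketch(size_t n, const float *x)
{
    float sum = 0.0f;
    size_t i = 0;
    size_t n4 = n & ~(size_t)3;      /* elements covered by the unrolled loop */

    /* Unrolled main loop: four complex elements (eight floats) per pass. */
    for (; i < n4; i += 4) {
        const float *p = x + 2 * i;
        sum += abs_sketch(p[0]) + abs_sketch(p[1])
             + abs_sketch(p[2]) + abs_sketch(p[3])
             + abs_sketch(p[4]) + abs_sketch(p[5])
             + abs_sketch(p[6]) + abs_sketch(p[7]);
    }

    /* Tail loop: index from the current position i. Reading from a pointer
       that was never advanced would re-sum the first elements instead. */
    for (; i < n; i++) {
        sum += abs_sketch(x[2 * i]) + abs_sketch(x[2 * i + 1]);
    }
    return sum;
}
```

Any n that is not a multiple of the unroll factor exercises the tail loop, which is why a bug of this kind only shows up for some vector lengths.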
Martin Kroeker
cb2950709f
Merge pull request #4322 from martin-frbg/lapack891
...
Add truncated QR with pivoting (Reference-LAPACK PR 891)
2023-11-16 08:40:17 +01:00
Martin Kroeker
cc622f2406
restore OpenBLAS-specific target_link_libraries
2023-11-15 22:51:09 +01:00
Martin Kroeker
ee47e4e494
run m1/llvm/cmake build on all 4 cores
2023-11-15 15:21:32 +01:00
Martin Kroeker
8b2a956890
Implement truncated QR with pivot (Reference-LAPACK PR 891)
2023-11-15 14:20:12 +01:00
Martin Kroeker
20a2a83f49
Implement truncated QR with pivoting (Reference-LAPACK PR 891)
2023-11-15 12:18:15 +01:00
Martin Kroeker
f437339130
Implement truncated QR with pivoting (Reference-LAPACK PR 891)
2023-11-15 12:12:26 +01:00
Martin Kroeker
5bf87c86f5
Implement truncated QR with pivoting (Reference-LAPACK PR 891)
2023-11-15 12:10:20 +01:00
Martin Kroeker
0eb8a87977
Implement truncated QR with pivoting (Reference-LAPACK PR 891)
2023-11-15 09:56:37 +01:00
Martin Kroeker
387830b9d5
Implement truncated QR with pivoting (Reference-LAPACK PR 891)
2023-11-15 09:53:06 +01:00
Martin Kroeker
40109c0392
Implement truncated QR with pivoting (Reference-LAPACK PR 891)
2023-11-15 09:50:30 +01:00
Martin Kroeker
23cda457fb
Implement truncated QR with pivoting (Reference-LAPACK PR 891)
2023-11-15 09:48:23 +01:00
Martin Kroeker
d36b86a794
Merge pull request #4320 from ChipKerchner/fixOldGCCPower
...
Fix older versions of gcc - missing __has_builtin, cpuid and no support of P10.
2023-11-15 08:48:17 +01:00
Chip-Kerchner
d99aad8ee3
Fix older version of gcc - missing __has_builtin, cpuid and no support of P10.
2023-11-14 11:07:08 -06:00
Martin Kroeker
46440a0486
Merge pull request #4317 from OpenMathLib/release-0.3.0
...
Merge release 0.3.25 back into develop to copy tag
2023-11-12 23:09:47 +01:00
Martin Kroeker
f4cc1b7a6f
Update version to 0.3.25.dev
2023-11-12 23:07:19 +01:00
Martin Kroeker
dff686a86c
Update version to 0.3.25.dev
2023-11-12 23:06:46 +01:00
Martin Kroeker
5e1a429eab
Merge pull request #4316 from OpenMathLib/develop
...
Merge develop into release-0.3.0 for 0.3.25
2023-11-12 22:55:00 +01:00
Martin Kroeker
64c96716f7
Merge branch 'release-0.3.0' into develop
2023-11-12 22:54:42 +01:00
Martin Kroeker
0e54cbd18c
Update version to 0.3.25
2023-11-12 22:52:05 +01:00
Martin Kroeker
f1940010e4
Update version to 0.3.25
2023-11-12 22:51:26 +01:00
Martin Kroeker
a47ceda465
Merge pull request #4315 from martin-frbg/m3_cpufamily
...
Add OSX hw.cpufamily autodetection for Apple M3 as VORTEX
2023-11-12 22:49:58 +01:00
Martin Kroeker
e1f529d024
Add OSX hw.cpufamily value for Apple M3
2023-11-12 22:37:11 +01:00
Martin Kroeker
c245c12dc2
Update Changelog for 0.3.25 ( #4314 )
...
* Update Changelog.txt for 0.3.25
2023-11-12 22:17:39 +01:00
Martin Kroeker
fa615967cd
Merge pull request #4312 from martin-frbg/fixotherproto
...
Fix empty function prototypes
2023-11-12 21:10:27 +01:00
Martin Kroeker
9b5f8eb33a
Fix empty function prototypes
2023-11-12 19:35:53 +01:00
Martin Kroeker
ecaaece695
Merge pull request #4311 from martin-frbg/lapack930
...
Make vector orthogonalization more reliable (Reference-LAPACK PR 930)
2023-11-12 18:42:32 +01:00
Martin Kroeker
6f094c35ee
Merge pull request #4305 from rgommers/ci-limit-runs
...
Limit CI runs to pushes and pull requests on main repo
2023-11-12 18:39:27 +01:00
Martin Kroeker
3d38da2bc4
Make vector orthogonalization more reliable (Reference-LAPACK PR 930)
2023-11-12 16:50:52 +01:00
Martin Kroeker
d58c88cf42
Merge pull request #4310 from martin-frbg/lapack904
...
Apply rounding up to workspace calculations done with reals (Reference-LAPACK PR 904)
2023-11-12 16:45:10 +01:00
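The motivation for rounding up workspace sizes can be sketched as follows. A workspace size computed as an integer may be returned in a real-valued variable, and a large integer converted to single precision can round down, which would under-report the required size. The helper name roundup_lwork_sketch is hypothetical, and scaling by one plus machine epsilon is an assumption about how the rounding is done, not a copy of the Reference-LAPACK routine.

```c
#include <float.h>

/* Hypothetical sketch: convert an integer workspace size to float without
   ever reporting less than the requested size. */
static float roundup_lwork_sketch(long lwork)
{
    float w = (float)lwork;          /* may round down for large lwork */
    if ((long)w < lwork) {
        /* Assumed strategy: bump the value up by one unit of relative
           precision so it is no smaller than lwork. */
        w *= 1.0f + FLT_EPSILON;
    }
    return w;
}
```

Small sizes are represented exactly and pass through unchanged; only values beyond the exactly representable integer range of the floating-point type are adjusted.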
Martin Kroeker
feeb10435b
Merge pull request #4309 from martin-frbg/lapack926
...
Change ?GECON to return INFO=1 if RCOND is NaN (Reference-LAPACK PR 926)
2023-11-12 15:28:16 +01:00
Martin Kroeker
2ce67e2ada
Apply ROUNDUP_LWORK (Reference-LAPACK PR 904)
2023-11-12 14:42:52 +01:00
Martin Kroeker
f5664740cd
Apply ROUNDUP_LWORK (Reference-LAPACK PR 904)
2023-11-12 14:29:04 +01:00
Martin Kroeker
71fbdd908d
Apply ROUNDUP_LWORK (Reference-LAPACK PR 904)
2023-11-12 14:10:16 +01:00
Martin Kroeker
c9378badd9
Apply ROUNDUP_LWORK (Reference-LAPACK PR 904)
2023-11-12 13:56:06 +01:00
Martin Kroeker
225036fd92
Apply ROUNDUP_LWORK (Reference-LAPACK PR 904)
2023-11-12 13:43:22 +01:00
Martin Kroeker
eef4d15369
Merge pull request #4308 from martin-frbg/issue4277-2
...
Add workaround for omp_get_max_threads hanging on FreeBSD/LLVM14
2023-11-12 13:08:43 +01:00
Martin Kroeker
58427ff74d
Deprecate ?GELQS and ?GEQRS from TESTING/LIN (Reference-LAPACK PR 900) ( #4307 )
...
* Move ?GELQS and ?GEQRS from TESTING/LIN to DEPRECATED (Reference-LAPACK PR 900)
* Add f2c-converted versions of ?GELQS and ?GEQRS
2023-11-12 10:54:39 +01:00
Martin Kroeker
b6144f70ff
Change ?GECON to return INFO=1 if RCOND is NaN (Reference-LAPACK PR 926)
2023-11-11 23:41:18 +01:00
Martin Kroeker
00ef1bb58a
Merge pull request #4306 from angsch/develop
...
Improve matcopy interface
2023-11-11 23:19:10 +01:00
Martin Kroeker
9324520d0e
typo fix
2023-11-11 23:14:58 +01:00
Martin Kroeker
ff6437f2d7
Add workaround for omp_get_max_threads hanging on FreeBSD with libomp from LLVM14
2023-11-11 21:30:32 +01:00
Martin Kroeker
9c3c1cfbd6
Merge pull request #4304 from martin-frbg/issue4277
...
Move clang/gfortran OpenMP dependency rewriting out of f_check
2023-11-11 20:58:21 +01:00
Martin Kroeker
cad10a3caa
Merge pull request #4303 from martin-frbg/ryzen-avx512
...
Enable autodetection of Zen 3/4 cpus as their AVX512 Intel counterparts
2023-11-11 18:36:24 +01:00
Martin Kroeker
95ed8125fa
Merge pull request #4290 from martin-frbg/issue4275-2
...
Put more build information into Makefile.conf_last
2023-11-11 15:28:57 +01:00
Angelika Schwarz
5ffbe646e1
Improve matcopy interface
...
* rows = 0 or cols = 0 is now a legal input and
takes quick return path
* Follow BLAS/LAPACK convention that the leading
dimensions must be at least 1.
2023-11-11 11:16:10 +01:00
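The two rules listed in the commit above can be sketched as an argument check for a scaled out-of-place copy. This is an illustration only: the function name matcopy_sketch, the column-major no-transpose layout and the return codes are assumptions, not the OpenBLAS interface.

```c
#include <stddef.h>

/* Hypothetical sketch: B = alpha * A for a column-major rows x cols matrix.
   Returns 0 on success or a nonzero code naming the offending argument
   (the numbering here is illustrative). */
static int matcopy_sketch(long rows, long cols, float alpha,
                          const float *a, long lda,
                          float *b, long ldb)
{
    /* BLAS/LAPACK convention: leading dimension >= max(1, rows). */
    long min_ld = rows > 1 ? rows : 1;

    if (rows < 0)     return 1;
    if (cols < 0)     return 2;
    if (lda < min_ld) return 3;
    if (ldb < min_ld) return 4;

    /* An empty matrix is legal input: nothing to copy, return early. */
    if (rows == 0 || cols == 0) return 0;

    for (long j = 0; j < cols; j++)
        for (long i = 0; i < rows; i++)
            b[i + j * ldb] = alpha * a[i + j * lda];
    return 0;
}
```

Validating the leading dimensions before the quick return means that a call with rows = 0 and a leading dimension of 0 is still rejected, which matches the "at least 1" convention stated above.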