Commit Graph

6820 Commits

Author SHA1 Message Date
Martin Kroeker
b6a4ef98b9 Merge pull request #3845 from Mousius/asimd-dot-opt
Remove unnecessary instructions from Advanced SIMD dot
2022-11-30 21:07:30 +01:00
Chris Sidebottom
4f7b77e08a Remove unnecessary instructions from Advanced SIMD dot
The existing kernel was issuing extra instructions to organise the arguments into the same registers they would usually be in and similarly to put the result into the appropriate register.

This has an impact on smaller sized dots and seemed like a quick fix.
2022-11-25 16:19:03 +00:00
Martin Kroeker
e9a911fb9f Merge pull request #3841 from martin-frbg/lapack755+764
Fix SLATRS3 and CLATRS3 tests in TESTING/LIN (Reference-LAPACK PRs 755+764)
2022-11-23 22:38:06 +01:00
Martin Kroeker
bf0e8d67b5 Merge pull request #3840 from martin-frbg/lapack760
Fix typo in EIG tests and spurious return in lapacke_?tz_trans utility (Reference-LAPACK PR760)
2022-11-23 19:16:25 +01:00
Martin Kroeker
a5470521ee Fix array indexation in copy, and fix test (Reference-LAPACK PR764) 2022-11-23 15:31:25 +01:00
Martin Kroeker
b0393ea4e1 Fix test (Reference-LAPACK PR764) 2022-11-23 15:27:46 +01:00
Martin Kroeker
0d26f1a4c7 Fix wrong indexation in test (Reference-LAPACK PR755) 2022-11-23 15:22:27 +01:00
Martin Kroeker
19fd2d7f00 Use LSAME for character comparison (Reference-LAPACK PR755) 2022-11-23 15:19:07 +01:00
Martin Kroeker
663bf68dbd Merge pull request #3839 from martin-frbg/lapack758
Fix array dimension in complex SYL01 test (Reference-LAPACK PR758)
2022-11-23 14:57:56 +01:00
Martin Kroeker
c2ba4e6249 Remove unnecessary return in void function call (Reference-LAPACK PR760) 2022-11-23 10:43:34 +01:00
Martin Kroeker
74962c7f53 Remove unnecessary return in void function call (Reference-LAPACK PR760) 2022-11-23 10:42:29 +01:00
Martin Kroeker
d952cbf7bc Remove unnecessary return in void function call (Reference-LAPACK PR760) 2022-11-23 10:41:50 +01:00
Martin Kroeker
7694ff495f Remove unnecessary return in void function call (Reference-LAPACK PR760) 2022-11-23 10:40:59 +01:00
Martin Kroeker
825ae316e2 Fix typo in EXTERNAL (Reference-LAPACK PR760) 2022-11-23 10:36:10 +01:00
Martin Kroeker
730ed549e6 Fix typo in EXTERNAL (Reference-LAPACK PR760) 2022-11-23 10:35:23 +01:00
Martin Kroeker
bc3393f703 Fix array dimension (Reference-LAPACK 758) 2022-11-23 10:31:18 +01:00
Martin Kroeker
0b2f8dabbf Fix array dimension (Reference-LAPACK 758) 2022-11-23 10:30:35 +01:00
Martin Kroeker
b4c9228441 Merge pull request #3838 from martin-frbg/lapa311
Update the version number of the included LAPACK to 3.11.0
2022-11-22 17:39:51 +01:00
Martin Kroeker
e6e2a63650 Update LAPACK version number to 3.11.0 2022-11-22 14:02:21 +01:00
Martin Kroeker
8408357bab Update LAPACK version number to 3.11.0 2022-11-22 14:01:48 +01:00
Martin Kroeker
ba8fb8b4b2 Merge pull request #3837 from martin-frbg/lapack655+697
Improve convergence of LAPACK ?LAED4 and fix a bug in DORCSD2BY1 (Reference-LAPACK PRs 655+697)
2022-11-22 13:51:57 +01:00
Martin Kroeker
cabf9453e2 Merge pull request #3836 from martin-frbg/lapack665+735
Fix documentation of LAPACK functions ?TPRFB and IEEECK (Reference-LAPACK PRs 665+735)
2022-11-22 09:25:24 +01:00
Martin Kroeker
d321357558 Fix bug in DORCSD2BY1 (from Reference-LAPACK PR697) 2022-11-21 21:19:44 +01:00
Martin Kroeker
afcd7e88b6 Improve convergence of DLAED4/SLAED4 (Reference-LAPACK PR655) 2022-11-21 21:18:39 +01:00
Martin Kroeker
f8f2bebf11 Fix function documentation for LAPACK ?TPRFB (Reference-LAPACK PR665) 2022-11-21 20:01:47 +01:00
Martin Kroeker
c45edcb537 Fix typo in comment (Reference-LAPACK PR735) 2022-11-21 19:59:33 +01:00
Martin Kroeker
880a3fb20f Merge pull request #3835 from martin-frbg/lapack217
Simplify ?SYSWAPR and fix its documentation (Reference-LAPACK 217)
2022-11-21 19:56:28 +01:00
Martin Kroeker
50aba02910 Simplify ?SYSWAPR and fix its documentation (Reference-LAPACK 217) 2022-11-21 18:00:31 +01:00
Martin Kroeker
0b68dd6a9b Merge pull request #3834 from martin-frbg/lapack631
Use new algorithms for computing Givens rotations (Reference-LAPACK PR631)
2022-11-21 08:30:14 +01:00
Martin Kroeker
9343499256 Merge pull request #3833 from martin-frbg/lapack712+747
Set scale early in ?LATBS/?LATRS and fix documentation of ?LASCL2 (Reference-LAPACK PRs 712+747)
2022-11-21 08:29:49 +01:00
Martin Kroeker
7ae4269add Use new algorithms for computing Givens rotations (Reference-LAPACK PR631) 2022-11-20 22:52:28 +01:00
Martin Kroeker
e00f0fb26a Fix function documentation (Reference-LAPACK PR747) 2022-11-20 22:46:58 +01:00
Martin Kroeker
31d2145988 Set scale early for robust triangular solvers (Reference-LAPACK PR712) 2022-11-20 22:44:36 +01:00
Martin Kroeker
1d5a3aff0d Merge pull request #3832 from martin-frbg/lapack681+698
Improve ?LAQR5 and use normwise criterion in ?LAQZ0 (Reference-LAPACK PRs 681+698)
2022-11-20 22:40:52 +01:00
Martin Kroeker
c6816bb576 Use normwise criterion in multishift QZ (Reference-LAPACK PR698) 2022-11-20 19:39:12 +01:00
Martin Kroeker
6f09e4c121 Improve FMA usage in ?LAQR5 (Reference-LAPACK PR681) 2022-11-20 19:37:28 +01:00
Martin Kroeker
f63c93274c Merge pull request #3831 from martin-frbg/lapack647+697+702
Fix code and documentation for ?SORBDB?/?CUNBDB? (Reference-LAPACK PRs 647+697+702)
2022-11-20 19:34:41 +01:00
Martin Kroeker
aaea0804bc Fix function documentation (Reference-LAPACK PR697) 2022-11-20 16:38:57 +01:00
Martin Kroeker
b946820502 Fix uninitialized variable (Reference-LAPACK PR647) 2022-11-20 16:36:19 +01:00
Martin Kroeker
9e29312c83 Fix type precision and function documentation (Reference-LAPACK PRs 647+702) 2022-11-20 16:34:45 +01:00
Martin Kroeker
b1102fe250 Merge pull request #3830 from martin-frbg/lapack691+698
Add quick return in ?LASCL; use normwise criterion for INF in QZ; fix workspace calcn for ?SYEVD (Reference-LAPACK PRs 674+691+698)
2022-11-20 16:29:46 +01:00
Martin Kroeker
3f31b69121 Add quick return if scaling with one (Reference-LAPACK PR674) 2022-11-20 13:30:25 +01:00
Martin Kroeker
60af35bfab Fix workspace query for ?SYEVD and ?HEEVD (Reference-LAPACK PR691) 2022-11-20 13:25:21 +01:00
Martin Kroeker
eea1636380 Use normwise criterion for INF eigenvalues in QZ (Reference-LAPACK PR698) 2022-11-20 13:22:55 +01:00
Martin Kroeker
1714d640f1 Merge pull request #3829 from martin-frbg/lapack684+739
Cast workspace sizes for ?GELSS and add new ?GELST functions (Reference-LAPACK PRs 684+739)
2022-11-20 13:06:51 +01:00
Martin Kroeker
88cd91c490 Fix stray character 2022-11-19 23:15:20 +01:00
Martin Kroeker
f157d6d671 Add C equivalents of ?GELST (for Reference-LAPACK PR739) 2022-11-19 22:50:57 +01:00
Martin Kroeker
5ff46f4092 Add ?GELST (Reference-LAPACK PR739) 2022-11-19 22:49:31 +01:00
Martin Kroeker
1d32ce5135 Add ?GELST (Reference-LAPACK PR739) 2022-11-19 22:42:50 +01:00
Martin Kroeker
1497336b20 Add tests for ?GELST (Reference-LAPACK PR739) 2022-11-19 22:39:16 +01:00