Commit Graph

6065 Commits

Author SHA1 Message Date
Martin Kroeker dc4fcb48df
Fix inverted conditional for caxpy/zaxpy 2021-06-10 11:14:03 +02:00
Martin Kroeker 7a48247761
fix c/zrot and sgemv for POWER5 2021-06-10 11:11:56 +02:00
Martin Kroeker 7dfc45e840
Remove casts for PPC/POWER and complete parameters for POWER3/4 2021-06-10 11:09:50 +02:00
Arthur Williams 7fb6e576c2 Removed use of non portable '-p' arg to install
Not all versions of install support the '-p' flag, and it isn't worth failing
the build if the installed files' timestamps get updated.
2021-06-09 20:50:36 -05:00
Rajalakshmi Srinivasaraghavan cbb70438df POWER10: Fixes for sbgemm kernel
While testing the bfloat16 sbgemm kernel, there are some failures
for odd-valued inputs due to array accesses beyond the boundary.
2021-06-09 12:20:09 -05:00
Ma, Yu 706a08d4a0 Optimized sgemv_t for small N based on AVX512 2021-06-08 15:08:28 -04:00
Zhang Xianyi 9f3d903817
Merge pull request #3259 from zhaofengli/riscv64-fixes
riscv64 fixes
2021-06-08 16:26:56 +08:00
Zhaofeng Li 590be3fae3 riscv64: Add Makefile 2021-06-07 22:55:56 +00:00
Zhaofeng Li 3521cd48cb RISCV64_GENERIC: Use generic kernel for DSDOT for better precision
The implementation in `riscv64/dot.c` fails the `test_dsdot` test, and
the generic kernel seems to have better precision. Tested on SiFive
FU740 (HiFive Unmatched) and QEMU.

Also see #1469.
2021-06-07 22:50:23 +00:00
Zhaofeng Li 1e0192a5cc riscv64/imin: Fix wrong comparison
Same as #1990.
2021-06-07 22:49:39 +00:00
Martin Kroeker fe9aff17fe
Merge pull request #3258 from martin-frbg/hbaction
revert "try to work around gcc update problems" in Homebrew workflow
2021-06-06 22:15:29 +02:00
Martin Kroeker 8c25b440a0
revert "try to work around gcc update problems"
...as homebrew has dropped at least gcc8 now
2021-06-06 19:17:36 +02:00
Martin Kroeker f84197c1a7
Add shortcuts for (small) cases that do not need expensive buffer allocation 2021-05-29 22:28:00 +02:00
Martin Kroeker 734bd265a8
revert symv changes for now 2021-05-29 15:40:03 +02:00
Martin Kroeker 1217eb910d
Fix copy-paste errors in variables used 2021-05-28 09:38:48 +02:00
Martin Kroeker d6d7a6685d
Add shortcuts for (small) cases that do not need expensive buffer allocation 2021-05-27 22:39:18 +02:00
Martin Kroeker f0e7345fb8
Add shortcut for small-size gemv_n with increments of one 2021-05-26 22:02:34 +02:00
Martin Kroeker 42f048cf6c
Merge pull request #3249 from MikaelUrankar/develop
Fix typo
2021-05-26 15:26:30 +02:00
MikaelUrankar 4fbc0777f4 Fix typo 2021-05-26 12:14:57 +02:00
Martin Kroeker d7472606d5
Merge pull request #3244 from martin-frbg/issue3237
Add fast path for small xSYR with INCX==1
2021-05-22 22:38:09 +02:00
Martin Kroeker 03297ff9f0
Add fast path for small xSYR with INCX==1 2021-05-22 20:41:18 +02:00
Martin Kroeker 2d8d0af0ea
Merge pull request #3243 from martin-frbg/lapack564
Fix spurious error exit test failures in the ?chktsqr tests (LAPACK564)
2021-05-22 19:25:56 +02:00
Martin Kroeker 5f677e782e
Merge pull request #3196 from guowangy/skylakex-gemm-batch-k
GEMM: skylake: improve the performance when m is small
2021-05-22 19:25:28 +02:00
Martin Kroeker 04c60cee5d
Merge pull request #3242 from martin-frbg/issue3239
Handle inadvertent use of DYNAMIC_ARCH=0
2021-05-22 19:24:46 +02:00
Martin Kroeker 3a53207cc9
Fix spurious error exit test failures in the ?chktsqr tests (LAPACK564) 2021-05-22 14:29:45 +02:00
Martin Kroeker 0e73d20629
Handle inadvertent use of DYNAMIC_ARCH=0 2021-05-22 14:23:49 +02:00
Martin Kroeker 02087a62e7
Merge pull request #3205 from intelmy/sgemv_n_opt
optimize on sgemv_n for small n
2021-05-17 17:49:01 +02:00
Martin Kroeker 03b4d79a7e
Merge pull request #3238 from martin-frbg/lapack555
Correct function name in error message from SLASQ2 (LAPACK PR555)
2021-05-17 17:32:23 +02:00
Martin Kroeker 5c729c6dce
Correct function name in error message from SLASQ2 (Reference-LAPACK PR 555) 2021-05-17 14:47:14 +02:00
Martin Kroeker e1911b2e60
Merge pull request #3236 from martin-frbg/issue3234
Add -lm for FreeBSD on ARM/ARM64
2021-05-16 17:17:18 +02:00
Martin Kroeker 8f33da4f94
Merge pull request #3235 from dnoan/develop
Update Makefile.arm64
2021-05-16 17:15:45 +02:00
Martin Kroeker 26ccf643a3
Add -lm for FreeBSD on ARM/ARM64 2021-05-16 13:04:38 +02:00
Noan 32264ba496
Update Makefile.arm64
Added -march and -mtune flags for EMAG processors when using GCC 9 or later
2021-05-16 09:49:13 +00:00
Martin Kroeker 4ecf631f95
Merge pull request #3228 from martin-frbg/issue3226
filter out -mavx flag on Sandybridge zgemm/ztrmm kernels
2021-05-15 09:06:12 +02:00
Martin Kroeker 5af510081d
Merge pull request #3233 from martin-frbg/issue3230
Add autodetection for Intel Ice Lake SP
2021-05-15 01:04:09 +02:00
Martin Kroeker 164551d5a2
Merge pull request #3232 from martin-frbg/lapack553
Reduce stack size requirements in the LAPACK LIN tests (LAPACK PR 553)
2021-05-14 23:28:45 +02:00
Martin Kroeker 310b76aad7
Merge pull request #3231 from martin-frbg/issue3227
Support compilation with pre-C99 versions of MSVC
2021-05-14 23:28:06 +02:00
Martin Kroeker c4da892ba0
Only filter out -mavx on Sandybridge ZGEMM/ZTRMM kernels 2021-05-14 23:19:10 +02:00
Martin Kroeker cbfd3c87e1
Recognize Intel Ice Lake SP as Cooper Lake 2021-05-14 20:44:06 +02:00
Martin Kroeker 26e87ac517
Support Intel Ice Lake SP as Cooper Lake 2021-05-14 20:39:55 +02:00
Martin Kroeker 15b9d6b4a7
Delete zchkaa.f 2021-05-14 19:55:31 +02:00
Martin Kroeker f7bcd962c1
Delete schkaa.f 2021-05-14 19:54:54 +02:00
Martin Kroeker 93cc066921
Delete dchkaa.f 2021-05-14 19:54:13 +02:00
Martin Kroeker 2c7d4a7766
Delete cchkaa.f 2021-05-14 19:53:38 +02:00
Martin Kroeker eef1c42f03
Convert ?chkaa to use dynamic allocation for the larger arrays 2021-05-14 19:53:03 +02:00
Martin Kroeker 73f637e584
Support compilation with pre-C99 versions of MSVC 2021-05-14 15:08:12 +02:00
Martin Kroeker 8b90e5f202
Drop redundant inclusion of complex.h 2021-05-14 15:06:44 +02:00
Martin Kroeker bd60fb6ffc
filter out -mavx flag on zgemm kernels as it can cause problems with older gcc 2021-05-13 23:05:00 +02:00
Martin Kroeker 37ea8702ee
Merge pull request #3192 from damonyu1989/develop
Update the intrinsic api to the official name.
2021-05-11 16:00:45 +02:00
Martin Kroeker ec7d6c02bc
Add an Android crossbuild on OSX to Azure CI (#3224)
* Add an Android crossbuild on OSX
2021-05-10 08:02:01 +02:00