Martin Kroeker
dc4fcb48df
Fix inverted conditional for caxpy/zaxpy
2021-06-10 11:14:03 +02:00
Martin Kroeker
7a48247761
fix c/zrot and sgemv for POWER5
2021-06-10 11:11:56 +02:00
Martin Kroeker
7dfc45e840
Remove casts for PPC/POWER and complete parameters for POWER3/4
2021-06-10 11:09:50 +02:00
Arthur Williams
7fb6e576c2
Removed use of non-portable '-p' arg to install
...
Not all versions of install support the '-p' flag, and it isn't worth failing
the build if the installed files' timestamps get updated.
2021-06-09 20:50:36 -05:00
Rajalakshmi Srinivasaraghavan
cbb70438df
POWER10: Fixes for sbgemm kernel
...
While testing bfloat16 sbgemm kernel, there are some failures
for odd value inputs due to array access beyond the boundary.
2021-06-09 12:20:09 -05:00
Ma, Yu
706a08d4a0
Optimized sgemv_t for small N based on AVX512
2021-06-08 15:08:28 -04:00
Zhang Xianyi
9f3d903817
Merge pull request #3259 from zhaofengli/riscv64-fixes
...
riscv64 fixes
2021-06-08 16:26:56 +08:00
Zhaofeng Li
590be3fae3
riscv64: Add Makefile
2021-06-07 22:55:56 +00:00
Zhaofeng Li
3521cd48cb
RISCV64_GENERIC: Use generic kernel for DSDOT for better precision
...
The implementation in `riscv64/dot.c` fails the `test_dsdot` test, and
the generic kernel seems to have better precision. Tested on SiFive
FU740 (HiFive Unmatched) and QEMU.
Also see #1469.
2021-06-07 22:50:23 +00:00
Zhaofeng Li
1e0192a5cc
riscv64/imin: Fix wrong comparison
...
Same as #1990.
2021-06-07 22:49:39 +00:00
Martin Kroeker
fe9aff17fe
Merge pull request #3258 from martin-frbg/hbaction
...
revert "try to work around gcc update problems" in Homebrew workflow
2021-06-06 22:15:29 +02:00
Martin Kroeker
8c25b440a0
revert "try to work around gcc update problems"
...
...as homebrew has dropped at least gcc8 now
2021-06-06 19:17:36 +02:00
Martin Kroeker
f84197c1a7
Add shortcuts for (small) cases that do not need expensive buffer allocation
2021-05-29 22:28:00 +02:00
Martin Kroeker
734bd265a8
revert symv changes for now
2021-05-29 15:40:03 +02:00
Martin Kroeker
1217eb910d
Fix copy-paste errors in variables used
2021-05-28 09:38:48 +02:00
Martin Kroeker
d6d7a6685d
Add shortcuts for (small) cases that do not need expensive buffer allocation
2021-05-27 22:39:18 +02:00
Martin Kroeker
f0e7345fb8
Add shortcut for small-size gemv_n with increments of one
2021-05-26 22:02:34 +02:00
Martin Kroeker
42f048cf6c
Merge pull request #3249 from MikaelUrankar/develop
...
Fix typo
2021-05-26 15:26:30 +02:00
MikaelUrankar
4fbc0777f4
Fix typo
2021-05-26 12:14:57 +02:00
Martin Kroeker
d7472606d5
Merge pull request #3244 from martin-frbg/issue3237
...
Add fast path for small xSYR with INCX==1
2021-05-22 22:38:09 +02:00
Martin Kroeker
03297ff9f0
Add fast path for small xSYR with INCX==1
2021-05-22 20:41:18 +02:00
Martin Kroeker
2d8d0af0ea
Merge pull request #3243 from martin-frbg/lapack564
...
Fix spurious error exit test failures in the ?chktsqr tests (LAPACK564)
2021-05-22 19:25:56 +02:00
Martin Kroeker
5f677e782e
Merge pull request #3196 from guowangy/skylakex-gemm-batch-k
...
GEMM: skylake: improve the performance when m is small
2021-05-22 19:25:28 +02:00
Martin Kroeker
04c60cee5d
Merge pull request #3242 from martin-frbg/issue3239
...
Handle inadvertent use of DYNAMIC_ARCH=0
2021-05-22 19:24:46 +02:00
Martin Kroeker
3a53207cc9
Fix spurious error exit test failures in the ?chktsqr tests (LAPACK564)
2021-05-22 14:29:45 +02:00
Martin Kroeker
0e73d20629
Handle inadvertent use of DYNAMIC_ARCH=0
2021-05-22 14:23:49 +02:00
Martin Kroeker
02087a62e7
Merge pull request #3205 from intelmy/sgemv_n_opt
...
optimize on sgemv_n for small n
2021-05-17 17:49:01 +02:00
Martin Kroeker
03b4d79a7e
Merge pull request #3238 from martin-frbg/lapack555
...
Correct function name in error message from SLASQ2 (LAPACK PR555)
2021-05-17 17:32:23 +02:00
Martin Kroeker
5c729c6dce
Correct function name in error message from SLASQ2 (Reference-LAPACK PR 555)
2021-05-17 14:47:14 +02:00
Martin Kroeker
e1911b2e60
Merge pull request #3236 from martin-frbg/issue3234
...
Add -lm for FreeBSD on ARM/ARM64
2021-05-16 17:17:18 +02:00
Martin Kroeker
8f33da4f94
Merge pull request #3235 from dnoan/develop
...
Update Makefile.arm64
2021-05-16 17:15:45 +02:00
Martin Kroeker
26ccf643a3
Add -lm for FreeBSD on ARM/ARM64
2021-05-16 13:04:38 +02:00
Noan
32264ba496
Update Makefile.arm64
...
Added -march and -mtune flags for EMAG processors when GCC 9 or later
2021-05-16 09:49:13 +00:00
Martin Kroeker
4ecf631f95
Merge pull request #3228 from martin-frbg/issue3226
...
filter out -mavx flag on Sandybridge zgemm/ztrmm kernels
2021-05-15 09:06:12 +02:00
Martin Kroeker
5af510081d
Merge pull request #3233 from martin-frbg/issue3230
...
Add autodetection for Intel Ice Lake SP
2021-05-15 01:04:09 +02:00
Martin Kroeker
164551d5a2
Merge pull request #3232 from martin-frbg/lapack553
...
Reduce stack size requirements in the LAPACK LIN tests (LAPACK PR 553)
2021-05-14 23:28:45 +02:00
Martin Kroeker
310b76aad7
Merge pull request #3231 from martin-frbg/issue3227
...
Support compilation with pre-C99 versions of MSVC
2021-05-14 23:28:06 +02:00
Martin Kroeker
c4da892ba0
Only filter out -mavx on Sandybridge ZGEMM/ZTRMM kernels
2021-05-14 23:19:10 +02:00
Martin Kroeker
cbfd3c87e1
Recognize Intel Ice Lake SP as Cooper Lake
2021-05-14 20:44:06 +02:00
Martin Kroeker
26e87ac517
Support Intel Ice Lake SP as Cooper Lake
2021-05-14 20:39:55 +02:00
Martin Kroeker
15b9d6b4a7
Delete zchkaa.f
2021-05-14 19:55:31 +02:00
Martin Kroeker
f7bcd962c1
Delete schkaa.f
2021-05-14 19:54:54 +02:00
Martin Kroeker
93cc066921
Delete dchkaa.f
2021-05-14 19:54:13 +02:00
Martin Kroeker
2c7d4a7766
Delete cchkaa.f
2021-05-14 19:53:38 +02:00
Martin Kroeker
eef1c42f03
Convert ?chkaa to use dynamic allocation for the larger arrays
2021-05-14 19:53:03 +02:00
Martin Kroeker
73f637e584
Support compilation with pre-C99 versions of MSVC
2021-05-14 15:08:12 +02:00
Martin Kroeker
8b90e5f202
Drop redundant inclusion of complex.h
2021-05-14 15:06:44 +02:00
Martin Kroeker
bd60fb6ffc
filter out -mavx flag on zgemm kernels as it can cause problems with older gcc
2021-05-13 23:05:00 +02:00
Martin Kroeker
37ea8702ee
Merge pull request #3192 from damonyu1989/develop
...
Update the intrinsic API to the official name.
2021-05-11 16:00:45 +02:00
Martin Kroeker
ec7d6c02bc
Add an Android crossbuild on OSX to Azure CI (#3224)
...
* Add an Android crossbuild on OSX
2021-05-10 08:02:01 +02:00