Commit Graph

7452 Commits

Author SHA1 Message Date
Martin Kroeker 3be660c000
Add interface declarations for ?potri 2021-06-26 23:44:56 +02:00
Martin Kroeker 1a8b6134c2
Merge pull request #3278 from brada4/A55
Add CORTEXA55 cpuid 0xd05 support
2021-06-23 13:05:17 +02:00
Martin Kroeker f0b822a709
Update cpuid_arm64.c 2021-06-23 10:11:01 +02:00
User User-User 130327e9af OK 2021-06-22 23:58:59 +02:00
User User-User 750719528a bugz 2021-06-20 16:40:43 +02:00
User User-User 91e2b11d3c add to cmake listings too 2021-06-20 15:32:42 +02:00
User User-User 548aa522e5 remove misplaced file 2021-06-20 15:29:25 +02:00
User User-User 6423b282a1 dynamic_arch 2021-06-20 14:19:41 +02:00
User User-User 9335d42740 add gcc8 version matching 2021-06-19 22:21:39 +02:00
User User-User 39ef0880ae copy conf 2021-06-19 21:49:58 +02:00
User User-User b7da75e4fd WiP CORTEX A55 support 2021-06-19 21:37:51 +02:00
Martin Kroeker a7627c5afd
Merge pull request #3276 from martin-frbg/issue3274
Add workaround for another macro name collision with Windows 10 SDK winnt.h
2021-06-16 16:37:30 +02:00
Martin Kroeker 9499ab0d45
Merge pull request #3275 from martin-frbg/lapack580
Fix missing EXTERNAL declarations in LAPACK TESTING (LAPACK PR 580)
2021-06-16 13:41:38 +02:00
Martin Kroeker 307c4c0786
Fix typo 2021-06-16 13:41:16 +02:00
Martin Kroeker e83df93975
Work around another recent macro name collision with winnt.h 2021-06-16 12:32:34 +02:00
Martin Kroeker 13fa9f737d
Modify defines for CR and RC to work around name collision on Windows 2021-06-16 12:17:25 +02:00
Martin Kroeker 5958ffc9b6
Declare DZASUM as EXTERNAL 2021-06-16 09:43:39 +02:00
Martin Kroeker cd0e4aadb1
Declare ZDROT as EXTERNAL 2021-06-16 09:41:18 +02:00
Martin Kroeker e2621ef93a
Declare SROT as EXTERNAL 2021-06-16 09:40:15 +02:00
Martin Kroeker 9e1b43ea9b
Declare DROT as EXTERNAL 2021-06-16 09:39:28 +02:00
Martin Kroeker 5269348178
Declare CSROT as EXTERNAL 2021-06-16 09:35:12 +02:00
Martin Kroeker 92e024bbb3
Declare SCASUM as EXTERNAL 2021-06-16 09:33:23 +02:00
Martin Kroeker c4b464cac6
Merge pull request #3273 from austinpagan/sbgemm_gcc10_fix
Power10: Fix for SBGEMM
2021-06-15 22:58:48 +02:00
Gordon Fossum e6dd44d989 Power10: Fix for SBGEMM
While testing bfloat16 sbgemm kernel, there are some failures for odd value inputs due to updating result for
additional bytes.
2021-06-15 13:07:47 -05:00
Martin Kroeker baf03a0937
Merge pull request #3252 from martin-frbg/more_shortcuts
Further shortcuts for (small) cases that do not need buffer allocation
2021-06-15 16:14:20 +02:00
Martin Kroeker 7aab5e826c
Merge pull request #3250 from martin-frbg/gemv-shortcut
Add shortcut for small-size S/D GEMV_N with increments of one
2021-06-15 14:50:14 +02:00
Martin Kroeker 29417adf4c
Merge pull request #3270 from ggouaillardet/topic/dznrm2_tx2
arm64: add the missing d9 register to the clobber list
2021-06-14 13:00:33 +02:00
Gilles Gouaillardet 9d292d37b2 arm64: add the missing d9 register to the clobber list
Refs. numpy/numpy#18422

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2021-06-14 17:01:28 +09:00
Martin Kroeker 2e8ff4a781
Merge pull request #3266 from martin-frbg/powerparam
Remove spurious casts from PPC parameters and fix compilation for older targets
2021-06-10 18:05:47 +02:00
Martin Kroeker dbba381dc3
Merge pull request #3260 from intelmy/sgemv_t_opt
Optimized sgemv_t for small N based on AVX512
2021-06-10 16:08:24 +02:00
Martin Kroeker f61991d439
Merge pull request #3264 from RajalakshmiSR/sbgemmp10
POWER10: Fixes for sbgemm kernel
2021-06-10 16:07:47 +02:00
Martin Kroeker efdbdd8f82
Add prefetch values for power3 2021-06-10 11:20:29 +02:00
Martin Kroeker 3906ef3b0f
Add prefetch values for power3 2021-06-10 11:19:40 +02:00
Martin Kroeker 8adf0971d8
Add prefetch values for power3 2021-06-10 11:18:22 +02:00
Martin Kroeker 08e2e60762
Add prefetch values for power3 2021-06-10 11:17:33 +02:00
Martin Kroeker fb9e678235
Fix caxpy/zaxpy for big-endian 2021-06-10 11:15:48 +02:00
Martin Kroeker dc4fcb48df
Fix inverted conditional for caxpy/zaxpy 2021-06-10 11:14:03 +02:00
Martin Kroeker 7a48247761
fix c/zrot and sgemv for POWER5 2021-06-10 11:11:56 +02:00
Martin Kroeker 7dfc45e840
Remove casts for PPC/POWER and complete parameters for POWER3/4 2021-06-10 11:09:50 +02:00
Arthur Williams 7fb6e576c2 Removed use of non portable '-p' arg to install
Not all versions of install support '-p' flag and it isn't worth failing
the build if the installed files' timestamps get updated.
2021-06-09 20:50:36 -05:00
Rajalakshmi Srinivasaraghavan cbb70438df POWER10: Fixes for sbgemm kernel
While testing bfloat16 sbgemm kernel, there are some failures
for odd value inputs due to array access beyond the boundary.
2021-06-09 12:20:09 -05:00
Ma, Yu 706a08d4a0 Optimized sgemv_t for small N based on AVX512 2021-06-08 15:08:28 -04:00
Zhang Xianyi 9f3d903817
Merge pull request #3259 from zhaofengli/riscv64-fixes
riscv64 fixes
2021-06-08 16:26:56 +08:00
Zhaofeng Li 590be3fae3 riscv64: Add Makefile 2021-06-07 22:55:56 +00:00
Zhaofeng Li 3521cd48cb RISCV64_GENERIC: Use generic kernel for DSDOT for better precision
The implementation in `riscv64/dot.c` fails the `test_dsdot` test, and
the generic kernel seems to have better precision. Tested on SiFive
FU740 (HiFive Unmatched) and QEMU.

Also see #1469.
2021-06-07 22:50:23 +00:00
Zhaofeng Li 1e0192a5cc riscv64/imin: Fix wrong comparison
Same as #1990.
2021-06-07 22:49:39 +00:00
Martin Kroeker fe9aff17fe
Merge pull request #3258 from martin-frbg/hbaction
revert "try to work around gcc update problems" in Homebrew workflow
2021-06-06 22:15:29 +02:00
Martin Kroeker 8c25b440a0
revert "try to work around gcc update problems"
...as homebrew has dropped at least gcc8 now
2021-06-06 19:17:36 +02:00
Martin Kroeker f84197c1a7
Add shortcuts for (small) cases that do not need expensive buffer allocation 2021-05-29 22:28:00 +02:00
Martin Kroeker 734bd265a8
revert symv changes for now 2021-05-29 15:40:03 +02:00