60 Commits

Author SHA1 Message Date
Andreas Arnez
d117dfd505 Change bad usage of "asum" to "sum" in ZARCH versions of ?sum
The ZARCH implementations of ?sum contain a cut-and-paste error: an inline
assembly argument is named "sum", but the assembly references "asum"
instead. The mismatch causes a build error. This commit fixes it.
2019-11-21 13:49:13 +01:00
Martin Kroeker
246ca29679 Add ZARCH implementation of ?sum
as trivial copies of the respective ?asum kernels with the ABS and vflpsb calls removed
2019-03-30 22:49:05 +01:00
maamountki
0a54c98b9d [ZARCH] Modify constraints 2019-02-13 21:06:25 +02:00
maamountki
bec54ae366 [ZARCH] Fix caxpy 2019-02-13 12:54:35 +02:00
maamountki
f583674109 [ZARCH] Fix cgemv_t_4 2019-02-12 13:12:28 +02:00
maamountki
77fe70019f [ZARCH] Fix constraints and source code formatting 2019-02-11 16:01:13 +02:00
maamountki
7039770165 [ZARCH] Undo the last commit 2019-02-06 20:11:44 +02:00
maamountki
11a43e8116 [ZARCH] Set alignment hint for vl/vst 2019-02-05 19:17:08 +02:00
maamountki
61526480f9 [ZARCH] Fix copy constraint 2019-02-05 07:51:19 +02:00
maamountki
81daf6bc38 [ZARCH] Format source code, Fix constraints 2019-02-05 07:30:38 +02:00
Martin Kroeker
874df65491 Fix incorrect sgemv results for IBM z14
part of PR #1993 that was inadvertently misplaced into the toplevel directory
2019-02-01 12:58:59 +01:00
Martin Kroeker
877023e1e1 Fix precision of zarch DSDOT
from patch provided by aarnez in #991
2019-01-31 21:22:26 +01:00
Martin Kroeker
265142edd5 Fix typo in the zarch min/max kernels
from patch provided by aarnez in #991
2019-01-31 21:21:40 +01:00
maamountki
29416cb5a3 [ZARCH] Add Z13 version for max/min functions 2019-01-31 19:11:11 +02:00
maamountki
48b9b94f7f [ZARCH] Improve loading performance for camax/icamax 2019-01-31 18:52:11 +02:00
maamountki
fcd814a8d2 [ZARCH] Fix bug in max/min functions 2019-01-29 17:59:38 +02:00
maamountki
dc4d3bccd5 [ZARCH] Fix icamax/icamin 2019-01-29 03:47:49 +02:00
maamountki
c7143c1019 [ZARCH] Fix iamax/imax single precision 2019-01-28 17:52:23 +02:00
maamountki
04873bb174 [ZARCH] Undo the last commit 2019-01-28 17:32:24 +02:00
maamountki
c8ef9fb220 [ZARCH] Fix bug in iamax/iamin/imax/imin 2019-01-28 17:16:18 +02:00
maamountki
b111829226 [ZARCH] Update max/min functions 2019-01-21 15:56:04 +02:00
maamountki
b815a04c87 [ZARCH] fix a bug in max/min functions 2019-01-15 21:04:22 +02:00
maamountki
1a7925b3a3 [ZARCH] Update dgemv_n_4.c 2019-01-11 17:43:11 +02:00
maamountki
406f835f00 [ZARCH] update cgemv_n_4.c 2019-01-11 17:39:17 +02:00
maamountki
621dedb37b [ZARCH] Update cgemv_t_4.c 2019-01-11 17:37:11 +02:00
maamountki
b731e8246f Update sgemv_t_4.c 2019-01-11 17:14:04 +02:00
maamountki
ecc31b743f Update dgemv_t_4.c 2019-01-11 17:13:02 +02:00
maamountki
5d89d6b143 [ZARCH] fix sgemv_n_4.c 2019-01-11 17:08:24 +02:00
maamountki
67432b23c2 [ZARCH] fix cgemv_n_4.c 2019-01-11 16:44:46 +02:00
maamountki
be66f5d5c2 [ZARCH] fix data prefetch type in sdot 2019-01-09 16:50:07 +02:00
maamountki
c2ffef8156 [ZARCH] fix data prefetch type in ddot 2019-01-09 16:49:44 +02:00
maamountki
e7455f500c [ZARCH] fix dsdot.c 2019-01-09 16:33:54 +02:00
maamountki
3eafcfa650 [ZARCH] fix cgemv_n_4.c 2019-01-09 07:43:45 +02:00
maamountki
94cd946b96 [ZARCH] fix cgemv_n_4.c 2019-01-04 17:45:56 +02:00
maamountki
1aa840a0a2 [ZARCH] fix sgemv_t_4.c 2019-01-04 01:38:18 +02:00
maamountki
e6c0e39492 Optimize Zgemv 2018-08-13 12:23:40 +03:00
maamountki
23229011db [ZARCH] Z14 support, BLAS 1/2 single precision implementations, Some missing double precision implementations, Gemv optimization 2018-08-06 18:20:40 +03:00
Martin Kroeker
c7b55b6082 Merge pull request #1499 from quickwritereader/develop
Implemented missing vsx simd kernels for power8 blas1/2 double. z13 modifications
2018-03-27 21:43:23 +02:00
QWR QWR
28ca97015d power8:Added initial zgemv_(t|n) ,i(d|z)amax,i(d|z)amin,dgemv_t(transposed),zrot
z13: improved zgemv_(t|n)_4,zscal,zaxpy
2018-03-27 14:54:41 +00:00
Martin Kroeker
22167170b3 Merge pull request #1477 from quickwritereader/develop
Power8 blas3 copy-pack routines
2018-02-28 18:46:54 +01:00
Martin Kroeker
58f236ad73 Use generic/dot.c for DSDOT on zarch 2018-02-25 19:52:14 +01:00
Martin Kroeker
e207107150 Use generic/dot.c for DSDOT on z13
The implementation in arm/dot.c has lower precision, as shown by the utest for dsdot.
2018-02-25 19:51:25 +01:00
the mslm
c5425daa6b power8 ?gemm_tcopy save/restore 2018-02-16 23:36:46 +00:00
Abdelrauf
60596a1abc Merge branch 'develop' into develop 2018-01-31 16:17:04 -08:00
Abdelrauf
afd514c25d small fix inside ifdef z13mvc . (z13mvc code is not used in production) 2018-01-31 18:30:59 -05:00
Martin Kroeker
f45776ec1f Merge pull request #1440 from quickwritereader/develop
small corrections
2018-01-31 23:48:47 +01:00
Abdelrauf
f653e7a18d small fix
small fix inside ifdef z13mvc . (z13mvc code is not used in production)
2018-01-31 07:49:38 -08:00
the mslm
f946a89432 zscal (case: real alpha=0) mikrokernel shift&mem fix, da_i as input reg. small typo fixes 2018-01-26 19:25:27 -08:00
Martin Kroeker
e4c71a799a Merge pull request #1426 from quickwritereader/develop
(Z13 ) Blas1 mikrokernels can be inlined by gcc. Refactoring,fixes,tunings
2018-01-20 17:34:54 +01:00
the mslm
2619ad7ea5 Blas1 mikrokernels can be inlined by gcc. Refactoring (symbolic operand
names). Some fixes and tunings
2018-01-19 19:24:35 -08:00