Martin Kroeker
|
246ca29679
|
Add ZARCH implementation of ?sum
as trivial copies of the respective ?asum kernels with the ABS and vflpsb calls removed
|
2019-03-30 22:49:05 +01:00 |
maamountki
|
0a54c98b9d
|
[ZARCH] Modify constraints
|
2019-02-13 21:06:25 +02:00 |
maamountki
|
bec54ae366
|
[ZARCH] Fix caxpy
|
2019-02-13 12:54:35 +02:00 |
maamountki
|
f583674109
|
[ZARCH] Fix cgemv_t_4
|
2019-02-12 13:12:28 +02:00 |
maamountki
|
77fe70019f
|
[ZARCH] Fix constraints and source code formatting
|
2019-02-11 16:01:13 +02:00 |
maamountki
|
7039770165
|
[ZARCH] Undo the last commit
|
2019-02-06 20:11:44 +02:00 |
maamountki
|
11a43e8116
|
[ZARCH] Set alignment hint for vl/vst
|
2019-02-05 19:17:08 +02:00 |
maamountki
|
61526480f9
|
[ZARCH] Fix copy constraint
|
2019-02-05 07:51:19 +02:00 |
maamountki
|
81daf6bc38
|
[ZARCH] Format source code, Fix constraints
|
2019-02-05 07:30:38 +02:00 |
Martin Kroeker
|
874df65491
|
Fix incorrect sgemv results for IBM z14
part of PR #1993 that was inadvertently misplaced into the toplevel directory
|
2019-02-01 12:58:59 +01:00 |
Martin Kroeker
|
877023e1e1
|
Fix precision of zarch DSDOT
from patch provided by aarnez in #991
|
2019-01-31 21:22:26 +01:00 |
Martin Kroeker
|
265142edd5
|
Fix typo in the zarch min/max kernels
from patch provided by aarnez in #991
|
2019-01-31 21:21:40 +01:00 |
maamountki
|
29416cb5a3
|
[ZARCH] Add Z13 version for max/min functions
|
2019-01-31 19:11:11 +02:00 |
maamountki
|
48b9b94f7f
|
[ZARCH] Improve loading performance for camax/icamax
|
2019-01-31 18:52:11 +02:00 |
maamountki
|
fcd814a8d2
|
[ZARCH] Fix bug in max/min functions
|
2019-01-29 17:59:38 +02:00 |
maamountki
|
dc4d3bccd5
|
[ZARCH] Fix icamax/icamin
|
2019-01-29 03:47:49 +02:00 |
maamountki
|
c7143c1019
|
[ZARCH] Fix iamax/imax single precision
|
2019-01-28 17:52:23 +02:00 |
maamountki
|
04873bb174
|
[ZARCH] Undo the last commit
|
2019-01-28 17:32:24 +02:00 |
maamountki
|
c8ef9fb220
|
[ZARCH] Fix bug in iamax/iamin/imax/imin
|
2019-01-28 17:16:18 +02:00 |
maamountki
|
b111829226
|
[ZARCH] Update max/min functions
|
2019-01-21 15:56:04 +02:00 |
maamountki
|
b815a04c87
|
[ZARCH] fix a bug in max/min functions
|
2019-01-15 21:04:22 +02:00 |
maamountki
|
1a7925b3a3
|
[ZARCH] Update dgemv_n_4.c
|
2019-01-11 17:43:11 +02:00 |
maamountki
|
406f835f00
|
[ZARCH] update cgemv_n_4.c
|
2019-01-11 17:39:17 +02:00 |
maamountki
|
621dedb37b
|
[ZARCH] Update cgemv_t_4.c
|
2019-01-11 17:37:11 +02:00 |
maamountki
|
b731e8246f
|
Update sgemv_t_4.c
|
2019-01-11 17:14:04 +02:00 |
maamountki
|
ecc31b743f
|
Update dgemv_t_4.c
|
2019-01-11 17:13:02 +02:00 |
maamountki
|
5d89d6b143
|
[ZARCH] fix sgemv_n_4.c
|
2019-01-11 17:08:24 +02:00 |
maamountki
|
67432b23c2
|
[ZARCH] fix cgemv_n_4.c
|
2019-01-11 16:44:46 +02:00 |
maamountki
|
be66f5d5c2
|
[ZARCH] fix data prefetch type in sdot
|
2019-01-09 16:50:07 +02:00 |
maamountki
|
c2ffef8156
|
[ZARCH] fix data prefetch type in ddot
|
2019-01-09 16:49:44 +02:00 |
maamountki
|
e7455f500c
|
[ZARCH] fix dsdot.c
|
2019-01-09 16:33:54 +02:00 |
maamountki
|
3eafcfa650
|
[ZARCH] fix cgemv_n_4.c
|
2019-01-09 07:43:45 +02:00 |
maamountki
|
94cd946b96
|
[ZARCH] fix cgemv_n_4.c
|
2019-01-04 17:45:56 +02:00 |
maamountki
|
1aa840a0a2
|
[ZARCH] fix sgemv_t_4.c
|
2019-01-04 01:38:18 +02:00 |
maamountki
|
e6c0e39492
|
Optimize Zgemv
|
2018-08-13 12:23:40 +03:00 |
maamountki
|
23229011db
|
[ZARCH] Z14 support, BLAS 1/2 single precision implementations, Some missing double precision implementations, Gemv optimization
|
2018-08-06 18:20:40 +03:00 |
Martin Kroeker
|
c7b55b6082
|
Merge pull request #1499 from quickwritereader/develop
Implemented missing vsx simd kernels for power8 blas1/2 double. z13 modifications
|
2018-03-27 21:43:23 +02:00 |
QWR QWR
|
28ca97015d
|
power8: Added initial zgemv_(t|n), i(d|z)amax, i(d|z)amin, dgemv_t (transposed), zrot
z13: improved zgemv_(t|n)_4, zscal, zaxpy
|
2018-03-27 14:54:41 +00:00 |
Martin Kroeker
|
22167170b3
|
Merge pull request #1477 from quickwritereader/develop
Power8 blas3 copy-pack routines
|
2018-02-28 18:46:54 +01:00 |
Martin Kroeker
|
58f236ad73
|
Use generic/dot.c for DSDOT on zarch
|
2018-02-25 19:52:14 +01:00 |
Martin Kroeker
|
e207107150
|
Use generic/dot.c for DSDOT on z13
The implementation in arm/dot.c has lower precision, as shown by the utest for dsdot.
|
2018-02-25 19:51:25 +01:00 |
the mslm
|
c5425daa6b
|
power8 ?gemm_tcopy save/restore
|
2018-02-16 23:36:46 +00:00 |
Abdelrauf
|
60596a1abc
|
Merge branch 'develop' into develop
|
2018-01-31 16:17:04 -08:00 |
Abdelrauf
|
afd514c25d
|
Small fix inside ifdef z13mvc (z13mvc code is not used in production)
|
2018-01-31 18:30:59 -05:00 |
Martin Kroeker
|
f45776ec1f
|
Merge pull request #1440 from quickwritereader/develop
small corrections
|
2018-01-31 23:48:47 +01:00 |
Abdelrauf
|
f653e7a18d
|
small fix
Small fix inside ifdef z13mvc (z13mvc code is not used in production)
|
2018-01-31 07:49:38 -08:00 |
the mslm
|
f946a89432
|
zscal (case: real alpha=0) microkernel shift & mem fix, da_i as input reg. Small typo fixes
|
2018-01-26 19:25:27 -08:00 |
Martin Kroeker
|
e4c71a799a
|
Merge pull request #1426 from quickwritereader/develop
(Z13) Blas1 microkernels can be inlined by gcc. Refactoring, fixes, tunings
|
2018-01-20 17:34:54 +01:00 |
the mslm
|
2619ad7ea5
|
Blas1 microkernels can be inlined by gcc. Refactoring (symbolic operand names). Some fixes and tunings
|
2018-01-19 19:24:35 -08:00 |
Martin Kroeker
|
3d23f45107
|
Merge pull request #1415 from quickwritereader/develop
(Z systems Z13) small fixes, some (i(dz)amin, i(dz)amax, (dz)dot, (dz)asum) microkernels…
|
2018-01-11 08:35:02 +01:00 |