Martin Kroeker
|
874df65491
|
Fix incorrect sgemv results for IBM z14
part of PR #1993 that was inadvertently misplaced into the toplevel directory
|
2019-02-01 12:58:59 +01:00 |
Martin Kroeker
|
877023e1e1
|
Fix precision of zarch DSDOT
from patch provided by aarnez in #991
|
2019-01-31 21:22:26 +01:00 |
Martin Kroeker
|
265142edd5
|
Fix typo in the zarch min/max kernels
from patch provided by aarnez in #991
|
2019-01-31 21:21:40 +01:00 |
maamountki
|
29416cb5a3
|
[ZARCH] Add Z13 version for max/min functions
|
2019-01-31 19:11:11 +02:00 |
maamountki
|
48b9b94f7f
|
[ZARCH] Improve loading performance for camax/icamax
|
2019-01-31 18:52:11 +02:00 |
maamountki
|
fcd814a8d2
|
[ZARCH] Fix bug in max/min functions
|
2019-01-29 17:59:38 +02:00 |
maamountki
|
dc4d3bccd5
|
[ZARCH] Fix icamax/icamin
|
2019-01-29 03:47:49 +02:00 |
maamountki
|
c7143c1019
|
[ZARCH] Fix iamax/imax single precision
|
2019-01-28 17:52:23 +02:00 |
maamountki
|
04873bb174
|
[ZARCH] Undo the last commit
|
2019-01-28 17:32:24 +02:00 |
maamountki
|
c8ef9fb220
|
[ZARCH] Fix bug in iamax/iamin/imax/imin
|
2019-01-28 17:16:18 +02:00 |
maamountki
|
b111829226
|
[ZARCH] Update max/min functions
|
2019-01-21 15:56:04 +02:00 |
maamountki
|
b815a04c87
|
[ZARCH] fix a bug in max/min functions
|
2019-01-15 21:04:22 +02:00 |
maamountki
|
1a7925b3a3
|
[ZARCH] Update dgemv_n_4.c
|
2019-01-11 17:43:11 +02:00 |
maamountki
|
406f835f00
|
[ZARCH] update cgemv_n_4.c
|
2019-01-11 17:39:17 +02:00 |
maamountki
|
621dedb37b
|
[ZARCH] Update cgemv_t_4.c
|
2019-01-11 17:37:11 +02:00 |
maamountki
|
b731e8246f
|
Update sgemv_t_4.c
|
2019-01-11 17:14:04 +02:00 |
maamountki
|
ecc31b743f
|
Update dgemv_t_4.c
|
2019-01-11 17:13:02 +02:00 |
maamountki
|
5d89d6b143
|
[ZARCH] fix sgemv_n_4.c
|
2019-01-11 17:08:24 +02:00 |
maamountki
|
67432b23c2
|
[ZARCH] fix cgemv_n_4.c
|
2019-01-11 16:44:46 +02:00 |
maamountki
|
be66f5d5c2
|
[ZARCH] fix data prefetch type in sdot
|
2019-01-09 16:50:07 +02:00 |
maamountki
|
c2ffef8156
|
[ZARCH] fix data prefetch type in ddot
|
2019-01-09 16:49:44 +02:00 |
maamountki
|
e7455f500c
|
[ZARCH] fix dsdot.c
|
2019-01-09 16:33:54 +02:00 |
maamountki
|
3eafcfa650
|
[ZARCH] fix cgemv_n_4.c
|
2019-01-09 07:43:45 +02:00 |
maamountki
|
94cd946b96
|
[ZARCH] fix cgemv_n_4.c
|
2019-01-04 17:45:56 +02:00 |
maamountki
|
1aa840a0a2
|
[ZARCH] fix sgemv_t_4.c
|
2019-01-04 01:38:18 +02:00 |
maamountki
|
e6c0e39492
|
Optimize Zgemv
|
2018-08-13 12:23:40 +03:00 |
maamountki
|
23229011db
|
[ZARCH] Z14 support, BLAS 1/2 single precision implementations, Some missing double precision implementations, Gemv optimization
|
2018-08-06 18:20:40 +03:00 |
Martin Kroeker
|
c7b55b6082
|
Merge pull request #1499 from quickwritereader/develop
Implemented missing vsx simd kernels for power8 blas1/2 double. z13 modifications
|
2018-03-27 21:43:23 +02:00 |
QWR QWR
|
28ca97015d
|
power8:Added initial zgemv_(t|n) ,i(d|z)amax,i(d|z)amin,dgemv_t(transposed),zrot
z13: improved zgemv_(t|n)_4,zscal,zaxpy
|
2018-03-27 14:54:41 +00:00 |
Martin Kroeker
|
22167170b3
|
Merge pull request #1477 from quickwritereader/develop
Power8 blas3 copy-pack routines
|
2018-02-28 18:46:54 +01:00 |
Martin Kroeker
|
58f236ad73
|
Use generic/dot.c for DSDOT on zarch
|
2018-02-25 19:52:14 +01:00 |
Martin Kroeker
|
e207107150
|
Use generic/dot.c for DSDOT on z13
The implementation in arm/dot.c has lower precision, as shown by the utest for dsdot.
|
2018-02-25 19:51:25 +01:00 |
the mslm
|
c5425daa6b
|
power8 ?gemm_tcopy save/restore
|
2018-02-16 23:36:46 +00:00 |
Abdelrauf
|
60596a1abc
|
Merge branch 'develop' into develop
|
2018-01-31 16:17:04 -08:00 |
Abdelrauf
|
afd514c25d
|
small fix inside ifdef z13mvc (z13mvc code is not used in production)
|
2018-01-31 18:30:59 -05:00 |
Martin Kroeker
|
f45776ec1f
|
Merge pull request #1440 from quickwritereader/develop
small corrections
|
2018-01-31 23:48:47 +01:00 |
Abdelrauf
|
f653e7a18d
|
small fix
small fix inside ifdef z13mvc (z13mvc code is not used in production)
|
2018-01-31 07:49:38 -08:00 |
the mslm
|
f946a89432
|
zscal (case: real alpha=0) mikrokernel shift & mem fix, da_i as input reg. Small typo fixes
|
2018-01-26 19:25:27 -08:00 |
Martin Kroeker
|
e4c71a799a
|
Merge pull request #1426 from quickwritereader/develop
(Z13 ) Blas1 mikrokernels can be inlined by gcc. Refactoring,fixes,tunings
|
2018-01-20 17:34:54 +01:00 |
the mslm
|
2619ad7ea5
|
Blas1 mikrokernels can be inlined by gcc. Refactoring (symbolic operand names). Some fixes and tunings
|
2018-01-19 19:24:35 -08:00 |
Martin Kroeker
|
3d23f45107
|
Merge pull request #1415 from quickwritereader/develop
(Z systems Z13) small fixes, some (i(dz)amin,i(dz)amax,(dz)dot,(dz)asum) mikrokernels…
|
2018-01-11 08:35:02 +01:00 |
Abdelrauf
|
87669d1c0a
|
small fixes, some (i(dz)amin,i(dz)amax,(dz)dot,(dz)asum) mikrokernels can be inlined
|
2018-01-10 20:36:53 -05:00 |
Andrew
|
7e9b29b9b8
|
fix spurious compiler warning (no code change)
|
2017-11-24 18:36:37 +01:00 |
Abdurrauf
|
1cfdb2295d
|
Optimized standard Blas Level-1,2 (excluding nrm2 functions) for z13 (double precision)
|
2017-09-06 16:41:08 +04:00 |
Abdurrauf
|
08786c4b95
|
strmm and ctrmm
|
2017-03-13 01:23:16 +04:00 |
Abdurrauf
|
82e80fa82b
|
initial strmm(sgemm). not tuned yet
|
2017-03-06 04:27:40 +04:00 |
Abdurrauf
|
e831d6924e
|
changed to conventional register save area
|
2017-03-01 03:13:21 +04:00 |
Abdurrauf
|
848cb27b1e
|
ztrmm kernel.
|
2017-02-26 06:14:12 +04:00 |
Abdurrauf
|
6418667818
|
dtrmm and dgemm for z13
|
2017-01-04 19:32:33 +04:00 |
Zhang Xianyi
|
dd43661cfd
|
Init IBM z system (s390x) porting.
|
2016-04-15 18:02:24 -04:00 |