Martin Kroeker
673e5a0495
Replace several POWER8/9 C kernels with their gcc7-generated assembly versions ( #2263 )
...
* Add gcc7-generated assembly files for POWER8/9 isa/ica-min/max and POWER9 caxpy
To work around internal compiler errors encountered when compiling the original C source with gcc 4 and 5, and wrong code generated by gcc 8.3.0
* Use gcc-generated assembly instead of original C sources
to work around internal compiler errors encountered with gcc 4.8/5.4 and wrong code generation by gcc 8.3
* Use gcc-generated assembly instead of the original C source
to work around internal compiler errors encountered with gcc 4.8 and 5.4, and wrong code generation by gcc 8.3
* Add gcc7-generated assembler version of caxpy for power8
to work around wrong code generated by gcc 8.3
* Handle CONJ define for caxpyc
* Handle CONJ define for caxpyc
* Add gcc7-generated assembly cdot for POWER9
* Use prebuilt assembly for POWER9 cdot
created with gcc 7.3.1 to work around ICE in older gcc versions
* Exclude POWER9 from DYNAMIC_ARCH when gcc versions is lower than 6
* Update Makefile.system
* Use PROLOGUE macro to ensure correct function name for DYNAMIC_ARCH
* Disable POWER9 with old gcc versions
2019-09-22 22:35:22 +02:00
Rashmica Gupta
bcdf1d4917
Add in runtime CPU detection for POWER.
2019-04-09 14:20:16 +10:00
Ubuntu
4abc375a91
sgemv cgemv pairs
2019-02-01 13:45:00 +00:00
Ubuntu
8c3386be87
Added missing Blas1 single fp {saxpy, caxpy, cdot, crot(refactored version of srot),isamax ,isamin, icamax, icamin},
...
Fixed idamin,icamin choosing the first occurance index of equal minimals
2019-01-16 15:16:21 +00:00
QWR QWR
28ca97015d
power8:Added initial zgemv_(t|n) ,i(d|z)amax,i(d|z)amin,dgemv_t(transposed),zrot
...
z13: improved zgemv_(t|n)_4,zscal,zaxpy
2018-03-27 14:54:41 +00:00
martin
7a4b3cfbf8
Add trivially optimized DSDOT for POWER8
2017-11-28 18:38:07 +01:00
Zhang Xianyi
515bc56ea9
Refs #946 . Use nrm2 reference implementation for Power8.
2016-08-18 18:59:43 -07:00
Zhang Xianyi
ae70b916f4
Refs #929 . Deal with zero and NaNs for scale.
2016-08-18 10:24:42 -07:00
Werner Saar
8fb5a1aaff
added optimized dtrsm_LT kernel for POWER8
2016-05-22 13:09:05 +02:00
Werner Saar
56948dbf0f
optimized dgemm for POWER8
2016-04-29 12:52:47 +02:00
Werner Saar
0d0c6f7d7d
optimized dgemm for POWER8
2016-04-27 14:01:08 +02:00
Werner Saar
a3da10662f
added sgemm_tcopy_8_power8.S
2016-04-23 10:04:41 +02:00
Werner Saar
d46f07bb4e
added cgemm_tcopy_8_power8.S
2016-04-23 07:37:18 +02:00
Werner Saar
879a51165f
Optimized zgemm and tested zgemm again
2016-04-22 13:07:12 +02:00
Werner Saar
9276c9012f
Optimized sgemm and dgemm and tested again.
2016-04-21 11:37:57 +02:00
Werner Saar
3c6294ca3d
added optimized sgemm_tcopy for power8
2016-04-19 16:08:54 +02:00
Werner Saar
68a69c5b50
added optimized dgemv_n kernel for POWER8
2016-03-30 11:10:53 +02:00
Werner Saar
c2464a7c4a
added optimized casum kernel for POWER8
2016-03-28 14:12:08 +02:00
Werner Saar
294f933869
added optimized zasum kernel for POWER8
2016-03-28 13:37:32 +02:00
Werner Saar
f59c9bd6ef
added optimized sasum kernel for POWER8
2016-03-28 12:44:25 +02:00
Werner Saar
c53be46d78
added optimized dasum kernel for POWER8
2016-03-28 12:17:15 +02:00
Werner Saar
659ed16591
added otimized cswap and zswap kernels for POWER8
2016-03-27 18:31:37 +02:00
Werner Saar
35c98a3556
added optimized zscal kernel for POWER8
2016-03-27 16:31:50 +02:00
Werner Saar
f1a5dd06c5
added optimized sscal kernel for POWER8
2016-03-27 11:05:56 +02:00
Werner Saar
35f1f21a7f
added drot- and srot-kernel optimimized for POWER8
2016-03-27 08:57:11 +02:00
Werner Saar
3d9a50e841
added optimized sswap kernel for POWER8
2016-03-25 17:34:55 +01:00
Werner Saar
828c849b44
added optimized ccopy kernel for POWER8
2016-03-25 16:54:25 +01:00
Werner Saar
ecc0bc9813
added optimized scopy kernel for POWER8
2016-03-25 16:06:56 +01:00
Werner Saar
12f209b7b0
added optimized zswap kernel for POWER8
2016-03-25 15:27:34 +01:00
Werner Saar
7316a87930
added optimized dswap kernel for POWER8
2016-03-25 14:35:43 +01:00
Werner Saar
0bff057a87
added optimized dcopy kernel for POWER8
2016-03-25 13:03:02 +01:00
Werner Saar
1e6cf9808c
added optimized dscal kernel for POWER8
2016-03-25 09:42:08 +01:00
Werner Saar
55eda3813b
added optimized zaxpy kernel for POWER8
2016-03-23 11:20:23 +01:00
Werner Saar
0664ba4c97
added optimized daxpy kernel for POWER8
2016-03-22 14:50:03 +01:00
Werner Saar
11c44dede1
added optimized sdot kernel for POWER8
2016-03-21 13:18:23 +01:00
Werner Saar
9e4584d069
added optimized zdot kernel for POWER8
2016-03-21 10:12:07 +01:00
Werner Saar
84b92e6373
added optimized ddot kernel for POWER8
2016-03-20 11:06:06 +01:00
Werner Saar
5c658f8746
add optimized cgemm- and ctrmm-kernel for POWER8
2016-03-18 08:17:25 +01:00
Werner Saar
dcd15b546c
BUGFIX: KERNEL.POWER8
2016-03-14 14:36:59 +01:00
Werner Saar
96284ab295
added sgemm- and strmm-kernel for POWER8
2016-03-14 13:52:44 +01:00
Werner Saar
91e1c5080c
modified configuration, to use power6 sgemm kernel for power8
2016-03-04 13:38:57 +01:00
Werner Saar
73f04c2c72
enabled hemv assemly function for power8
2016-03-04 13:20:50 +01:00
Werner Saar
3e633152c6
enabled symv assembly kernels on power8
2016-03-04 13:08:18 +01:00
Werner Saar
d5130ce7e3
enabled gemv assembly on power8
2016-03-04 12:53:31 +01:00
Werner Saar
4824b88fcb
enabled all level1 assembly kernels for power8
2016-03-04 12:35:25 +01:00
Werner Saar
b752858d6c
added dgemm-, dtrmm-, zgemm- and ztrmm-kernel for power8
2016-03-01 07:33:56 +01:00
Zhang Xianyi
3e8d6ea74f
Init POWER8 kernels by POWER6.
2015-11-03 12:34:23 +08:00