Martin Kroeker | 6b6c9b1441 | Merge pull request #2172 from quickwritereader/develop: power9 cgemm/ctrmm. new sgemm 8x16 | 2019-07-01 21:06:02 +02:00
AbdelRauf | a97b301aaa | cgemm/ctrmm power9 | 2019-07-01 14:07:54 +00:00
Martin Kroeker | a17cf36225 | Merge pull request #2153 from quickwritereader/develop: improved power9 zgemm,sgemm | 2019-06-06 07:42:56 +02:00
AbdelRauf | 148c4cc5fd | conflict resolve | 2019-06-05 20:50:50 +00:00
AbdelRauf | d0c3543c3f | power9 zgemm ztrmm optimized | 2019-06-05 20:07:16 +00:00
AbdelRauf | a469b32cf4 | sgemm pipeline improved, zgemm rewritten without inner packs, ABI lxvx v20 fixed with vs52 | 2019-06-04 07:11:30 +00:00
AbdelRauf | 8fe794f059 | improved zgemm power9 based on power8 | 2019-05-30 15:31:25 +00:00
Martin Kroeker | 3f427c0cf9 | Merge pull request #2107 from quickwritereader/develop: sgemm/strmm kernel for power9 | 2019-05-02 07:56:57 +02:00
AbdelRauf | 47f892198c | conflict resolve | 2019-05-01 19:36:22 +00:00
AbdelRauf | 0f105dd8a5 | sgemm/strmm | 2019-04-29 08:49:50 +00:00
Rashmica Gupta | bcdf1d4917 | Add in runtime CPU detection for POWER. | 2019-04-09 14:20:16 +10:00
AbdelRauf | 853a18bc17 | power9 makefile. dgemm based on power8 kernel with following changes : 32x unrolled 16x4 kernel and 8x4 kernel using (lxv stxv butterfly rank1 update). improvement from 17 to 22-23gflops. dtrmm cases were added into dgemm itself | 2019-03-29 15:49:40 +00:00