Martin Kroeker
d2093a40d3
Merge pull request #2277 from martin-frbg/issue2275
...
Rewrite ARMV8 code to allow cross-compilation for IOS
2019-10-06 23:01:54 +02:00
Martin Kroeker
56837e9d92
Make local labels in macro compatible with the xcode assembler
...
... which does not perform the automatic numbering on instantiation that the _@ suffix signifies
2019-10-04 14:53:23 +02:00
Martin Kroeker
5e244d80f2
Merge pull request #2271 from quickwritereader/strmm_fix
...
Fixed bug in power9 strmm. BLAS-TESTER passes
2019-09-29 13:53:45 +02:00
AbdelRauf
ede5efebab
trmm fix
2019-09-29 02:28:34 +00:00
Martin Kroeker
596a22325a
Fix prologue of power9 assembly cdot(c) kernel to provide cdotc
2019-09-27 00:47:18 +02:00
Martin Kroeker
7f58f3ad0e
Fix mis-edits in the gcc-derived power8 caxpy kernel
2019-09-27 00:44:26 +02:00
Martin Kroeker
673e5a0495
Replace several POWER8/9 C kernels with their gcc7-generated assembly versions ( #2263 )
...
* Add gcc7-generated assembly files for POWER8/9 isa/ica-min/max and POWER9 caxpy
To work around internal compiler errors encountered when compiling the original C source with gcc 4 and 5, and wrong code generated by gcc 8.3.0
* Use gcc-generated assembly instead of original C sources
to work around internal compiler errors encountered with gcc 4.8/5.4 and wrong code generation by gcc 8.3
* Use gcc-generated assembly instead of the original C source
to work around internal compiler errors encountered with gcc 4.8 and 5.4, and wrong code generation by gcc 8.3
* Add gcc7-generated assembler version of caxpy for power8
to work around wrong code generated by gcc 8.3
* Handle CONJ define for caxpyc
* Add gcc7-generated assembly cdot for POWER9
* Use prebuilt assembly for POWER9 cdot
created with gcc 7.3.1 to work around ICE in older gcc versions
* Exclude POWER9 from DYNAMIC_ARCH when the gcc version is lower than 6
* Update Makefile.system
* Use PROLOGUE macro to ensure correct function name for DYNAMIC_ARCH
* Disable POWER9 with old gcc versions
2019-09-22 22:35:22 +02:00
Martin Kroeker
e7c4d6705a
Revert #2051 and replace with a better fix ( #2261 )
...
* Revert #2051 and add a better fix for TARGET=generic with DYNAMIC_ARCH
fixes #2257 without breaking #2048 again
2019-09-17 18:56:04 +02:00
Martin Kroeker
f3c314550c
Merge pull request #2243 from quickwritereader/develop
...
possible cgemv,caxpy,cdot fix
2019-08-30 23:06:23 +02:00
AbdelRauf
847c20c9b7
fix uninitialized variable i
2019-08-30 11:14:55 +00:00
AbdelRauf
4c22828812
caxpy and cdot are using vec_vsx_ld
2019-08-30 04:09:15 +00:00
AbdelRauf
e79712d969
cgemv using vec_vsx_ld instead of letting gcc decide
2019-08-30 02:52:04 +00:00
AbdelRauf
be09551cdf
aligned
2019-08-29 23:22:23 +00:00
Martin Kroeker
11c59acfb1
Keep both PGI/SUN and default code paths to avoid breaking Clang/Windows
2019-08-28 18:07:44 +02:00
Martin Kroeker
3a55dca2dc
Make x86_64 zdot compile with PGI and Sun C again
...
broken by #2222 as CREAL,CIMAG do not expand to a valid lvalue with these compilers
2019-08-28 11:35:31 +02:00
Kavana Bhat
3dc6b26eff
AIX changes for Power8
2019-08-20 06:51:35 -05:00
Martin Kroeker
9ef96b32a6
Add multithreading support to the x86_64 zdot kernel ( #2222 )
...
* Add multithreading support
copied from the ThunderX2T99 kernel. For #2221
2019-08-15 22:09:12 +02:00
Martin Kroeker
103b32fdb7
Merge pull request #2216 from martin-frbg/issue2214
...
Remove case-sensitivity in x86 LSAME on (AMD) cpus without CMOV
2019-08-13 13:59:33 +02:00
Martin Kroeker
aef9804089
Fix unwanted case-sensitivity in x86 LSAME for (AMD) processors without CMOV
...
The problem was already noticed some years ago in #238, but back then it was only corrected in one of the #ifdef branches.
Fixes #2214
2019-08-13 10:19:10 +02:00
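For context on the LSAME fix above: LSAME is the BLAS helper that reports whether two option characters are the same letter regardless of case. The following is a minimal portable C sketch of that behaviour, not the x86 assembly that the commit touches; the name `lsame_sketch` is hypothetical, and the use of `toupper` assumes the default C locale.

```c
#include <ctype.h>

/* Hypothetical illustration: returns nonzero when ca and cb are the
 * same letter, ignoring case. Not the OpenBLAS implementation. */
static int lsame_sketch(char ca, char cb)
{
    /* Cast to unsigned char first: passing a negative char value to
     * toupper() is undefined behaviour. */
    return toupper((unsigned char)ca) == toupper((unsigned char)cb);
}
```

A case-sensitive comparison would treat `'n'` and `'N'` as different, which is the kind of behaviour the commit describes as unwanted.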
Martin Kroeker
dccff2e785
Merge pull request #2206 from martin-frbg/zen-dtrmm
...
Replace vpermpd with vpermilpd in the Haswell DTRMM kernel
2019-08-09 07:55:20 +02:00
Martin Kroeker
5c3458a6e7
Merge pull request #2199 from martin-frbg/zen-dtrsm
...
Replace most vpermpd calls in the Haswell DTRSM_RN kernel
2019-08-09 07:55:02 +02:00
Martin Kroeker
acf6002ab2
Replace most vpermpd calls in the Haswell DTRSM_RN kernel
2019-08-03 12:40:13 +02:00
Martin Kroeker
2dfb804cb9
Replace vpermpd with vpermilpd in the Haswell DTRMM kernel
...
to improve performance on AMD Zen (#2180), applying wjc404's improvement of the DGEMM kernel from #2186
2019-07-28 23:17:28 +02:00
Martin Kroeker
4c153ec9da
Merge pull request #2196 from wjc404/develop
...
Add vbroadcastsd kernel to dgemm_kernel_4x8_haswell.S
2019-07-28 23:11:40 +02:00
wjc404
7eecd8e39c
Add files via upload
2019-07-28 07:39:09 +08:00
Martin Kroeker
7b0b7c11d2
Merge pull request #2190 from martin-frbg/zdot-zen
...
Replace vpermpd with vpermilpd in the Haswell/Zen zdot microkernel
2019-07-23 16:15:08 +02:00
Martin Kroeker
28e96458e5
Replace vpermpd with vpermilpd
...
to improve performance on Zen/Zen2 (as demonstrated by wjc404 in #2180 )
2019-07-22 08:28:16 +02:00
wjc404
95fb98f556
Update dgemm_kernel_4x8_haswell.S
2019-07-21 01:10:32 +08:00
wjc404
4801c6d36b
Update dgemm_kernel_4x8_haswell.S
2019-07-21 00:47:45 +08:00
wjc404
9440fa607d
Add files via upload
2019-07-20 22:08:22 +08:00
wjc404
94db259e5b
Add files via upload
2019-07-20 22:04:41 +08:00
wjc404
f49f8047ac
Add files via upload
2019-07-20 14:33:37 +08:00
wjc404
825777faab
Update dgemm_kernel_4x8_haswell.S
2019-07-19 23:58:24 +08:00
wjc404
9c89757562
Add files via upload
2019-07-19 23:47:58 +08:00
wjc404
9b04baeaee
Update dgemm_kernel_4x8_haswell.S
2019-07-17 23:50:03 +08:00
wjc404
8a074b3965
Update dgemm_kernel_4x8_haswell.S
2019-07-17 23:47:30 +08:00
wjc404
211ab03b14
Update dgemm_kernel_4x8_haswell.S
2019-07-17 22:39:15 +08:00
wjc404
1733f927e6
Update dgemm_kernel_4x8_haswell.S
2019-07-17 21:27:41 +08:00
wjc404
182b06d6ad
Update dgemm_kernel_4x8_haswell.S
2019-07-17 17:02:35 +08:00
wjc404
7a9050d681
Update dgemm_kernel_4x8_haswell.S
2019-07-17 00:55:06 +08:00
wjc404
0ba29fd262
Update dgemm_kernel_4x8_haswell.S for zen2
...
replaced a bunch of vpermpd instructions with vpermilpd and vperm2f128
2019-07-17 00:46:51 +08:00
Martin Kroeker
6b6c9b1441
Merge pull request #2172 from quickwritereader/develop
...
power9 cgemm/ctrmm. new sgemm 8x16
2019-07-01 21:06:02 +02:00
AbdelRauf
a97b301aaa
cgemm/ctrmm power9
2019-07-01 14:07:54 +00:00
Piotr Kubaj
eebfeba768
Fix build on FreeBSD/powerpc64.
...
Signed-off-by: Piotr Kubaj <pkubaj@anongoth.pl>
2019-06-25 10:58:56 +02:00
kavanabhat
a575f1e4c7
Update dtrmm_kernel_16x4_power8.S
2019-06-19 15:27:14 +05:30
AbdelRauf
cdbfb891da
new sgemm 8x16
2019-06-17 15:33:38 +00:00
Martin Kroeker
a17cf36225
Merge pull request #2153 from quickwritereader/develop
...
improved power9 zgemm,sgemm
2019-06-06 07:42:56 +02:00
AbdelRauf
148c4cc5fd
conflict resolve
2019-06-05 20:50:50 +00:00
AbdelRauf
d0c3543c3f
power9 zgemm ztrmm optimized
2019-06-05 20:07:16 +00:00
AbdelRauf
a469b32cf4
sgemm pipeline improved, zgemm rewritten without inner packs, ABI lxvx v20 fixed with vs52
2019-06-04 07:11:30 +00:00
AbdelRauf
8fe794f059
improved zgemm power9 based on power8
2019-05-30 15:31:25 +00:00
Martin Kroeker
74c10b57c6
Use generic kernels for complex (I)AMAX to support softfp
2019-05-30 11:38:11 +02:00
Martin Kroeker
c5495d2056
Ensure correct output for DAMAX with softfp
2019-05-30 11:25:43 +02:00
Martin Kroeker
c70496b108
Separate implementations of AMAX and IAMAX on arm
...
As noted in #1912 and in a comment on #1942, the combined implementation happens to "do the right thing" on hardfp, but cannot return both value and index on softfp, where they would have to share the return register
2019-05-29 15:02:51 +02:00
Martin Kroeker
9ea30f3788
Replace ISMIN and ISAMIN kernels on all x86_64 platforms ( #2125 )
...
* Mark iamax_sse.S as unsuitable for MIN due to issue #2116
* Use iamax.S rather than iamax_sse.S for ISMIN/ISAMIN on all x86_64 as workaround for #2116
2019-05-09 14:42:36 +02:00
Martin Kroeker
6a8b4269b5
Merge pull request #2111 from martin-frbg/issue1955
...
Disable the SkyLakeX DGEMMIxCOPY kernels as well
2019-05-05 18:08:49 +02:00
Martin Kroeker
b1561ecc68
Disable DGEMMINCOPY as well for now
...
#1955
2019-05-05 15:52:01 +02:00
Martin Kroeker
7ed8431527
Disable the SkyLakeX DGEMMITCOPY kernel as well
...
as a stopgap measure for https://github.com/numpy/numpy/issues/13401 as mentioned in #1955
2019-05-04 22:54:41 +02:00
Martin Kroeker
3f427c0cf9
Merge pull request #2107 from quickwritereader/develop
...
sgemm/strmm kernel for power9
2019-05-02 07:56:57 +02:00
AbdelRauf
47f892198c
conflict resolve
2019-05-01 19:36:22 +00:00
AbdelRauf
628b335e83
Merge branch 'develop' of https://github.com/quickwritereader/OpenBLAS into develop
2019-04-29 08:57:44 +00:00
AbdelRauf
0f105dd8a5
sgemm/strmm
2019-04-29 08:49:50 +00:00
Martin Kroeker
ccfb7ead15
Merge pull request #2072 from martin-frbg/sum
...
Add (C)BLAS extension ?sum
2019-04-23 20:11:36 +02:00
Rashmica Gupta
bcdf1d4917
Add in runtime CPU detection for POWER.
2019-04-09 14:20:16 +10:00
Martin Kroeker
c04a729081
Add ?sum definitions for generic kernel
2019-03-31 13:55:49 +02:00
Martin Kroeker
100d94f94e
Add ?sum
2019-03-31 13:55:05 +02:00
Martin Kroeker
246ca29679
Add ZARCH implementation of ?sum
...
as trivial copies of the respective ?asum kernels with the ABS and vflpsb calls removed
2019-03-30 22:49:05 +01:00
Martin Kroeker
9d717cb5ee
Add x86_64 implementation of ?sum
...
as trivial copy of ?asum with the fabs calls removed
2019-03-30 22:27:04 +01:00
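The ?sum commits in this log describe the new kernels as copies of the ?asum kernels with the absolute-value step removed. The sketch below shows that relationship in plain C for the double-precision case. The names `ref_dasum`, `ref_dsum` and `abs_value` are hypothetical, `abs_value` stands in for `fabs`, and the handling of a non-positive increment is an assumption of this example rather than something the log states.

```c
#include <stddef.h>

/* Stand-in for fabs(), kept local so the example needs no libm. */
static double abs_value(double v)
{
    return v < 0.0 ? -v : v;
}

/* Reference-style asum: sum of absolute values of n strided elements. */
static double ref_dasum(size_t n, const double *x, ptrdiff_t incx)
{
    double acc = 0.0;
    if (incx <= 0)
        return acc;               /* assumption: nothing to sum */
    for (size_t i = 0; i < n; i++)
        acc += abs_value(x[i * (size_t)incx]);
    return acc;
}

/* Reference-style sum: the same loop with the absolute value removed. */
static double ref_dsum(size_t n, const double *x, ptrdiff_t incx)
{
    double acc = 0.0;
    if (incx <= 0)
        return acc;               /* assumption: nothing to sum */
    for (size_t i = 0; i < n; i++)
        acc += x[i * (size_t)incx];
    return acc;
}
```

For the vector {1, -2, 3, -4} the two functions differ only through the sign handling: the absolute sum adds magnitudes, the plain sum lets the negative entries cancel.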
Martin Kroeker
e3bc83f2a8
Add x86 implementation of ?sum
...
as trivial copy of ?asum with the fabs calls removed
2019-03-30 22:26:10 +01:00
Martin Kroeker
70f2a4e0d7
Add SPARC implementation of ?sum
...
as trivial copy of ?asum with the fabs replaced by fmov to preserve code structure
2019-03-30 22:25:06 +01:00
Martin Kroeker
706dfe263b
Add POWER implementation of ?sum
...
as trivial copy of ?asum with the fabs replaced by fmr to preserve code structure
2019-03-30 22:23:42 +01:00
Martin Kroeker
688fa9201c
Add MIPS64 implementation of ?sum
...
as trivial copy of ?asum with the fabs replaced by mov to preserve code structure
2019-03-30 22:22:15 +01:00
Martin Kroeker
cdbe0f0235
Add MIPS implementation of ?sum
...
as trivial copy of ?asum with the fabs calls removed
2019-03-30 22:20:14 +01:00
Martin Kroeker
f8b82bc6dc
Add ia64 implementation of ?sum
...
as trivial copy of asum with the fabs calls removed
2019-03-30 22:18:03 +01:00
Martin Kroeker
3e3ccb9011
Add ARM64 implementations of ?sum
...
as trivial copies of the respective ?asum kernels with the fabs calls removed
2019-03-30 22:13:36 +01:00
Martin Kroeker
94ab4e6fb2
Add ARM implementations of ?sum
...
(trivial copies of the respective ?asum with the fabs calls removed)
2019-03-30 22:11:38 +01:00
Martin Kroeker
c3cfc6986b
Add implementations of ssum/dsum and csum/zsum
...
as trivial copies of asum/zsasum with the fabs calls replaced by fmov to preserve code structure
2019-03-30 22:05:11 +01:00
Martin Kroeker
b9f4943a14
Add ?sum
2019-03-30 22:01:13 +01:00
Martin Kroeker
32c7063cb0
Merge pull request #2061 from martin-frbg/martin-frbg-patch-1
...
Disable the AVX512 DGEMM kernel (again)
2019-03-30 21:21:38 +01:00
Martin Kroeker
7c51cc8527
Merge branch 'develop' into develop
2019-03-29 19:36:29 +01:00
AbdelRauf
853a18bc17
power9 makefile. dgemm based on the power8 kernel with the following changes: 32x unrolled 16x4 kernel and 8x4 kernel using (lxv stxv butterfly rank1 update). Improvement from 17 to 22-23 GFLOPS. dtrmm cases were added into dgemm itself
2019-03-29 15:49:40 +00:00
Martin Kroeker
e608d4f7fe
Disable the AVX512 DGEMM kernel (again)
...
Due to as yet unresolved errors seen in #1955 and #2029
2019-03-13 22:10:28 +01:00
Martin Kroeker
03d7110900
Merge pull request #2042 from maomao194313/develop
...
add TARGET support for HiSilicon tsv110 CPUs
2019-03-12 22:57:39 +01:00
Martin Kroeker
f18ab6c17b
Merge pull request #2051 from martin-frbg/issue2048
...
Make TARGET=GENERIC compatible with DYNAMIC_ARCH=1
2019-03-09 16:39:35 +01:00
Martin Kroeker
5b95534afc
Make TARGET=GENERIC compatible with DYNAMIC_ARCH=1
...
for issue #2048
2019-03-09 11:21:16 +01:00
Celelibi
b7f59da42d
Fix crash in sgemm SSE/nano kernel on x86_64
...
Fix bug #2047.
Signed-off-by: Celelibi <celelibi@gmail.com>
2019-03-07 16:55:13 +01:00
maomao194313
783ba8058f
HiSilicon tsv110 CPUs optimization branch
...
add HiSilicon tsv110 CPUs optimization branch
2019-03-04 16:30:50 +08:00
Andrew
6eee1beac5
move fix to right place
2019-02-24 20:41:02 +02:00
Martin Kroeker
e12cdf58ef
Merge pull request #2024 from martin-frbg/gcc9fixes4
...
Fix inline assembly constraints in Bulldozer TRSM kernels
2019-02-17 11:49:15 +01:00
Martin Kroeker
1860c9456d
Merge pull request #2023 from martin-frbg/gcc9fixes3
...
Fix inline assembly constraints in various x86_64 GEMVN kernels
2019-02-17 11:48:57 +01:00
Martin Kroeker
f9bb76d29a
Fix inline assembly constraints in Bulldozer TRSM kernels
...
Rework indices to allow marking i, as and bs as both input and output (marked operand n1 as well for simplicity). For #2009
2019-02-16 20:06:48 +01:00
Martin Kroeker
efb9038f72
Fix inline assembly constraints
2019-02-16 18:46:17 +01:00
Martin Kroeker
e976557d29
Fix inline assembly constraints
...
rework indices to allow marking argument lda as input and output.
2019-02-16 18:36:39 +01:00
Martin Kroeker
9d8be15789
Fix inline assembly constraints
...
rework indices to allow marking argument lda4 as input and output. For #2009
2019-02-16 18:24:11 +01:00
Martin Kroeker
d752799a0f
Merge pull request #2021 from martin-frbg/gcc9fixes2
...
Fix wrong constraints in inline assembly of Haswell DTRSM kernel
2019-02-16 18:05:40 +01:00
Martin Kroeker
c26c0b77a7
Fix wrong constraints in inline assembly
...
for #2009
2019-02-15 15:08:16 +01:00
Martin Kroeker
1c6da2d03c
Merge pull request #2019 from martin-frbg/gcc9fixes
...
Fix unannounced modification of input operand 8 (lda4) in Haswell GEMVN microkernel
2019-02-15 15:02:54 +01:00
Martin Kroeker
4255a58cd2
Rename operands to put lda on the input/output constraint list
2019-02-15 10:10:04 +01:00
Martin Kroeker
46e415b140
Save and restore input argument 8 (lda4)
...
Fixes miscompilation with gcc9 -ftree-vectorize (related to issue #2009 )
2019-02-14 22:43:18 +01:00
Bart Oldeman
69a97ca7b9
dgemv_kernel_4x4(Haswell): add missing clobbers for xmm0,xmm1,xmm2,xmm3
...
This fixes a crash in dblat2 when OpenBLAS is compiled using
-march=znver1 -ftree-vectorize -O2
See also:
https://github.com/easybuilders/easybuild-easyconfigs/issues/7180
2019-02-14 16:27:58 +00:00
Martin Kroeker
056917d616
Merge pull request #2013 from martin-frbg/issue2011
...
Fix invalid memory access in PPC gemm_beta
2019-02-14 09:29:34 +01:00
Martin Kroeker
718efcec6f
Fix out-of-bounds memory access in gemm_beta
...
Fixes #2011 (as suggested by davemq), assuming typo by K.Goto
2019-02-13 22:08:37 +01:00
Martin Kroeker
f9d67bb5e8
Fix out-of-bounds memory access in gemm_beta
...
Fixes #2011 (as suggested by davemq) presuming typo by K.Goto
2019-02-13 22:06:41 +01:00
Martin Kroeker
76bb74fcd4
Merge pull request #2012 from maamountki/z14
...
[ZARCH] Many improvements
2019-02-13 20:15:56 +01:00
maamountki
0a54c98b9d
[ZARCH] Modify constraints
2019-02-13 21:06:25 +02:00
maamountki
bec54ae366
[ZARCH] Fix caxpy
2019-02-13 12:54:35 +02:00
Martin Kroeker
ab1630f9fa
Fix declaration of arguments in inline assembly
...
Argument 0 is modified so should be input and output
2019-02-12 16:14:02 +01:00
Martin Kroeker
b824fa70eb
Fix declaration of assembly arguments in SSYMV and DSYMV microkernels
...
Arguments 0 and 1 are both input and output
2019-02-12 16:00:18 +01:00
Martin Kroeker
91481a3e4e
Fix declaration of input arguments in inline assembly
...
Argument 0 is modified as it doubles as a counter
2019-02-12 15:51:43 +01:00
Martin Kroeker
dc6ac9eab0
Fix declaration of input arguments in the x86_64 s/dGEMV_T and s/dGEMV_N kernels
...
Arguments 0 and 1 need to be tagged as both input and output
2019-02-12 15:33:48 +01:00
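Several commits in this part of the log retag inline-assembly operands as both input and output because the assembly modifies them. The sketch below illustrates the general idea with GCC-style extended asm, where the `+` constraint modifier marks a read-write operand. It is a hypothetical example (`sum_doubles` is not an OpenBLAS kernel) and assumes an x86_64 target with a GCC-compatible compiler; other targets take the plain C fallback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sum n doubles starting at x. */
static double sum_doubles(const double *x, size_t n)
{
    double acc = 0.0;
    assert(x != NULL || n == 0);
    if (n == 0)
        return acc;
#if defined(__x86_64__) && defined(__GNUC__)
    /* The loop advances the pointer and decrements the counter, so
     * both are declared "+r" (read-write). Listing them as inputs
     * only would let the compiler assume their registers still hold
     * the original values after the asm statement. */
    __asm__ __volatile__(
        "1:\n\t"
        "addsd (%[p]), %[acc]\n\t"   /* acc += *p     */
        "addq $8, %[p]\n\t"          /* p += 1 double */
        "decq %[n]\n\t"              /* n -= 1        */
        "jnz 1b\n\t"
        : [acc] "+x"(acc), [p] "+r"(x), [n] "+r"(n)
        :
        : "cc", "memory");
#else
    for (size_t i = 0; i < n; i++)
        acc += x[i];
#endif
    return acc;
}
```

The "cc" and "memory" clobbers tell the compiler that the flags change and that memory is read through the pointer.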
maamountki
f583674109
[ZARCH] Fix cgemv_t_4
2019-02-12 13:12:28 +02:00
maamountki
77fe70019f
[ZARCH] Fix constraints and source code formatting
2019-02-11 16:01:13 +02:00
maamountki
7039770165
[ZARCH] Undo the last commit
2019-02-06 20:11:44 +02:00
maamountki
11a43e8116
[ZARCH] Set alignment hint for vl/vst
2019-02-05 19:17:08 +02:00
maamountki
61526480f9
[ZARCH] Fix copy constraint
2019-02-05 07:51:19 +02:00
maamountki
81daf6bc38
[ZARCH] Format source code, Fix constraints
2019-02-05 07:30:38 +02:00
Martin Kroeker
729e925174
Merge pull request #1996 from quickwritereader/develop
...
NBMAX=4096 for gemvn, added sgemvn 8x8 for future
2019-02-04 16:52:04 +01:00
Ubuntu
498ac98581
Note for unused kernels
2019-02-04 15:41:56 +00:00
Ubuntu
cd9ea45463
NBMAX=4096 for gemvn, added sgemvn 8x8 for future
2019-02-04 06:57:11 +00:00
Martin Kroeker
f9c5023e04
Merge pull request #1994 from quickwritereader/develop
...
sgemv cgemv pairs
2019-02-01 21:04:47 +01:00
Ubuntu
4abc375a91
sgemv cgemv pairs
2019-02-01 13:45:00 +00:00
Martin Kroeker
874df65491
Fix incorrect sgemv results for IBM z14
...
part of PR #1993 that was inadvertently misplaced into the toplevel directory
2019-02-01 12:58:59 +01:00
Martin Kroeker
877023e1e1
Fix precision of zarch DSDOT
...
from patch provided by aarnez in #991
2019-01-31 21:22:26 +01:00
Martin Kroeker
265142edd5
Fix typo in the zarch min/max kernels
...
from patch provided by aarnez in #991
2019-01-31 21:21:40 +01:00
Martin Kroeker
885a3c4350
USE_TRMM on Z14
...
from patch provided by aarnez in #991
2019-01-31 21:18:09 +01:00
maamountki
82124729af
Merge branch 'develop' into z14
2019-01-31 19:36:41 +02:00
maamountki
29416cb5a3
[ZARCH] Add Z13 version for max/min functions
2019-01-31 19:11:11 +02:00
maamountki
48b9b94f7f
[ZARCH] Improve loading performance for camax/icamax
2019-01-31 18:52:11 +02:00
Martin Kroeker
86a824c97f
Fix wrong comparison that made IMIN identical to IMAX
...
as reported by aarnez in #1990
2019-01-31 15:27:21 +01:00
Martin Kroeker
808410c2c7
Fix wrong comparison that made IMIN identical to IMAX
...
as suggested in #1990
2019-01-31 15:25:15 +01:00
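The two fixes above concern a comparison that made IMIN behave like IMAX. As a rough illustration of why one operator matters, the following C sketch shows index-of-maximum and index-of-minimum loops that differ only in the direction of the comparison. The names `ref_idmax` and `ref_idmin` are hypothetical; the 1-based result and the choice of the first occurrence among equal values follow the usual BLAS convention and are assumptions here, not details taken from the patched kernels.

```c
#include <stddef.h>

/* 1-based index of the first largest element, 0 when n == 0. */
static size_t ref_idmax(size_t n, const double *x)
{
    if (n == 0)
        return 0;
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (x[i] > x[best])      /* strict: keeps the first occurrence */
            best = i;
    return best + 1;
}

/* 1-based index of the first smallest element, 0 when n == 0. */
static size_t ref_idmin(size_t n, const double *x)
{
    if (n == 0)
        return 0;
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (x[i] < x[best])      /* the only line that differs */
            best = i;
    return best + 1;
}
```

Using `>` in both loops would return the maximum's index from the minimum routine, which matches the symptom the commit titles describe.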
maamountki
fcd814a8d2
[ZARCH] Fix bug in max/min functions
2019-01-29 17:59:38 +02:00
maamountki
dc4d3bccd5
[ZARCH] Fix icamax/icamin
2019-01-29 03:47:49 +02:00
maamountki
c7143c1019
[ZARCH] Fix iamax/imax single precision
2019-01-28 17:52:23 +02:00
maamountki
04873bb174
[ZARCH] Undo the last commit
2019-01-28 17:32:24 +02:00
maamountki
c8ef9fb220
[ZARCH] Fix bug in iamax/iamin/imax/imin
2019-01-28 17:16:18 +02:00
maamountki
b111829226
[ZARCH] Update max/min functions
2019-01-21 15:56:04 +02:00
Martin Kroeker
32b0f1168e
Fix declaration of input arguments in the Sandybridge GER microkernels ( #1967 )
...
* Tag arguments 0 and 1 as both input and output
2019-01-18 08:11:39 +01:00
Martin Kroeker
b495e54310
Fix declaration of input arguments in the x86_64 SCAL microkernels ( #1966 )
...
* Tag arguments 0 and 1 as both input and output (see #1964 )
2019-01-18 08:11:07 +01:00
Martin Kroeker
d5e6940253
Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY ( #1965 )
...
* Tag operands 0 and 1 as both input and output
For #1964 (basically a continuation of coding problems first seen in #1292 )
2019-01-17 23:20:32 +01:00
Ubuntu
43a4572038
crot fix
2019-01-17 14:45:31 +00:00
Abdelrauf
a034e65512
Merge branch 'develop' into develop
2019-01-16 19:25:13 +04:00
Ubuntu
8c3386be87
Added missing BLAS1 single fp {saxpy, caxpy, cdot, crot (refactored version of srot), isamax, isamin, icamax, icamin},
...
Fixed idamin, icamin choosing the first occurrence index of equal minima
2019-01-16 15:16:21 +00:00
maamountki
b815a04c87
[ZARCH] fix a bug in max/min functions
2019-01-15 21:04:22 +02:00
maamountki
1a7925b3a3
[ZARCH] Update dgemv_n_4.c
2019-01-11 17:43:11 +02:00
maamountki
406f835f00
[ZARCH] update cgemv_n_4.c
2019-01-11 17:39:17 +02:00
maamountki
621dedb37b
[ZARCH] Update cgemv_t_4.c
2019-01-11 17:37:11 +02:00
maamountki
b731e8246f
Update sgemv_t_4.c
2019-01-11 17:14:04 +02:00
maamountki
ecc31b743f
Update dgemv_t_4.c
2019-01-11 17:13:02 +02:00
maamountki
5d89d6b143
[ZARCH] fix sgemv_n_4.c
2019-01-11 17:08:24 +02:00
maamountki
67432b23c2
[ZARCH] fix cgemv_n_4.c
2019-01-11 16:44:46 +02:00