wernsaar
6acbafe45b
added sgemv_n microkernel for haswell
2014-07-20 14:52:25 +02:00
wernsaar
5392d11b04
optimized sgemv_n_microk_sandy.c
2014-07-20 14:08:04 +02:00
wernsaar
c0fe95fb72
added sgemv_n microkernel for sandybridge
2014-07-20 13:17:47 +02:00
wernsaar
d9d4077c93
added sgemv_t microkernel for haswell
2014-07-20 11:30:32 +02:00
wernsaar
02eb72ac42
bugfix in sgemv_t_microk_sandy.c
2014-07-20 10:48:41 +02:00
wernsaar
c06f9986d4
added sgemv_t microkernel for sandybridge
2014-07-20 10:21:08 +02:00
wernsaar
2cce125c79
added optimized sgemv_t for bulldozer and piledriver
2014-07-19 15:48:07 +02:00
wernsaar
b3938fe371
don't use this sgemv_n on Windows
2014-07-19 07:15:34 +02:00
wernsaar
c8a4a56177
performance optimizations for sgemv_n
2014-07-18 11:25:21 +02:00
wernsaar
3c5732615d
added blocked sgemv_n and microkernel for bulldozer and piledriver
2014-07-17 23:15:07 +02:00
wernsaar
880597b301
segment violation in sgemv kernels
2014-07-13 10:46:14 +02:00
wernsaar
0884b73c69
Lapack-test Windows 32bit now error free
2014-07-10 11:01:47 +02:00
wernsaar
9bd9472ae9
Lapack-test: cleanup of x86 32bit KERNEL file
2014-07-09 16:08:19 +02:00
wernsaar
c4a423a642
bugfixes for lapack on ARM Platform
2014-07-09 12:21:39 +02:00
wernsaar
13348b2137
removed reference to daxpy_bulldozer kernel (Windows bug in lapack-test)
2014-07-06 16:39:32 +02:00
wernsaar
9964ed2f79
bugfix for CORE2
2014-07-06 11:47:28 +02:00
wernsaar
d5b976f92d
fallback to zgemm_kernel_4x2_sse.S
2014-07-06 11:05:28 +02:00
wernsaar
f7267d9b0e
added missing definition for DUNNINGTON
2014-07-06 10:17:07 +02:00
wernsaar
e0c080a28c
removed reference to zgemm_kernel_4x2_sse3.S (bug in lapack-test)
2014-07-05 16:13:17 +02:00
wernsaar
e80b144932
enabled compiling of *3M functions
2014-07-02 14:11:53 +02:00
wernsaar
be94db096c
disabled *3M functions for x86_64 platforms
2014-07-01 16:18:05 +02:00
wernsaar
b079df9ef4
added optimized sdot- and dsdot-kernel, written in C
2014-06-30 14:46:38 +02:00
wernsaar
01a119abfc
enabled SMP for sbmv and zsbmv, but only for 64bit binaries
2014-06-29 20:35:56 +02:00
Zhang Xianyi
99efbbbad5
Fixed #395 . Enable optimized cgemm for Sandybridge. Added optimized sdot kernel.
Fixed c/zgemm, zgemv computational error of haswell, piledriver, bulldozer, and
barcelona on Windows.
Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop
Conflicts:
kernel/Makefile.L1
kernel/x86_64/KERNEL
param.h
2014-06-29 10:34:51 +08:00
wernsaar
22e5aee2dd
fixed zgemv bug for older AMD Processors
2014-06-28 19:04:49 +02:00
wernsaar
35d37e124f
bugfix for barcelona zgemv-kernel
2014-06-28 12:36:11 +02:00
wernsaar
d8ba46efdb
bugfix for bulldozer cgemm-, zgemm- and zgemv-kernel
2014-06-28 12:16:20 +02:00
wernsaar
a15f22a1f6
bugfix for piledriver cgemm-, zgemm- and zgemv-kernel
2014-06-28 11:46:58 +02:00
wernsaar
b94ea89f52
bugfix for haswell cgemm- and zgemm-kernel
2014-06-28 10:22:40 +02:00
wernsaar
35f668bb14
bugfix for cgemm_kernel_8x2_sandy.S
2014-06-28 10:01:56 +02:00
Timothy Gu
6c2ead30f0
Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
wernsaar
365e8de346
added optimized cgemm-kernel for SANDYBRIDGE
2014-06-27 13:40:29 +02:00
wernsaar
578d1b6219
added DSDOT definition and enabled optimized sdot kernel
2014-06-27 11:30:29 +02:00
wernsaar
dabab2b5f4
added new optimized sgemm kernel for SANDYBRIDGE
2014-06-26 21:42:08 +02:00
wernsaar
aa2709c4e0
enabled optimized dgemm kernel for NEHALEM
2014-06-26 12:22:29 +02:00
wernsaar
a13bcc1716
enabled optimized sgemv kernel for barcelona and piledriver
2014-06-25 13:50:57 +02:00
wernsaar
d2c82d7543
enabled optimized sgemv kernel for HASWELL
2014-06-25 12:56:45 +02:00
wernsaar
0517672dd0
enabled optimized sgemv kernels for nehalem, sandybridge and bulldozer
2014-06-25 12:38:14 +02:00
wernsaar
23203d52c1
Ref #380 : lowered stack usage for haswell kernels
2014-06-19 14:31:52 +02:00
wernsaar
73545a79cd
Ref #380 : lowered stack usage for piledriver and bulldozer kernels
2014-06-19 14:02:14 +02:00
wernsaar
ff9cfca24c
Ref #385 : added missing return instruction
2014-06-12 15:52:14 +02:00
wernsaar
cee257f384
Ref #51 : added blas extensions zomatcopy and comatcopy
2014-06-10 10:34:54 +02:00
wernsaar
7bfb3011e8
Ref #51 : added blas extension somatcopy
2014-06-09 20:21:13 +02:00
wernsaar
8c8f596238
Ref #51 : added blas extension domatcopy as non-optimized reference
2014-06-09 17:11:07 +02:00
wernsaar
faf3ac0aad
Ref #285 : added axpby kernels
2014-06-08 11:54:24 +02:00
Zhang Xianyi
406f5bd22b
Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop
Conflicts:
kernel/arm/KERNEL.ARMV6
2014-05-21 11:24:39 +08:00
wernsaar
aaddb05411
bugfix for ARMV6
2014-05-17 13:00:36 +02:00
wernsaar
e826a5a6af
some modifications regarding lapack test
2014-05-16 20:37:41 +02:00
wernsaar
c38379c9dd
bugfixes for ARM regarding lapack tests
2014-05-14 13:03:45 +02:00
wernsaar
a0b07c1440
bugfixes for ARM regarding lapack tests
2014-05-14 12:59:20 +02:00