Werner Saar
|
6d0db0151f
|
added optimized zaxpy-kernels
|
2015-04-16 11:19:37 +02:00 |
Zhang Xianyi
|
37b9033c90
|
Merge pull request #543 from jeromerobert/develop
Fix a buffer overflow with MAX_STACK_ALLOC size in dgemv_t
|
2015-04-15 11:18:14 -05:00 |
wernsaar
|
59e7a518c6
|
Merge pull request #544 from wernsaar/develop
Optimized caxpy-kernels
|
2015-04-15 17:04:02 +02:00 |
Werner Saar
|
13889515b3
|
added optimized caxpy-kernel for sandybridge
|
2015-04-15 16:29:25 +02:00 |
Werner Saar
|
248c9340c3
|
added optimized caxpy-kernel for haswell
|
2015-04-15 15:16:31 +02:00 |
Werner Saar
|
e9f33b4ca7
|
added optimized caxpy-kernel for steamroller
|
2015-04-15 13:49:23 +02:00 |
Werner Saar
|
f5d847122a
|
updated caxpy_microk_bulldozer-2.c and caxpy.c
|
2015-04-15 11:59:38 +02:00 |
Jerome Robert
|
a4c96eca67
|
Fix a buffer overflow with MAX_STACK_ALLOC size in dgemv_t
Refs #478, #482, 9798481, fd9fd42
|
2015-04-15 11:46:48 +02:00 |
wernsaar
|
fb02cb0a41
|
Merge pull request #540 from wernsaar/develop
Optimized dot- and axpy-kernels
|
2015-04-14 15:53:09 +02:00 |
Werner Saar
|
baa0363ea2
|
add optimized ddot-kernel for piledriver
|
2015-04-14 15:09:13 +02:00 |
Werner Saar
|
34ba66606a
|
add optimized daxpy-kernel for piledriver
|
2015-04-14 14:23:29 +02:00 |
Werner Saar
|
f615dc7603
|
added optimized saxpy kernel for steamroller
|
2015-04-14 09:09:39 +02:00 |
Werner Saar
|
331c417637
|
optimized saxpy for piledriver
|
2015-04-14 08:34:11 +02:00 |
Zhang Xianyi
|
6c3a0b5d46
|
Enable MAX_STACK_ALLOC by default.
|
2015-04-13 23:23:40 -05:00 |
Zhang Xianyi
|
fd9fd42936
|
Refs #478, #482. Fixed bug on previous commit.
|
2015-04-13 23:22:27 -05:00 |
Zhang Xianyi
|
9798481979
|
Refs #478, #482. Fix segfault bug for gemv_t with MAX_STACK_ALLOC flag.
For gemv_t, directly use malloc to create the buffer.
|
2015-04-13 19:45:27 -05:00 |
Werner Saar
|
d7a17ad85d
|
optimized sdot-kernel for piledriver
|
2015-04-13 13:19:21 +02:00 |
Werner Saar
|
d35f6c63c2
|
add optimized daxpy-kernel for steamroller
|
2015-04-13 12:22:43 +02:00 |
Werner Saar
|
166d76e864
|
added optimized sdot-kernel for steamroller
|
2015-04-11 08:48:18 +02:00 |
Werner Saar
|
f9f127d838
|
added optimized ddot kernel for steamroller
|
2015-04-10 16:18:03 +02:00 |
wernsaar
|
62231ab337
|
Merge pull request #538 from wernsaar/develop
Added optimized cdot- and zdot-kernels
|
2015-04-10 16:03:37 +02:00 |
Werner Saar
|
3119def9a7
|
updated cdot and zdot
|
2015-04-10 11:10:31 +02:00 |
Werner Saar
|
33b332372a
|
add optimized cdot- and zdot-kernel for sandybridge
|
2015-04-10 09:37:26 +02:00 |
Werner Saar
|
fd838c75bc
|
add optimized cdot- and zdot-kernel for haswell
|
2015-04-09 15:13:52 +02:00 |
Werner Saar
|
b57a60dac8
|
updated cdot and zdot for piledriver
|
2015-04-09 10:33:46 +02:00 |
Werner Saar
|
5c51163972
|
added optimized cdot- and zdot-kernel for steamroller
|
2015-04-09 09:45:23 +02:00 |
Werner Saar
|
9299d8cfd6
|
added optimized cdot- and zdot-kernels for bulldozer
|
2015-04-08 16:29:55 +02:00 |
Zhang Xianyi
|
0a3d3b945d
|
Refs #535. Fix the wrong vector instruction in sgemm sandy bridge kernel.
|
2015-04-08 03:55:49 +08:00 |
Zhang Xianyi
|
4f680a7d61
|
Merge pull request #534 from wernsaar/develop
Refs #533. added optimized saxpy- and daxpy-kernel for haswell and sandybridge
|
2015-04-07 12:48:11 -05:00 |
Werner Saar
|
ba926e807c
|
added cdot- and zdot benchmark
|
2015-04-07 11:56:06 +02:00 |
Werner Saar
|
60c6dec6e6
|
updated some lines for bulldozer
|
2015-04-06 18:47:16 +02:00 |
Werner Saar
|
47898cca35
|
added optimized saxpy- and daxpy-kernel for sandybridge
|
2015-04-06 16:05:16 +02:00 |
Werner Saar
|
53bb924287
|
added optimized saxpy- and daxpy-kernel for haswell
|
2015-04-06 12:33:16 +02:00 |
Zhang Xianyi
|
1e80b8b0d3
|
Merge pull request #531 from wernsaar/develop
added optimized sdot- and ddot-kernels for Haswell and Sandybridge
|
2015-04-05 16:42:39 -05:00 |
Werner Saar
|
a901b065d3
|
added optimized ddot-kernel for sandybridge
|
2015-04-05 20:19:38 +02:00 |
Werner Saar
|
3937e2a0a0
|
add optimized sdot-kernel for sandybridge
|
2015-04-05 19:47:05 +02:00 |
Werner Saar
|
9707d608d5
|
removed double definition line
|
2015-04-05 18:35:34 +02:00 |
Werner Saar
|
701b9d7556
|
added optimized sdot- and ddot-kernel for HASWELL
|
2015-04-05 17:57:53 +02:00 |
Zhang Xianyi
|
8977b3f235
|
Refs #529. Support Intel Broadwell by Haswell kernels.
|
2015-04-02 11:08:03 -05:00 |
Zhang Xianyi
|
f6426395ea
|
Merge pull request #527 from xantares/patch-1
fix mingw install
|
2015-03-30 10:16:11 -05:00 |
xantares
|
0ac787eefe
|
fix mingw install
|
2015-03-30 09:30:55 +02:00 |
Zhang Xianyi
|
e5b96e55a7
|
Fix build bug for ARM64.
|
2015-03-24 15:27:17 -05:00 |
Zhang Xianyi
|
d0c51c4de9
|
Merge branch 'develop'
|
2015-03-24 15:07:07 -05:00 |
Zhang Xianyi
|
a3491e1e88
|
Update the doc for 0.2.14.
|
2015-03-24 15:05:59 -05:00 |
Zhang Xianyi
|
e81a5d61e4
|
Merge branch 'develop' of github.com:xianyi/OpenBLAS into develop
|
2015-03-24 12:17:12 -05:00 |
Zhang Xianyi
|
c674fa32be
|
Add ARM targets.
|
2015-03-24 12:17:04 -05:00 |
Zhang Xianyi
|
e34911a73d
|
Fix compiling bug for ARM with setting BINARY.
|
2015-03-24 17:15:33 +00:00 |
Zhang Xianyi
|
76dcaf2281
|
Merge pull request #521 from maxlevesque/patch-1
Correct typo /proc/ instead of /pros/
|
2015-03-21 12:26:35 -05:00 |
Maximilien Levesque
|
770fac92eb
|
Correct typo /proc/ instead of /pros/
|
2015-03-20 23:25:11 +01:00 |
Zhang Xianyi
|
e95d64333a
|
Refs #519. Avoid calling strncpy.
|
2015-03-19 15:57:22 -05:00 |