Commit Graph

3414 Commits

Author SHA1 Message Date
Werner Saar 34ba66606a add optimized daxpy-kernel for piledriver 2015-04-14 14:23:29 +02:00
Werner Saar f615dc7603 added optimized saxpy kernel for steamroller 2015-04-14 09:09:39 +02:00
Werner Saar 331c417637 optimized saxpy for piledriver 2015-04-14 08:34:11 +02:00
Zhang Xianyi 6c3a0b5d46 Enable MAX_STACK_ALLOC by default. 2015-04-13 23:23:40 -05:00
Zhang Xianyi fd9fd42936 Refs #478, #482. Fixed bug on previous commit. 2015-04-13 23:22:27 -05:00
Zhang Xianyi 9798481979 Refs #478, #482. Fix segfault bug for gemv_t with MAX_ALLOC_STACK flag.
For gemv_t, directly use malloc to create the buffer.
2015-04-13 19:45:27 -05:00
Werner Saar d7a17ad85d optimized sdot-kernel for pilediver 2015-04-13 13:19:21 +02:00
Werner Saar d35f6c63c2 add optimized daxpy-kernel for steamroller 2015-04-13 12:22:43 +02:00
Werner Saar 166d76e864 added optimized sdot-kernel for steamroller 2015-04-11 08:48:18 +02:00
Werner Saar f9f127d838 added optimized ddot kernel for steamroller 2015-04-10 16:18:03 +02:00
wernsaar 62231ab337 Merge pull request #538 from wernsaar/develop
Added optimized cdot- and zdot-kernels
2015-04-10 16:03:37 +02:00
Werner Saar 3119def9a7 updated cdot and zdot 2015-04-10 11:10:31 +02:00
Werner Saar 33b332372a add optimized cdot- and zdot-kernel for sandybridge 2015-04-10 09:37:26 +02:00
Werner Saar fd838c75bc add optimized cdot- and zdot-kernel for haswell 2015-04-09 15:13:52 +02:00
Werner Saar b57a60dac8 updated cdot and zdot for piledriver 2015-04-09 10:33:46 +02:00
Werner Saar 5c51163972 added optimized cdot- and zdot-kernel for steamroller 2015-04-09 09:45:23 +02:00
Werner Saar 9299d8cfd6 added optimized cdot- and zdot-kernels for bulldozer 2015-04-08 16:29:55 +02:00
Zhang Xianyi 0a3d3b945d Refs #535. Fix the wrong vector instruction in sgemm sandy bridge kernel. 2015-04-08 03:55:49 +08:00
Zhang Xianyi 4f680a7d61 Merge pull request #534 from wernsaar/develop
Refs #533. added optimized saxpy- and daxpy-kernel for haswell and sandybridge
2015-04-07 12:48:11 -05:00
Werner Saar ba926e807c added cdot- and zdot benchmark 2015-04-07 11:56:06 +02:00
Werner Saar 60c6dec6e6 updated some lines for bulldozer 2015-04-06 18:47:16 +02:00
Werner Saar 47898cca35 added optimized saxpy- and daxpy-kernel for sandybridge 2015-04-06 16:05:16 +02:00
Werner Saar 53bb924287 added optimized saxpy- and daxpy-kernel for haswell 2015-04-06 12:33:16 +02:00
Zhang Xianyi 1e80b8b0d3 Merge pull request #531 from wernsaar/develop
added optimized sdot- and ddot-kernels for Haswell and Sandybridge
2015-04-05 16:42:39 -05:00
Werner Saar a901b065d3 added optimized ddot-kernel for sandybridge 2015-04-05 20:19:38 +02:00
Werner Saar 3937e2a0a0 add optimized sdot-kernel for sandybridge 2015-04-05 19:47:05 +02:00
Werner Saar 9707d608d5 removed double definition line 2015-04-05 18:35:34 +02:00
Werner Saar 701b9d7556 added optimized sdot- and ddot-kernel for HASWELL 2015-04-05 17:57:53 +02:00
Zhang Xianyi 8977b3f235 Refs #529. Support Intel Broadwell by Haswell kernels. 2015-04-02 11:08:03 -05:00
Zhang Xianyi f6426395ea Merge pull request #527 from xantares/patch-1
fix mingw install
2015-03-30 10:16:11 -05:00
xantares 0ac787eefe fix mingw install 2015-03-30 09:30:55 +02:00
Zhang Xianyi e5b96e55a7 Fix build bug for ARM64. 2015-03-24 15:27:17 -05:00
Zhang Xianyi d0c51c4de9 Merge branch 'develop' 2015-03-24 15:07:07 -05:00
Zhang Xianyi a3491e1e88 Update the doc for 0.2.14. 2015-03-24 15:05:59 -05:00
Zhang Xianyi e81a5d61e4 Merge branch 'develop' of github.com:xianyi/OpenBLAS into develop 2015-03-24 12:17:12 -05:00
Zhang Xianyi c674fa32be Add ARM targets. 2015-03-24 12:17:04 -05:00
Zhang Xianyi e34911a73d Fix compiling bug for ARM with setting BINARY. 2015-03-24 17:15:33 +00:00
Zhang Xianyi 76dcaf2281 Merge pull request #521 from maxlevesque/patch-1
Correct typo /proc/ instead of /pros/
2015-03-21 12:26:35 -05:00
Maximilien Levesque 770fac92eb Correct typo /proc/ instead of /pros/ 2015-03-20 23:25:11 +01:00
Zhang Xianyi e95d64333a Refs #519. Avoid calling strncpy. 2015-03-19 15:57:22 -05:00
Zhang Xianyi 75c40bcc48 Refs #520. Fixed ONLY_CBLAS=1 compiling bug on OSX. 2015-03-19 11:52:09 -05:00
Zhang Xianyi b62f9f4120 Merge pull request #518 from ton/issue-508
Fix issue #508
2015-03-18 13:00:07 -05:00
Ton van den Heuvel b6438dedea Fix issue #508
Fix race condition during shutdown causing a crash in
gotoblas_set_affinity().
2015-03-18 13:22:43 +01:00
Hank Anderson 1d183dcda8 Added lapacke sources. 2015-02-25 16:51:08 -06:00
Zhang Xianyi cdefdb21cd Refs #492. Fixed c/zsyr bug with negative incx. 2015-02-26 06:37:03 +08:00
Hank Anderson e19bf3a28b Removed MSVC cpuid func when using clang. 2015-02-25 14:44:49 -06:00
Hank Anderson 3649cfbd7b Fixed EPILOGUE for clang. 2015-02-25 12:23:26 -06:00
Hank Anderson 5ae8993752 Added intrinsics for MSVC. 2015-02-25 11:52:51 -06:00
Hank Anderson 84d90d6ed8 Fixed some compiler errors/warnings for clang. 2015-02-25 11:52:25 -06:00
Hank Anderson 518e2424a8 Fixed bad filename for cpuid.S compile. 2015-02-25 11:51:29 -06:00