Commit Graph

1338 Commits

Author SHA1 Message Date
Werner Saar d35f6c63c2 add optimized daxpy-kernel for steamroller 2015-04-13 12:22:43 +02:00
Werner Saar 166d76e864 added optimized sdot-kernel for steamroller 2015-04-11 08:48:18 +02:00
Werner Saar f9f127d838 added optimized ddot kernel for steamroller 2015-04-10 16:18:03 +02:00
wernsaar 62231ab337 Merge pull request #538 from wernsaar/develop
Added optimized cdot- and zdot-kernels
2015-04-10 16:03:37 +02:00
Werner Saar 3119def9a7 updated cdot and zdot 2015-04-10 11:10:31 +02:00
Werner Saar 33b332372a add optimized cdot- and zdot-kernel for sandybridge 2015-04-10 09:37:26 +02:00
Werner Saar fd838c75bc add optimized cdot- and zdot-kernel for haswell 2015-04-09 15:13:52 +02:00
Werner Saar b57a60dac8 updated cdot and zdot for piledriver 2015-04-09 10:33:46 +02:00
Werner Saar 5c51163972 added optimized cdot- and zdot-kernel for steamroller 2015-04-09 09:45:23 +02:00
Werner Saar 9299d8cfd6 added optimized cdot- and zdot-kernels for bulldozer 2015-04-08 16:29:55 +02:00
Zhang Xianyi 0a3d3b945d Refs #535. Fix the wrong vector instruction in sgemm sandy bridge kernel. 2015-04-08 03:55:49 +08:00
Zhang Xianyi 4f680a7d61 Merge pull request #534 from wernsaar/develop
Refs #533. added optimized saxpy- and daxpy-kernel for haswell and sandybridge
2015-04-07 12:48:11 -05:00
Werner Saar ba926e807c added cdot- and zdot benchmark 2015-04-07 11:56:06 +02:00
Werner Saar 60c6dec6e6 updated some lines for bulldozer 2015-04-06 18:47:16 +02:00
Werner Saar 47898cca35 added optimized saxpy- and daxpy-kernel for sandybridge 2015-04-06 16:05:16 +02:00
Werner Saar 53bb924287 added optimized saxpy- and daxpy-kernel for haswell 2015-04-06 12:33:16 +02:00
Zhang Xianyi 1e80b8b0d3 Merge pull request #531 from wernsaar/develop
added optimized sdot- and ddot-kernels for Haswell and Sandybridge
2015-04-05 16:42:39 -05:00
Werner Saar a901b065d3 added optimized ddot-kernel for sandybridge 2015-04-05 20:19:38 +02:00
Werner Saar 3937e2a0a0 add optimized sdot-kernel for sandybridge 2015-04-05 19:47:05 +02:00
Werner Saar 9707d608d5 removed double definition line 2015-04-05 18:35:34 +02:00
Werner Saar 701b9d7556 added optimized sdot- and ddot-kernel for HASWELL 2015-04-05 17:57:53 +02:00
Zhang Xianyi 8977b3f235 Refs #529. Support Intel Broadwell by Haswell kernels. 2015-04-02 11:08:03 -05:00
Zhang Xianyi f6426395ea Merge pull request #527 from xantares/patch-1
fix mingw install
2015-03-30 10:16:11 -05:00
xantares 0ac787eefe fix mingw install 2015-03-30 09:30:55 +02:00
Zhang Xianyi e5b96e55a7 Fix build bug for ARM64. 2015-03-24 15:27:17 -05:00
Zhang Xianyi a3491e1e88 Update the doc for 0.2.14. 2015-03-24 15:05:59 -05:00
Zhang Xianyi e81a5d61e4 Merge branch 'develop' of github.com:xianyi/OpenBLAS into develop 2015-03-24 12:17:12 -05:00
Zhang Xianyi c674fa32be Add ARM targets. 2015-03-24 12:17:04 -05:00
Zhang Xianyi e34911a73d Fix compiling bug for ARM with setting BINARY. 2015-03-24 17:15:33 +00:00
Zhang Xianyi 76dcaf2281 Merge pull request #521 from maxlevesque/patch-1
Correct typo /proc/ instead of /pros/
2015-03-21 12:26:35 -05:00
Maximilien Levesque 770fac92eb Correct typo /proc/ instead of /pros/ 2015-03-20 23:25:11 +01:00
Zhang Xianyi e95d64333a Refs #519. Avoid calling strncpy. 2015-03-19 15:57:22 -05:00
Zhang Xianyi 75c40bcc48 Refs #520. Fixed ONLY_CBLAS=1 compiling bug on OSX. 2015-03-19 11:52:09 -05:00
Zhang Xianyi b62f9f4120 Merge pull request #518 from ton/issue-508
Fix issue #508
2015-03-18 13:00:07 -05:00
Ton van den Heuvel b6438dedea Fix issue #508
Fix race condition during shutdown causing a crash in
gotoblas_set_affinity().
2015-03-18 13:22:43 +01:00
Zhang Xianyi cdefdb21cd Refs #492. Fixed c/zsyr bug with negative incx. 2015-02-26 06:37:03 +08:00
Zhang Xianyi ea7f9dacf4 Refs #509. Fixed geadd building bug with DYNAMIC_ARCH=1. 2015-02-26 01:47:11 +08:00
Zhang Xianyi bf5dbb7e2a Refs#509. Merge branch 'grisuthedragon-develop' into develop 2015-02-26 01:44:19 +08:00
Martin Koehler 39cc6b21d3 Add ATLAS-style ?geadd function 2015-02-16 13:46:20 +01:00
Zhang Xianyi 771b18ae9c Detect the wrong combined flags of USE_OPENMP=1 and USE_THREAD=0. 2015-02-08 01:42:48 -06:00
Zhang Xianyi cfa9392ffa Fix openblas_get_num_threads and openblas_get_num_procs bug with single thread. 2015-02-08 01:30:23 -06:00
Zhang Xianyi 1ccd57ce80 Merge pull request #497 from eschnett/develop
Introduce openblas_get_num_threads and openblas_get_num_procs
2015-02-03 23:09:38 -06:00
Erik Schnetter 65a847cd36 Introduce openblas_get_num_threads and openblas_get_num_procs 2015-02-03 12:23:41 -05:00
Zhang Xianyi 07ff001981 Merge pull request #495 from jeromerobert/develop
Fix a segfault in gemv when MAX_STACK_ALLOC is set
2015-01-29 18:23:50 +08:00
Jerome Robert b17ccb4c5c Fix a segfault in gemv when MAX_STACK_ALLOC is set
* stack_alloc_size is needed after the implementation call
but it may be overwritten if it's optimized to a register,
because some gemv implementation (ex: dgemv_n.S) do not
restore all register (ex: r10).
* do the same in ger.c for the same reasons even if the bug
has not been observed.
2015-01-29 09:55:57 +01:00
Zhang Xianyi 63c6fcfa0a Merge pull request #490 from eschnett/develop
Move #include statements outside extern "C" blocks
2015-01-13 15:43:56 +08:00
Erik Schnetter 29cb47fc06 Move #include statements outside extern "C" blocks 2015-01-12 21:27:52 -05:00
Zhang Xianyi 4e6c4046f7 Fix cortex-a15 detecting bug. 2015-01-12 09:35:16 +00:00
Zhang Xianyi 229ce2ccd1 Add cortex-a9 and cortex-a15 targets. 2015-01-12 08:55:29 +00:00
Zhang Xianyi ef75be0e51 Merge pull request #487 from kortschak/dromtg-test
Add test for drotmg bug fixed by 692b14c
2015-01-07 14:13:11 +08:00