Zhang Xianyi
|
937493bfeb
|
Release 0.2.16 rc1
|
2016-02-23 18:29:21 -05:00 |
Zhang Xianyi
|
74b0672223
|
Fix c/zaxpyc kernel bug on Cortex-A57.
|
2016-02-23 22:47:53 +00:00 |
Zhang Xianyi
|
6e7be06e07
|
Refs JuliaLang/julia#5728. Fix gemv performance bug on Haswell Mac OSX.
On Mac OS X, it should use .align 4 (equal to .align 16 on Linux).
I didn't get the performance benefit from .align. Thus, I deleted it.
|
2016-02-19 17:56:07 -05:00 |
Zhang Xianyi
|
a04d0555ba
|
[av skip] Fix utest makefile bug on travis ci.
|
2016-02-20 00:21:43 +08:00 |
Zhang Xianyi
|
3761c30ba4
|
Fix makefile bug for utest.
|
2016-02-18 17:01:48 -05:00 |
Zhang Xianyi
|
38593cd3a3
|
Fix compiling bug on ARM Cortex-A57.
|
2016-02-13 15:38:52 +00:00 |
Zhang Xianyi
|
e3b7781c2b
|
Update readme.
|
2016-02-13 00:33:53 +08:00 |
Zhang Xianyi
|
5e6965ea47
|
Run utest when building.
|
2016-02-13 00:33:31 +08:00 |
Zhang Xianyi
|
5cc0301fc3
|
Enable utest for appveyor.
|
2016-02-12 01:50:20 -05:00 |
Zhang Xianyi
|
19a6dedfd6
|
Add utest for CMake.
|
2016-02-12 05:38:13 +08:00 |
Zhang Xianyi
|
0e2b92e216
|
Added missing lapacke files for CMake.
|
2016-02-12 05:28:16 +08:00 |
Zhang Xianyi
|
d06b92906a
|
Add gemm3m building for CMake.
|
2016-02-12 05:02:51 +08:00 |
Zhang Xianyi
|
8e98478ff3
|
Update ctest.h from github.com:xianyi/ctest.git.
|
2016-02-12 05:01:57 +08:00 |
Zhang Xianyi
|
fb8968fb83
|
Refs #707. Bugfix for previous commit.
|
2016-02-11 05:14:53 +08:00 |
Zhang Xianyi
|
dae6b82a71
|
Refs #707. Add BUILD_LAPACK_DEPRECATED flag in Makefile.rule.
To build the functions deprecated since LAPACK 3.6.0, run:
make BUILD_LAPACK_DEPRECATED=1
|
2016-02-11 04:22:53 +08:00 |
Zhang Xianyi
|
d73244b825
|
Refs #727. Align stack buffer address on 32-bytes.
|
2016-02-11 03:52:02 +08:00 |
Zhang Xianyi
|
233c6b959f
|
Merge pull request #780 from jeromerobert/bug727
Bug727
|
2016-02-08 13:24:40 -05:00 |
Jerome Robert
|
16ec5323c9
|
Fix zgemv.c compilation when stack allocation is disabled
|
2016-02-08 12:05:02 +01:00 |
Jerome Robert
|
0ad02ef2d6
|
update CONTRIBUTORS.md
|
2016-02-08 11:26:51 +01:00 |
Jerome Robert
|
73397faf68
|
Add benchmark/smallscaling.c
* Bench small matrices with multi-threading
* Close #727
|
2016-02-08 11:25:27 +01:00 |
Jerome Robert
|
5fc2203d8a
|
zgemv: Add a workaround for #746
|
2016-02-08 11:25:15 +01:00 |
Jerome Robert
|
78dcf5c3d5
|
Improve performance of ztrmv on small matrices
* Use stack allocation
* Disable multi-threading
* Ref #727
|
2016-02-08 11:25:02 +01:00 |
Jerome Robert
|
32f793195f
|
Use stack allocation in zgemv and zger
For better performance with small matrices
Ref #727
|
2016-02-08 11:24:21 +01:00 |
Zhang Xianyi
|
be4e5fcd20
|
Fixed #778. Merge branch 'buffer51-develop' into develop
|
2016-02-05 08:39:08 +08:00 |
buffer51
|
855e0cb700
|
Restored LAPACK_COMPLEX_STRUCTURE for Android prior to 21. Refs #682.
|
2016-02-04 17:20:07 -05:00 |
buffer51
|
7f7d04dcd2
|
Fixed linking error when compiling ARMv7 for Android (disabled -lpthread and added -Wl,--no-warn-mismatch).
|
2016-02-04 17:05:31 -05:00 |
buffer51
|
4e1b521e27
|
Fix lapack complex implementation of lauu2 and potf2 for Android (use FLOAT instead of FLOAT[2] as imaginary part is not used).
|
2016-02-04 16:59:56 -05:00 |
Zhang Xianyi
|
a1a96589aa
|
Fixed #773 blas_quickdivide bug on CMake and Visual Studio x86 32-bit.
|
2016-02-04 15:23:32 -05:00 |
Zhang Xianyi
|
0e68beb89f
|
Fixed #711, #698. Merge branch 'byzhang-develop' into develop
|
2016-02-03 02:56:27 +08:00 |
Zhang Xianyi
|
926ba8b7ca
|
Merge branch 'develop' of https://github.com/byzhang/OpenBLAS into byzhang-develop
|
2016-02-03 02:48:32 +08:00 |
Zhang Xianyi
|
9f080c47e1
|
Merge pull request #743 from tkelman/patch-1
Re-enable Fortran optimization flag on Windows
|
2016-02-02 13:46:12 -05:00 |
Zhang Xianyi
|
52eba814ce
|
Fixed #769. Merge branch 'martin-frbg-develop' into develop
|
2016-02-02 13:43:51 -05:00 |
Martin Kroeker
|
935356c34f
|
Update dynamic.c and cpuid_x86.c for Intel Avoton.
Second part of "support Intel Avoton via Nehalem kernel"
|
2016-02-02 13:42:55 -05:00 |
Zhang Xianyi
|
ff9388d625
|
Refs #768. Swap the result of zdot x87 fp kernel.
|
2016-02-02 13:38:01 -05:00 |
Martin Kroeker
|
4f05c23673
|
Update cpuid_x86.c
Add recognition of Intel Atom C27xx (Avoton, model code 4D)
|
2016-02-02 13:38:01 -05:00 |
Benyu Zhang
|
4a1263f609
|
Fix the source paths
|
2016-02-01 18:32:42 -08:00 |
Zhang Xianyi
|
962376664d
|
Refs #768. Swap the result of zdot x87 fp kernel.
|
2016-02-02 09:15:02 +08:00 |
Tony Kelman
|
5fef0d1b75
|
Re-enable Fortran optimization flag on Windows
Partial revert of 299cdcdc29
from #696; it was not explained why that change was needed
|
2016-01-30 01:13:51 -08:00 |
Zhang Xianyi
|
578f471808
|
Fix utest bug when INTERFACE64=1.
|
2016-01-28 22:18:38 -06:00 |
Zhang Xianyi
|
5a8447e97e
|
Use ctest.h for unit test. Enable unit test on travis CI.
|
2016-01-29 11:35:31 +08:00 |
Zhang Xianyi
|
be95bdaf47
|
Detect ARMV8 on 32-bit mode by using ARMV7 kernels.
|
2016-01-28 17:30:26 +00:00 |
Zhang Xianyi
|
c44ff4d648
|
Refs #714. Avoid compiler warnings.
|
2016-01-28 04:38:07 +08:00 |
Zhang Xianyi
|
e003a1294c
|
Merge pull request #764 from martin-frbg/develop
Update Makefile.system to fix awk/nawk issue #763
|
2016-01-26 14:03:27 -06:00 |
Martin Kroeker
|
44062517eb
|
Update Makefile.system
Define AWK as "nawk" for SunOS (actually Illumos) only - fixes #763
|
2016-01-26 20:35:25 +01:00 |
Zhang Xianyi
|
13f0f8c10e
|
Refs #723. Avoid going out of bounds in getf2.
|
2016-01-26 09:14:57 -06:00 |
Zhang Xianyi
|
f5df444ceb
|
Merge pull request #762 from jeromerobert/bug760
Let openblas_get_num_threads return the number of active threads
|
2016-01-26 08:45:16 -06:00 |
Zhang Xianyi
|
e382713423
|
Merge pull request #759 from jeromerobert/bug742
Bug742
|
2016-01-26 08:43:32 -06:00 |
Zhang Xianyi
|
aaa8551c57
|
Merge pull request #749 from lotheac/illumos_fixes
illumos fixes
|
2016-01-26 08:42:20 -06:00 |
Jerome Robert
|
0d87c1ffb6
|
Let openblas_get_num_threads return the number of active threads
... not the number of allocated threads.
Close #760
|
2016-01-26 13:04:16 +01:00 |
wernsaar
|
0b194426f8
|
Merge pull request #761 from wernsaar/develop
Ref #740: all assembly code now clears floating-point registers correctly
|
2016-01-26 09:19:14 +01:00 |