Zhang Xianyi
fb8968fb83
Refs #707 . Bugfix for previous commit.
2016-02-11 05:14:53 +08:00
Zhang Xianyi
dae6b82a71
Refs #707 . Add BUILD_LAPACK_DEPRECATED flag in Makefile.rule.
...
To build the LAPACK functions deprecated since LAPACK 3.6.0, run:
make BUILD_LAPACK_DEPRECATED=1
2016-02-11 04:22:53 +08:00
Zhang Xianyi
d73244b825
Refs #727 . Align stack buffer address to 32 bytes.
2016-02-11 03:52:02 +08:00
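As an illustration of what aligning a buffer address to 32 bytes involves, here is a minimal sketch. It is not the code from this commit: the names `STACK_ALIGN` and `align_up_32` are hypothetical, and the sketch assumes the caller over-allocates the raw buffer by `STACK_ALIGN - 1` bytes so that the rounded-up address still lies inside it.

```c
#include <stdint.h>

#define STACK_ALIGN 32 /* alignment in bytes, taken from the commit subject */

/* Hypothetical helper: round a pointer up to the next multiple of
   STACK_ALIGN. STACK_ALIGN must be a power of two for the mask to work. */
static void *align_up_32(void *p)
{
    uintptr_t addr = (uintptr_t)p;
    addr = (addr + (STACK_ALIGN - 1)) & ~(uintptr_t)(STACK_ALIGN - 1);
    return (void *)addr;
}
```

A caller would declare something like `unsigned char raw[SIZE + STACK_ALIGN - 1]` and pass `raw` to the helper; the returned address is at most 31 bytes past the start of `raw`.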
Zhang Xianyi
233c6b959f
Merge pull request #780 from jeromerobert/bug727
...
Bug727
2016-02-08 13:24:40 -05:00
Jerome Robert
16ec5323c9
Fix zgemv.c compilation when stack allocation is disabled
2016-02-08 12:05:02 +01:00
Jerome Robert
0ad02ef2d6
update CONTRIBUTORS.md
2016-02-08 11:26:51 +01:00
Jerome Robert
73397faf68
Add benchmark/smallscaling.c
...
* Bench small matrices with multi-threading
* Close #727
2016-02-08 11:25:27 +01:00
Jerome Robert
5fc2203d8a
zgemv: Add a workaround for #746
2016-02-08 11:25:15 +01:00
Jerome Robert
78dcf5c3d5
Improve performance of ztrmv on small matrices
...
* Use stack allocation
* Disable multi-threading
* Ref #727
2016-02-08 11:25:02 +01:00
Jerome Robert
32f793195f
Use stack allocation in zgemv and zger
...
For better performance with small matrices
Ref #727
2016-02-08 11:24:21 +01:00
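The general idea behind using stack allocation for small problems can be sketched as follows. This is an illustrative pattern only, not the zgemv/zger code: `MAX_STACK_ELEMS` and `scaled_copy` are hypothetical names, and the threshold of 256 elements is an arbitrary assumption. Small inputs use a fixed-size stack buffer and skip the allocator entirely; larger ones fall back to the heap.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_STACK_ELEMS 256 /* hypothetical cut-off, chosen for illustration */

/* Hypothetical routine: compute y = alpha * x through a scratch buffer.
   Returns 0 on success, -1 if the heap fallback fails. */
static int scaled_copy(size_t n, double alpha, const double *x, double *y)
{
    double stack_buf[MAX_STACK_ELEMS];
    double *buf = stack_buf;

    if (n > MAX_STACK_ELEMS) {
        /* Too large for the stack buffer: allocate on the heap instead. */
        buf = malloc(n * sizeof *buf);
        if (buf == NULL)
            return -1;
    }

    for (size_t i = 0; i < n; i++)
        buf[i] = alpha * x[i];
    memcpy(y, buf, n * sizeof *buf);

    if (buf != stack_buf)
        free(buf);
    return 0;
}
```

For small `n` the cost of the allocation can be comparable to the arithmetic itself, which is why avoiding it helps mainly with small matrices.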
Zhang Xianyi
be4e5fcd20
Fixed #778 . Merge branch 'buffer51-develop' into develop
2016-02-05 08:39:08 +08:00
buffer51
855e0cb700
Restored LAPACK_COMPLEX_STRUCTURE for Android prior to 21. Refs #682 .
2016-02-04 17:20:07 -05:00
buffer51
7f7d04dcd2
Fixed linking error when compiling ARMv7 for Android (disabled -lpthread and added -Wl,--no-warn-mismatch).
2016-02-04 17:05:31 -05:00
buffer51
4e1b521e27
Fix lapack complex implementation of lauu2 and potf2 for Android (use FLOAT instead of FLOAT[2] as imaginary part is not used).
2016-02-04 16:59:56 -05:00
Zhang Xianyi
a1a96589aa
Fixed #773 blas_quickdivide bug on CMake and Visual Studio x86 32-bit.
2016-02-04 15:23:32 -05:00
Zhang Xianyi
0e68beb89f
Fixed #711 , #698 . Merge branch 'byzhang-develop' into develop
2016-02-03 02:56:27 +08:00
Zhang Xianyi
926ba8b7ca
Merge branch 'develop' of https://github.com/byzhang/OpenBLAS into byzhang-develop
2016-02-03 02:48:32 +08:00
Zhang Xianyi
9f080c47e1
Merge pull request #743 from tkelman/patch-1
...
Re-enable Fortran optimization flag on Windows
2016-02-02 13:46:12 -05:00
Zhang Xianyi
52eba814ce
Fixed #769 . Merge branch 'martin-frbg-develop' into develop
2016-02-02 13:43:51 -05:00
Martin Kroeker
935356c34f
Update dynamic.c and cpuid_x86.c for Intel Avoton.
...
Second part of "support Intel Avoton via Nehalem kernel"
2016-02-02 13:42:55 -05:00
Zhang Xianyi
ff9388d625
Refs #768 . Swap the result of zdot x87 fp kernel.
2016-02-02 13:38:01 -05:00
Martin Kroeker
4f05c23673
Update cpuid_x86.c
...
Add recognition of Intel Atom C27xx (Avoton, model code 4D)
2016-02-02 13:38:01 -05:00
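For readers unfamiliar with how a model code such as 4D is obtained, the sketch below decodes the family and model fields from the EAX value returned by CPUID leaf 1, following the standard x86 encoding (extended model in bits 16-19, model in bits 4-7). It is not the code from cpuid_x86.c: the function names are hypothetical, and the EAX values used to exercise it would be synthetic rather than read from real hardware.

```c
#include <stdint.h>

/* Display family: base family (bits 8-11), plus the extended family
   (bits 20-27) when the base family is 0xF. */
static unsigned cpuid_family(uint32_t eax)
{
    unsigned family = (eax >> 8) & 0xf;
    if (family == 0xf)
        family += (eax >> 20) & 0xff;
    return family;
}

/* Display model: base model (bits 4-7), with the extended model
   (bits 16-19) as the high nibble for base families 6 and 0xF. */
static unsigned cpuid_model(uint32_t eax)
{
    unsigned family = (eax >> 8) & 0xf;
    unsigned model = (eax >> 4) & 0xf;
    if (family == 0x6 || family == 0xf)
        model |= ((eax >> 16) & 0xf) << 4;
    return model;
}

/* Hypothetical check for the model code named in the commit message. */
static int is_model_4d(uint32_t eax)
{
    return cpuid_family(eax) == 6 && cpuid_model(eax) == 0x4d;
}
```

With a synthetic signature of `0x000406D0` (family 6, extended model 4, model D, stepping 0), `cpuid_model` yields `0x4d`.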
Benyu Zhang
4a1263f609
Fix the source paths
2016-02-01 18:32:42 -08:00
Zhang Xianyi
962376664d
Refs #768 . Swap the result of zdot x87 fp kernel.
2016-02-02 09:15:02 +08:00
Tony Kelman
5fef0d1b75
Re-enable Fortran optimization flag on Windows
...
partial revert of 299cdcdc29
from #696 , was not explained why that was needed
2016-01-30 01:13:51 -08:00
Zhang Xianyi
578f471808
Fix utest bug when INTERFACE64=1.
2016-01-28 22:18:38 -06:00
Zhang Xianyi
5a8447e97e
Use ctest.h for unit test. Enable unit test on travis CI.
2016-01-29 11:35:31 +08:00
Zhang Xianyi
be95bdaf47
Detect ARMV8 on 32-bit mode by using ARMV7 kernels.
2016-01-28 17:30:26 +00:00
Zhang Xianyi
c44ff4d648
Refs #714 . Avoid compiler warnings.
2016-01-28 04:38:07 +08:00
Zhang Xianyi
e003a1294c
Merge pull request #764 from martin-frbg/develop
...
Update Makefile.system to fix awk/nawk issue #763
2016-01-26 14:03:27 -06:00
Martin Kroeker
44062517eb
Update Makefile.system
...
Define AWK as "nawk" for SunOS (actually Illumos) only - fixes #763
2016-01-26 20:35:25 +01:00
Zhang Xianyi
13f0f8c10e
Refs #723 . Avoid out-of-bounds access in getf2.
2016-01-26 09:14:57 -06:00
Zhang Xianyi
f5df444ceb
Merge pull request #762 from jeromerobert/bug760
...
Let openblas_get_num_threads return the number of active threads
2016-01-26 08:45:16 -06:00
Zhang Xianyi
e382713423
Merge pull request #759 from jeromerobert/bug742
...
Bug742
2016-01-26 08:43:32 -06:00
Zhang Xianyi
aaa8551c57
Merge pull request #749 from lotheac/illumos_fixes
...
illumos fixes
2016-01-26 08:42:20 -06:00
Jerome Robert
0d87c1ffb6
Let openblas_get_num_threads return the number of active threads
...
... not the number of allocated threads.
Close #760
2016-01-26 13:04:16 +01:00
wernsaar
0b194426f8
Merge pull request #761 from wernsaar/develop
...
Ref #740 : all assembly code now clears floating-point registers correctly
2016-01-26 09:19:14 +01:00
Werner Saar
63a7d7fb24
updated gemv_n_vfpv3.S for armv7
2016-01-25 15:00:13 +01:00
Werner Saar
b4ede558a5
updated nrm2 kernel for armv7
2016-01-25 11:55:25 +01:00
Werner Saar
de3e2d4349
updated trmm kernels for armv7
2016-01-25 11:08:56 +01:00
Werner Saar
a0e51e96f1
updated gemm kernels for armv7
2016-01-25 10:46:10 +01:00
Lauri Tirkkonen
d6afac9624
don't pass -Y at all to the linker on illumos
...
the illumos linker can't understand the "-Y/lib"... form that f_check
generates, and -Wl cannot pass options that include commas
2016-01-25 11:09:34 +02:00
Werner Saar
c2891330bc
updated KERNEL.ARMV6
2016-01-24 17:12:07 +01:00
Werner Saar
ceaa931e48
updated gemv kernel for armv6
2016-01-24 16:31:19 +01:00
Werner Saar
eaa63165df
updated cgemv and zgemv kernels for armv6
2016-01-24 14:42:38 +01:00
Werner Saar
c65357c566
updated trmm_kernels for armv6
2016-01-24 13:03:33 +01:00
Werner Saar
e63e9f9f26
updated gemm_kernels for armv6
2016-01-24 11:55:50 +01:00
Jerome Robert
1fe3aab047
Use GEMM_MULTITHREAD_THRESHOLD as a number of ops
...
...not a matrix size. For GEMM_MULTITHREAD_THRESHOLD=4
(the default value) this does not change anything, but
for other values it makes the GEMM and GEMV thresholds
change in the same way.
Close #742
2016-01-24 11:31:40 +01:00
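The difference between a threshold expressed as a matrix size and one expressed as a number of operations can be sketched as below. This is an assumption-laden illustration rather than the code from this commit: `OPS_PER_THRESHOLD_UNIT` and both function names are hypothetical, and the scale factor of 65536 is chosen only for the example. The point is that both routines compare an operation count against the same scaled threshold, so changing `GEMM_MULTITHREAD_THRESHOLD` shifts both cut-offs together.

```c
#define GEMM_MULTITHREAD_THRESHOLD 4   /* default value per the commit message */
#define OPS_PER_THRESHOLD_UNIT 65536.0 /* hypothetical scale factor */

/* GEMM performs on the order of m*n*k multiply-adds. */
static int gemm_use_threads(long m, long n, long k)
{
    double ops = (double)m * (double)n * (double)k;
    return ops > OPS_PER_THRESHOLD_UNIT * GEMM_MULTITHREAD_THRESHOLD;
}

/* GEMV performs on the order of m*n multiply-adds. */
static int gemv_use_threads(long m, long n)
{
    double ops = (double)m * (double)n;
    return ops > OPS_PER_THRESHOLD_UNIT * GEMM_MULTITHREAD_THRESHOLD;
}
```

Under these assumed constants a 10x10x10 GEMM stays single-threaded while a 100x100x100 GEMM does not, and a GEMV needs roughly a 512x512 matrix before threads are used.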
Werner Saar
aafd3ab60e
updated cdot and zdot on arm
2016-01-24 10:56:49 +01:00
Jerome Robert
1a1935507b
[z]ger: increase multithread threshold
...
The ones given in 3ae30cd
were by far too low because I
mixed up m and m*n in my measurements. Note that the new ones
are close to the [z]gemv ones, which is reassuring
and suggests that both are right.
2016-01-24 10:46:35 +01:00