Martin Kroeker
84c00c3c6e
Support running just the GEMV version of the thread safety test
2020-09-17 13:46:41 +02:00
Martin Kroeker
8c5c991bd7
Add cpp_thread_test options
2020-09-17 13:45:40 +02:00
Martin Kroeker
2e3b15d68b
Add CMakeLists.txt
2020-09-17 13:43:55 +02:00
Martin Kroeker
eaf7f825bd
Merge pull request #85 from xianyi/develop
...
rebase
2020-09-17 13:42:47 +02:00
Martin Kroeker
4c10a1673d
Merge pull request #2840 from martin-frbg/fixup2833
...
Fix for cmake BUILD_ settings PR 2833
2020-09-16 18:55:50 +02:00
Martin Kroeker
c4aeeeb9f4
Activate all BUILD_ options if none was specified
2020-09-15 23:15:34 +02:00
Martin Kroeker
3843bd188c
Merge pull request #84 from xianyi/develop
...
rebase
2020-09-15 23:13:30 +02:00
Martin Kroeker
ddec244a5a
Merge pull request #2838 from austinpagan/gordon_trmm
...
Adding performance patch for trmm, just like trsm (#2836)
2020-09-15 21:17:48 +02:00
fossum
dfeca46098
Adding performance patch for trmm, just like #2836
2020-09-15 08:59:50 -05:00
Martin Kroeker
f8950f40a2
Merge pull request #2836 from austinpagan/gordon_trsm
...
Fixing a performance bug in trsm_[LR].c.
2020-09-15 11:26:37 +02:00
fossum
274d6e015b
Fixing a performance bug in trsm_[LR].c.
2020-09-14 13:10:48 -05:00
Martin Kroeker
91c84e1c01
Merge pull request #2796 from Guobing-Chen/BF16_dot_coversion_apis
...
Add bfloat16 based dot and conversion with single/double
2020-09-14 15:00:19 +02:00
Martin Kroeker
1ee1e7b495
Merge pull request #2833 from martin-frbg/issue2830
...
Make building the tests for individual data types conditional on the respective BUILD option
2020-09-14 07:24:23 +02:00
Martin Kroeker
ba644378dc
Copy BUILD_ options available to the compiler flags
2020-09-14 00:03:33 +02:00
Martin Kroeker
9e11c2d62f
Add BUILD_SINGLE etc
2020-09-13 23:55:11 +02:00
Martin Kroeker
4d250d0cdf
Rearrange ifdefs
2020-09-13 23:29:01 +02:00
Martin Kroeker
de139337b8
Remove spurious tests for complex ASUM and NRM2
2020-09-13 22:20:41 +02:00
Martin Kroeker
ec2948f147
Make tests conditional on BUILD_DOUBLE
2020-09-13 22:17:46 +02:00
Martin Kroeker
ce89398636
Make tests for individual variable types conditional on the respective BUILD_ option
2020-09-13 21:52:18 +02:00
Martin Kroeker
593ce9e237
Make building individual tests depend on BUILD_SINGLE etc defines
2020-09-13 21:50:12 +02:00
Martin Kroeker
74e358bcd5
Remove spurious complex16 tests
2020-09-13 21:49:01 +02:00
Martin Kroeker
26792d2096
Copy BUILD_* directives to the compiler options to allow ifdef in tests
2020-09-13 21:47:55 +02:00
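The effect of passing the BUILD_* settings to the compiler can be sketched in C as follows. This is only an illustration, not the actual OpenBLAS test code: `test_single`, `test_double` and `run_enabled_tests` are hypothetical names, and the sketch assumes the build system adds flags such as `-DBUILD_SINGLE` or `-DBUILD_DOUBLE` for each enabled data type.

```c
#include <stddef.h>

/* Hypothetical per-type test groups. Each one is only compiled when the
   matching BUILD_ macro was passed on the compiler command line. */
#ifdef BUILD_SINGLE
static int test_single(void) {
    float x = 1.5f + 2.5f;
    return (x == 4.0f) ? 0 : 1;   /* 0 means success */
}
#endif

#ifdef BUILD_DOUBLE
static int test_double(void) {
    double x = 1.5 + 2.5;
    return (x == 4.0) ? 0 : 1;    /* 0 means success */
}
#endif

/* Runs every test group that was compiled in.
   Returns the number of groups run and stores the failure count. */
int run_enabled_tests(int *failures) {
    int ran = 0;
    *failures = 0;
#ifdef BUILD_SINGLE
    *failures += test_single();
    ran++;
#endif
#ifdef BUILD_DOUBLE
    *failures += test_double();
    ran++;
#endif
    return ran;
}
```

Compiled without any BUILD_ macro, this sketch runs no test groups at all, which is why a build that specifies none of the options needs a sensible default.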
Martin Kroeker
6b52c7e172
Merge pull request #2832 from martin-frbg/issue2831
...
Fix gfortran detection by vendor matching
2020-09-13 21:20:30 +02:00
Martin Kroeker
746ad3bd19
Fix vendor match for GCC gfortran
2020-09-13 18:40:59 +02:00
Martin Kroeker
55d4d470ec
Merge pull request #83 from xianyi/develop
...
rebase
2020-09-13 18:30:11 +02:00
Martin Kroeker
a270894730
Merge pull request #2829 from mhillenibm/clang_s390x
...
Fix DYNAMIC_ARCH=1 with clang s390x
2020-09-08 23:36:41 +02:00
Marius Hillenbrand
047b8d7aff
Add an s390 build with clang to the Travis configuration
...
Since clang builds have been fixed on s390x, including support for
DYNAMIC_ARCH, cover that build type in Travis.
Explicitly request Ubuntu 20.04 (codename focal) to get a recent
LLVM/clang version 10.x and thereby cover all s390x architecture
generations supported in OpenBLAS. Ubuntu 18.10's LLVM/clang 6.x cannot
build the inline assembly in some of the Z13 and Z14 kernels.
LLVM/clang currently does not support OpenMP on s390x, so disable that
in the build.
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-09-08 20:59:06 +02:00
Marius Hillenbrand
f7731a358a
Update CONTRIBUTORS.md - clang build fixes for IBM z
...
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-09-08 19:34:18 +02:00
Marius Hillenbrand
a55fe06f25
s390x/DYNAMIC_ARCH: define a HW_CAP flag to support slightly older glibc versions
...
Enable building DYNAMIC_ARCH support with older versions of glibc that
do not know about the hwcap flag HWCAP_S390_VXE yet.
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-09-08 19:34:18 +02:00
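A fallback definition of this kind might look like the sketch below. It is not taken from the commit: the value 8192 is an assumption based on the Linux s390 hwcap bit layout and should be checked against the kernel headers, and `hwcap_has_vxe` is a hypothetical helper name.

```c
#include <stdbool.h>

/* Fallback for C libraries whose headers do not define this flag yet.
   ASSUMPTION: 8192 matches the kernel's bit for the vector-enhancements
   facility; verify against the kernel headers before relying on it. */
#ifndef HWCAP_S390_VXE
#define HWCAP_S390_VXE 8192
#endif

/* Hypothetical helper: reports whether the given hwcap word has the
   vector-enhancements bit set. */
bool hwcap_has_vxe(unsigned long hwcap) {
    return (hwcap & HWCAP_S390_VXE) != 0;
}
```

On Linux the hwcap word would typically come from `getauxval(AT_HWCAP)` in `<sys/auxv.h>`; the helper takes it as a parameter so that the sketch compiles on any platform.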
Marius Hillenbrand
4f34bcfb5e
s390x/DYNAMIC_ARCH: pass supported arch levels from Makefile to run-time code
...
... instead of duplicating the (old) mechanism from the Makefile that
aimed to derive supported architecture generations from the gcc
version.
To enable builds with DYNAMIC_ARCH with older compiler releases, the
Makefile and drivers/other/dynamic_arch.c need a common view of the
architecture support built into the library.
We follow the notation from x86 when used with DYNAMIC_LIST, where
defines DYN_<ARCH NAME> denote support for a given generation to be
built in. Since there are far fewer architecture generations in OpenBLAS
for s390x, that does not bloat command lines too much.
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-09-08 19:34:18 +02:00
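The shared view described above could be sketched like this in C. The macro names `DYN_Z13` and `DYN_Z14` follow the DYN_<ARCH NAME> pattern from the commit message but are assumptions here, as are the table and the function `arch_is_built_in`; the real run-time code in OpenBLAS may be organized differently.

```c
#include <string.h>

/* Names of the architecture generations compiled into this build.
   The Makefile is assumed to pass -DDYN_Z13 and/or -DDYN_Z14 for each
   generation the compiler can target; GENERIC is always present. */
static const char *const built_in_archs[] = {
    "GENERIC",
#ifdef DYN_Z13
    "Z13",
#endif
#ifdef DYN_Z14
    "Z14",
#endif
};

/* Returns 1 if kernels for the named generation were built in, else 0. */
int arch_is_built_in(const char *name) {
    size_t n = sizeof(built_in_archs) / sizeof(built_in_archs[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(built_in_archs[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}
```

Because the list is driven only by the defines, the run-time code never has to repeat the compiler-version logic that decided which generations to build.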
Marius Hillenbrand
0629d8ebdb
s390x/DYNAMIC_ARCH: generalize detecting supported archs for clang
...
Simplify detection of which kernels we can compile on s390x. Instead of
decoding the gcc version in a complicated manner, just check if CC
supports a given -march=archXY flag. Together with the next patch, we
thereby gain support for builds with LLVM/clang with DYNAMIC_ARCH=1.
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-09-08 19:34:18 +02:00
Martin Kroeker
15da2f9acb
Merge pull request #2828 from martin-frbg/lapack438
...
Correct xLASET arguments in LAPACK EIG tests
2020-09-08 10:25:19 +02:00
Martin Kroeker
7d9c77f421
Correct dimension argument to xLASET
...
from Reference-LAPACK PR 438
2020-09-07 22:03:46 +02:00
Martin Kroeker
c8f029a518
Merge pull request #82 from xianyi/develop
...
rebase
2020-09-07 21:59:13 +02:00
Martin Kroeker
e72430fe46
Merge pull request #2803 from xiegengxin/AVX2-asum
...
Implementation of dasum, sasum with AVX2 & AVX512 intrinsics
2020-09-06 18:32:15 +02:00
Martin Kroeker
6e0f6c5f00
Merge pull request #2824 from martin-frbg/asumbench
...
Use POSIX.1-2001 clock_gettime in asum benchmark if available
2020-09-06 10:05:47 +02:00
Martin Kroeker
6f8fad87c5
Use POSIX.1-2001 clock_gettime for higher resolution
2020-09-05 19:44:01 +02:00
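A minimal sketch of such a timer in C is shown below. It is not the benchmark code itself: `now_seconds` is a hypothetical helper, and the choice of `CLOCK_REALTIME` and of `time()` as the fallback are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200112L
#include <time.h>
#include <unistd.h>   /* defines _POSIX_TIMERS where the timers option exists */

/* Hypothetical helper: wall-clock time in seconds as a double.
   Uses clock_gettime (nanosecond fields) when POSIX timers are available,
   otherwise falls back to time(), which only has one-second resolution. */
double now_seconds(void) {
#if defined(_POSIX_TIMERS) && _POSIX_TIMERS > 0
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
#else
    return (double)time(NULL);
#endif
}
```

A benchmark would call `now_seconds()` before and after the measured loop and subtract the two values. Older C libraries may need `-lrt` at link time for `clock_gettime`.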
Martin Kroeker
ed0f2d3dd7
Merge pull request #2816 from martin-frbg/silicon
...
Add basic support for Apple Vortex (ARM64) cpu
2020-09-05 19:17:59 +02:00
Martin Kroeker
43a31b7786
Merge pull request #2823 from martin-frbg/fix2778
...
Improve fix for lapack-test EIG/cchkhb2stg from PR 2778
2020-09-05 17:29:38 +02:00
Martin Kroeker
8a2a137a9e
Correct argument to SLASET (Improves fix from PR2778)
...
As explained by serguei-patchkovskii in a comment on Reference-LAPACK/lapack#438, passing in an index of 1 instead of N leads to a standards violation when accessing matrix A in SLASET, i.e. undefined behavior.
2020-09-05 13:06:31 +02:00
Martin Kroeker
0d1f30a297
Merge pull request #81 from xianyi/develop
...
rebase
2020-09-05 12:47:03 +02:00
Martin Kroeker
70a254d507
Merge pull request #2822 from martin-frbg/issue2821
...
Fix potential domain error in sqrt
2020-09-05 12:39:32 +02:00
Martin Kroeker
330044d821
Fix potential domain error in sqrt
2020-09-05 09:44:33 +02:00
Martin Kroeker
97636b2c8a
Merge pull request #2819 from h-vetinari/carry_lapack_437
...
Carry lapack#437
2020-09-04 23:50:43 +02:00
Martin Kroeker
4d36711547
Merge pull request #2820 from RajalakshmiSR/clang
...
POWER9: Fix mcpu option with clang
2020-09-04 23:09:31 +02:00
Rajalakshmi Srinivasaraghavan
718f67421a
POWER9: Fix mcpu option with clang
...
Adding check for compiler type before checking GCC version in Makefile.
This allows clang to use power9 instead of power8 when CORE is POWER9.
2020-09-04 10:36:19 -05:00
H. Vetinari
3426519ae2
adapt ?ggsv?-functions to ambient code style in LAPACKE/include/lapack.h
2020-09-04 17:33:24 +02:00
H. Vetinari
1c6c71fa85
Follow-up to lapack#434 & lapack#409: add missing 'const' in signatures
...
Based on how the surrounding functions in lapack.h are handling the
parameters, particularly the ?ggsv?3-variants of the affected functions
2020-09-04 17:33:11 +02:00
H. Vetinari
860247b5da
Follow-up to lapack#434 & lapack#409: fix signature mismatches
2020-09-04 17:32:53 +02:00
Martin Kroeker
c61771e335
Merge pull request #2778 from martin-frbg/lapackeig
...
Fix various wrong calls to SLASET/DLASET in the EIG part of the LAPACK testsuite
2020-09-04 10:06:02 +02:00