Martin Kroeker
3d1e36d4cb
Build CBLAS interfaces for I?MIN and I?MAX
2019-03-30 12:38:41 +01:00
Martin Kroeker
4f9d3e4b28
Expose CBLAS interfaces for I?MIN and I?MAX
2019-03-30 12:37:13 +01:00
Martin Kroeker
69edc5bbe7
Restore dropped patches in the non-TLS branch of memory.c (#2004)
...
* Restore dropped patches in the non-TLS branch of memory.c
As discovered in #2002, the reintroduction of the "original" non-TLS version of memory.c as an alternate branch had inadvertently used ba1f91f rather than a8002e2, thereby dropping the commits for #1450, #1468, #1501, #1504 and #1520.
2019-02-07 20:06:13 +01:00
Martin Kroeker
641767f846
Merge pull request #2001 from martin-frbg/cmake-dynlist
...
Support DYNAMIC_LIST option in cmake
2019-02-06 08:39:24 +01:00
Martin Kroeker
af6e2253a2
Merge pull request #2000 from martin-frbg/issue1989
...
Make c_check robust against old or incomplete perl installations
2019-02-06 00:29:30 +01:00
Martin Kroeker
5952e586ce
Support DYNAMIC_LIST option in cmake
...
e.g. cmake -DDYNAMIC_ARCH=1 -DDYNAMIC_LIST="NEHALEM;HASWELL;ZEN" ..
original issue was #1639
2019-02-05 23:51:40 +01:00
Martin Kroeker
f10408aae8
Merge pull request #1999 from martin-frbg/issue1996-2
...
fix second instance of complex.h for c++ as well
2019-02-05 22:02:11 +01:00
Martin Kroeker
d70ae3ab43
Make c_check robust against old or incomplete perl installations
...
by catching and working around failures to load modules, and avoiding object-oriented syntax in tempfile creation.
Fixes #1989
2019-02-05 20:06:34 +01:00
Martin Kroeker
1391fc46d2
fix second instance of complex.h for c++ as well
2019-02-05 19:29:33 +01:00
Martin Kroeker
817fe9865c
Merge pull request #1998 from martin-frbg/issue1992
...
Include complex rather than complex.h in C++ contexts
2019-02-05 17:39:59 +01:00
Martin Kroeker
f4b82d7bc4
Include complex rather than complex.h in C++ contexts
...
to avoid name clashes e.g. with boost headers that use I as a generic placeholder.
Fixes #1992 as suggested by aprokop in that issue ticket.
2019-02-05 13:30:13 +01:00
Martin Kroeker
729e925174
Merge pull request #1996 from quickwritereader/develop
...
NBMAX=4096 for gemvn, added sgemvn 8x8 for future
2019-02-04 16:52:04 +01:00
Ubuntu
498ac98581
Note for unused kernels
2019-02-04 15:41:56 +00:00
Ubuntu
cd9ea45463
NBMAX=4096 for gemvn, added sgemvn 8x8 for future
2019-02-04 06:57:11 +00:00
Martin Kroeker
f9c5023e04
Merge pull request #1994 from quickwritereader/develop
...
sgemv cgemv pairs
2019-02-01 21:04:47 +01:00
Ubuntu
4abc375a91
sgemv cgemv pairs
2019-02-01 13:45:00 +00:00
Martin Kroeker
874df65491
Fix incorrect sgemv results for IBM z14
...
part of PR #1993 that was inadvertently misplaced into the toplevel directory
2019-02-01 12:58:59 +01:00
Martin Kroeker
1f4b61f572
Delete misplaced file sgemv_t_4.c
...
from #1993; the file should have gone into kernel/zarch
2019-02-01 12:57:01 +01:00
Martin Kroeker
282230c303
Merge pull request #1993 from martin-frbg/aarnes-zarch
...
Various fixes for the new Z14 target
2019-01-31 21:27:00 +01:00
Martin Kroeker
cce574c3e0
Improve the z14 SGEMVT kernel
...
from patch provided by aarnez in #991
2019-01-31 21:24:55 +01:00
Martin Kroeker
877023e1e1
Fix precision of zarch DSDOT
...
from patch provided by aarnez in #991
2019-01-31 21:22:26 +01:00
Martin Kroeker
265142edd5
Fix typo in the zarch min/max kernels
...
from patch provided by aarnez in #991
2019-01-31 21:21:40 +01:00
Martin Kroeker
885a3c4350
USE_TRMM on Z14
...
from patch provided by aarnez in #991
2019-01-31 21:18:09 +01:00
Martin Kroeker
4b512f84dd
Add cache sizes for Z14
...
from patch provided by aarnez in #991
2019-01-31 21:16:44 +01:00
Martin Kroeker
72d3e7c9b4
Add FORCE Z14
...
from patch provided by aarnez in #991
2019-01-31 21:15:50 +01:00
Martin Kroeker
bdc73a49e0
Add parameters for Z14
...
from patch provided by aarnez in #991
2019-01-31 21:14:37 +01:00
Martin Kroeker
1249ee1fd0
Add Z14 target
...
from patch provided by aarnez in #991
2019-01-31 21:13:46 +01:00
Martin Kroeker
42df9efa0c
Merge pull request #1991 from maamountki/z14
...
[ZARCH] Z14 Support, BLAS 1/2 single precision implementations
2019-01-31 19:10:03 +01:00
maamountki
82124729af
Merge branch 'develop' into z14
2019-01-31 19:36:41 +02:00
maamountki
29416cb5a3
[ZARCH] Add Z13 version for max/min functions
2019-01-31 19:11:11 +02:00
maamountki
48b9b94f7f
[ZARCH] Improve loading performance for camax/icamax
2019-01-31 18:52:11 +02:00
Martin Kroeker
86a824c97f
Fix wrong comparison that made IMIN identical to IMAX
...
as reported by aarnez in #1990
2019-01-31 15:27:21 +01:00
Martin Kroeker
808410c2c7
Fix wrong comparison that made IMIN identical to IMAX
...
as suggested in #1990
2019-01-31 15:25:15 +01:00
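The two commits above describe a comparison error that made IMIN return the same result as IMAX. As a rough illustration of why the direction of a single comparison matters, here is a minimal sketch in Python. It is not the OpenBLAS kernel code: the function names `index_of_max` and `index_of_min` are hypothetical, and the 0-based return value is an assumption made for the sketch.

```python
def index_of_max(x):
    """Return the 0-based index of the first largest element, or -1 if x is empty."""
    if not x:
        return -1
    best = 0
    for i in range(1, len(x)):
        # Strict '>' keeps the first occurrence when values tie.
        if x[i] > x[best]:
            best = i
    return best


def index_of_min(x):
    """Return the 0-based index of the first smallest element, or -1 if x is empty."""
    if not x:
        return -1
    best = 0
    for i in range(1, len(x)):
        # The only difference from index_of_max is the direction of this test.
        # Using '>' here by mistake would make the two functions identical.
        if x[i] < x[best]:
            best = i
    return best


if __name__ == "__main__":
    data = [3.0, -7.0, 5.0, 1.0]
    print(index_of_max(data), index_of_min(data))
```

A test that feeds both functions the same non-constant vector and checks that the results differ would catch this class of bug.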
maamountki
eaf20f0e7a
Remove ztest
2019-01-31 09:26:50 +02:00
maamountki
fcd814a8d2
[ZARCH] Fix bug in max/min functions
2019-01-29 17:59:38 +02:00
maamountki
dc4d3bccd5
[ZARCH] Fix icamax/icamin
2019-01-29 03:47:49 +02:00
maamountki
c7143c1019
[ZARCH] Fix iamax/imax single precision
2019-01-28 17:52:23 +02:00
maamountki
04873bb174
[ZARCH] Undo the last commit
2019-01-28 17:32:24 +02:00
maamountki
c8ef9fb220
[ZARCH] Fix bug in iamax/iamin/imax/imin
2019-01-28 17:16:18 +02:00
Martin Kroeker
5be61f4b47
Merge pull request #1985 from martin-frbg/issue1984
...
Correct naming of getrf_parallel object
2019-01-28 15:44:57 +01:00
Martin Kroeker
3d155cff83
Merge pull request #1981 from edisongustavo/develop
...
Fix include directory of exported targets
2019-01-28 15:44:42 +01:00
Martin Kroeker
7d47f0a82d
Merge pull request #1978 from danielgindi/feature/msvc_cmake
...
Better support for MSVC/Windows in CMake (v0.3.x)
2019-01-28 15:43:35 +01:00
Martin Kroeker
a529c71a74
Merge pull request #1962 from brada4/r
...
Modernize R benchmarks slightly
2019-01-28 15:42:57 +01:00
Martin Kroeker
89b60dab8a
Merge pull request #1987 from martin-frbg/issue1961
...
Change ARMV8 target with BINARY=32 to ARMV7 automatically
2019-01-26 22:25:29 +01:00
Martin Kroeker
58dd7e4501
Change ARMV8 target to ARMV7 for BINARY=32
2019-01-26 17:52:33 +01:00
Martin Kroeker
36b844af88
Change ARMV8 target to ARMV7 when BINARY32 is set
...
fixes #1961
2019-01-26 17:47:22 +01:00
Martin Kroeker
e882b239aa
Correct naming of getrf_parallel object
...
fixes #1984
2019-01-26 00:45:45 +01:00
Martin Kroeker
3f7bb87a2a
Merge pull request #1971 from martin-frbg/trsm-threshold
...
Shift transition to multithreading towards larger matrix sizes
2019-01-24 09:17:48 +01:00
Edison Gustavo Muenz
e908ac2a51
Fix include directory of exported targets
2019-01-23 15:09:13 +01:00
Martin Kroeker
8533aca964
Avoid penalizing tall skinny matrices
2019-01-23 10:03:00 +01:00