Commit Graph

  • 343b301d14 Reduce list of kernels in the dynamic arch build Martin Kroeker 2019-02-20 10:27:48 +01:00
  • 45333d5793 Fix error introduced during cleanup Martin Kroeker 2019-02-19 22:16:33 +01:00
  • e29b0cfcc4 Allow multithreading TRMV again Martin Kroeker 2019-02-19 21:03:30 +01:00
  • 78d9910236 Correct range_n limiting Martin Kroeker 2019-02-19 20:59:48 +01:00
  • e12cdf58ef Merge pull request #2024 from martin-frbg/gcc9fixes4 Martin Kroeker 2019-02-17 11:49:15 +01:00
  • 1860c9456d Merge pull request #2023 from martin-frbg/gcc9fixes3 Martin Kroeker 2019-02-17 11:48:57 +01:00
  • aec905498f Merge pull request #1988 from TiborGY/patch-1 Martin Kroeker 2019-02-17 11:36:04 +01:00
  • 56089991e2 fix the the TiborGY 2019-02-16 23:26:13 +01:00
  • f9bb76d29a Fix inline assembly constraints in Bulldozer TRSM kernels Martin Kroeker 2019-02-16 20:06:48 +01:00
  • 8242b1fe3f Fix inline assembly constraints Martin Kroeker 2019-02-16 18:51:09 +01:00
  • efb9038f72 Fix inline assembly constraints Martin Kroeker 2019-02-16 18:46:17 +01:00
  • e976557d29 Fix inline assembly constraints Martin Kroeker 2019-02-16 18:36:39 +01:00
  • 9d8be15789 Fix inline assembly constraints Martin Kroeker 2019-02-16 18:24:11 +01:00
  • d752799a0f Merge pull request #2021 from martin-frbg/gcc9fixes2 Martin Kroeker 2019-02-16 18:05:40 +01:00
  • f209fc7fa9 Update Makefile.rule TiborGY 2019-02-16 12:12:39 +01:00
  • c26c0b77a7 Fix wrong constraints in inline assembly Martin Kroeker 2019-02-15 15:08:16 +01:00
  • 1c6da2d03c Merge pull request #2019 from martin-frbg/gcc9fixes Martin Kroeker 2019-02-15 15:02:54 +01:00
  • 4255a58cd2 Rename operands to put lda on the input/output constraint list Martin Kroeker 2019-02-15 10:10:04 +01:00
  • d3e4725548 Merge pull request #2020 from martin-frbg/issue1956 Martin Kroeker 2019-02-15 09:57:59 +01:00
  • adb419ed67 With the Intel compiler on Linux, prefer ifort for the final link step Martin Kroeker 2019-02-14 22:57:30 +01:00
  • 46e415b140 Save and restore input argument 8 (lda4) Martin Kroeker 2019-02-14 22:43:18 +01:00
  • cd5a59b9cf Merge pull request #2018 from bartoldeman/fix-dgemv-znver1-tree-vectorize Martin Kroeker 2019-02-14 21:55:11 +01:00
  • 69a97ca7b9 dgemv_kernel_4x4(Haswell): add missing clobbers for xmm0,xmm1,xmm2,xmm3 Bart Oldeman 2019-02-14 16:19:41 +00:00
  • b55c586fac Fix missing clobber in x86/x86_64 blas_quickdivide inline assembly function (#2017) Martin Kroeker 2019-02-14 15:21:36 +01:00
  • 056917d616 Merge pull request #2013 from martin-frbg/issue2011 Martin Kroeker 2019-02-14 09:29:34 +01:00
  • 718efcec6f Fix out-of-bounds memory access in gemm_beta Martin Kroeker 2019-02-13 22:08:37 +01:00
  • f9d67bb5e8 Fix out-of-bounds memory access in gemm_beta Martin Kroeker 2019-02-13 22:06:41 +01:00
  • 76bb74fcd4 Merge pull request #2012 from maamountki/z14 Martin Kroeker 2019-02-13 20:15:56 +01:00
  • 0a54c98b9d [ZARCH] Modify constraints maamountki 2019-02-13 21:06:25 +02:00
  • bec54ae366 [ZARCH] Fix caxpy maamountki 2019-02-13 12:54:35 +02:00
  • 63d7bad8a5 Merge pull request #2010 from martin-frbg/issue2009 Martin Kroeker 2019-02-12 23:24:02 +01:00
  • ab1630f9fa Fix declaration of arguments in inline assembly Martin Kroeker 2019-02-12 16:14:02 +01:00
  • b824fa70eb Fix declaration of assembly arguments in SSYMV and DSYMV microkernels Martin Kroeker 2019-02-12 16:00:18 +01:00
  • 91481a3e4e Fix declaration of input arguments in inline assembly Martin Kroeker 2019-02-12 15:51:43 +01:00
  • dc6ac9eab0 Fix declaration of input arguments in the x86_64 s/dGEMV_T and s/dGEMV_N kernels Martin Kroeker 2019-02-12 15:33:48 +01:00
  • f583674109 [ZARCH] Fix cgemv_t_4 maamountki 2019-02-12 13:12:28 +02:00
  • 77fe70019f [ZARCH] Fix constraints and source code formatting maamountki 2019-02-11 16:01:13 +02:00
  • 03a2bf2602 Fix potential memory leak in cpu enumeration on Linux (#2008) Martin Kroeker 2019-02-10 23:24:45 +01:00
  • 69edc5bbe7 Restore dropped patches in the non-TLS branch of memory.c (#2004) Martin Kroeker 2019-02-07 20:06:13 +01:00
  • 7039770165 [ZARCH] Undo the last commit maamountki 2019-02-06 20:11:44 +02:00
  • 641767f846 Merge pull request #2001 from martin-frbg/cmake-dynlist Martin Kroeker 2019-02-06 08:39:24 +01:00
  • af6e2253a2 Merge pull request #2000 from martin-frbg/issue1989 Martin Kroeker 2019-02-06 00:29:30 +01:00
  • 5952e586ce Support DYNAMIC_LIST option in cmake Martin Kroeker 2019-02-05 23:51:40 +01:00
  • f10408aae8 Merge pull request #1999 from martin-frbg/issue1996-2 Martin Kroeker 2019-02-05 22:02:11 +01:00
  • d70ae3ab43 Make c_check robust against old or incomplete perl installations Martin Kroeker 2019-02-05 20:06:34 +01:00
  • 1391fc46d2 fix second instance of complex.h for c++ as well Martin Kroeker 2019-02-05 19:29:33 +01:00
  • 11a43e8116 [ZARCH] Set alignment hint for vl/vst maamountki 2019-02-05 19:17:08 +02:00
  • 817fe9865c Merge pull request #1998 from martin-frbg/issue1992 Martin Kroeker 2019-02-05 17:39:59 +01:00
  • f4b82d7bc4 Include complex rather than complex.h in C++ contexts Martin Kroeker 2019-02-05 13:30:13 +01:00
  • 61526480f9 [ZARCH] Fix copy constraint maamountki 2019-02-05 07:51:19 +02:00
  • 81daf6bc38 [ZARCH] Format source code, Fix constraints maamountki 2019-02-05 07:30:38 +02:00
  • a38aa56e76 Merge pull request #1 from xianyi/develop maamountki 2019-02-05 07:25:38 +02:00
  • 729e925174 Merge pull request #1996 from quickwritereader/develop Martin Kroeker 2019-02-04 16:52:04 +01:00
  • 498ac98581 Note for unused kernels Ubuntu 2019-02-04 15:41:56 +00:00
  • cd9ea45463 NBMAX=4096 for gemvn, added sgemvn 8x8 for future Ubuntu 2019-02-04 06:57:11 +00:00
  • f9c5023e04 Merge pull request #1994 from quickwritereader/develop Martin Kroeker 2019-02-01 21:04:47 +01:00
  • 4abc375a91 sgemv cgemv pairs Ubuntu 2019-02-01 13:45:00 +00:00
  • 874df65491 Fix incorrect sgemv results for IBM z14 Martin Kroeker 2019-02-01 12:58:59 +01:00
  • 1f4b61f572 Delete misplaced file sgemv_t_4.c Martin Kroeker 2019-02-01 12:57:01 +01:00
  • 282230c303 Merge pull request #1993 from martin-frbg/aarnes-zarch Martin Kroeker 2019-01-31 21:27:00 +01:00
  • cce574c3e0 Improve the z14 SGEMVT kernel Martin Kroeker 2019-01-31 21:24:55 +01:00
  • 877023e1e1 Fix precision of zarch DSDOT Martin Kroeker 2019-01-31 21:22:26 +01:00
  • 265142edd5 Fix typo in the zarch min/max kernels Martin Kroeker 2019-01-31 21:21:40 +01:00
  • 885a3c4350 USE_TRMM on Z14 Martin Kroeker 2019-01-31 21:18:09 +01:00
  • 4b512f84dd Add cache sizes for Z14 Martin Kroeker 2019-01-31 21:16:44 +01:00
  • 72d3e7c9b4 Add FORCE Z14 Martin Kroeker 2019-01-31 21:15:50 +01:00
  • bdc73a49e0 Add parameters for Z14 Martin Kroeker 2019-01-31 21:14:37 +01:00
  • 1249ee1fd0 Add Z14 target Martin Kroeker 2019-01-31 21:13:46 +01:00
  • 42df9efa0c Merge pull request #1991 from maamountki/z14 Martin Kroeker 2019-01-31 19:10:03 +01:00
  • 82124729af Merge branch 'develop' into z14 maamountki 2019-01-31 19:36:41 +02:00
  • 29416cb5a3 [ZARCH] Add Z13 version for max/min functions maamountki 2019-01-31 19:11:11 +02:00
  • 48b9b94f7f [ZARCH] Improve loading performance for camax/icamax maamountki 2019-01-31 18:52:11 +02:00
  • 86a824c97f Fix wrong comparison that made IMIN identical to IMAX Martin Kroeker 2019-01-31 15:27:21 +01:00
  • 808410c2c7 Fix wrong comparison that made IMIN identical to IMAX Martin Kroeker 2019-01-31 15:25:15 +01:00
  • eaf20f0e7a Remove ztest maamountki 2019-01-31 09:26:50 +02:00
  • fcd814a8d2 [ZARCH] Fix bug in max/min functions maamountki 2019-01-29 17:59:38 +02:00
  • dc4d3bccd5 [ZARCH] Fix icamax/icamin maamountki 2019-01-29 03:47:49 +02:00
  • c7143c1019 [ZARCH] Fix iamax/imax single precision maamountki 2019-01-28 17:52:23 +02:00
  • 04873bb174 [ZARCH] Undo the last commit maamountki 2019-01-28 17:32:24 +02:00
  • c8ef9fb220 [ZARCH] Fix bug in iamax/iamin/imax/imin maamountki 2019-01-28 17:16:18 +02:00
  • 5be61f4b47 Merge pull request #1985 from martin-frbg/issue1984 Martin Kroeker 2019-01-28 15:44:57 +01:00
  • 3d155cff83 Merge pull request #1981 from edisongustavo/develop Martin Kroeker 2019-01-28 15:44:42 +01:00
  • 7d47f0a82d Merge pull request #1978 from danielgindi/feature/msvc_cmake Martin Kroeker 2019-01-28 15:43:35 +01:00
  • a529c71a74 Merge pull request #1962 from brada4/r Martin Kroeker 2019-01-28 15:42:57 +01:00
  • ea1716ce2a Update Makefile.rule TiborGY 2019-01-27 17:22:26 +01:00
  • 0f24b39ebf Reword/expand comments in Makefile.rule TiborGY 2019-01-27 15:33:00 +01:00
  • 89b60dab8a Merge pull request #1987 from martin-frbg/issue1961 Martin Kroeker 2019-01-26 22:25:29 +01:00
  • 58dd7e4501 Change ARMV8 target to ARMV7 for BINARY=32 Martin Kroeker 2019-01-26 17:52:33 +01:00
  • 36b844af88 Change ARMV8 target to ARMV7 when BINARY32 is set Martin Kroeker 2019-01-26 17:47:22 +01:00
  • e882b239aa Correct naming of getrf_parallel object Martin Kroeker 2019-01-26 00:45:45 +01:00
  • 3f7bb87a2a Merge pull request #1971 from martin-frbg/trsm-threshold Martin Kroeker 2019-01-24 09:17:48 +01:00
  • e908ac2a51 Fix include directory of exported targets Edison Gustavo Muenz 2019-01-23 15:09:13 +01:00
  • 8533aca964 Avoid penalizing tall skinny matrices Martin Kroeker 2019-01-23 10:03:00 +01:00
  • 16494cb7c4 Merge pull request #1980 from martin-frbg/issue1979 Martin Kroeker 2019-01-22 21:10:38 +01:00
  • b56b34a75c Syntax fix Martin Kroeker 2019-01-22 18:55:43 +01:00
  • 21eda8b577 Report SkylakeX as Haswell if compiler does not support AVX512 Martin Kroeker 2019-01-22 18:47:12 +01:00
  • 24288803b3 Adjust test script for correct deployment Daniel Cohen Gindi 2019-01-22 14:38:01 +02:00
  • f0d834b824 Use VERSION_LESS for comparisons involving software version numbers Martin Kroeker 2019-01-22 12:32:24 +01:00
  • 63bbd7b0d7 Better support for MSVC/Windows in CMake Daniel Cohen Gindi 2019-01-21 08:35:23 +02:00
  • b111829226 [ZARCH] Update max/min functions maamountki 2019-01-21 15:56:04 +02:00