Commit Graph

  • d9ba49165a Improve the performance of rot by using AVX512 and AVX2 intrinsic Gengxin Xie 2020-09-27 10:38:19 +08:00
  • 60ab9c783f Merge pull request #2966 from martin-frbg/issue2964 Martin Kroeker 2020-11-04 16:02:46 +01:00
  • 8cc73fee98 Export NO_EXPRECISION after overriding for DYNAMIC_ARCH with GENERIC target Martin Kroeker 2020-11-03 23:47:04 +01:00
  • 0155cd53a3 Add -msse3 where needed for DYNAMIC_ARCH builds Martin Kroeker 2020-11-03 23:45:49 +01:00
  • a9f9354296 Fix target test Martin Kroeker 2020-11-02 23:17:46 +01:00
  • b9bc76aec4 Add files via upload Martin Kroeker 2020-11-02 22:43:50 +01:00
  • f071245939 Merge pull request #2967 from RajalakshmiSR/dgemm88 Martin Kroeker 2020-11-02 18:54:36 +01:00
  • 60997ddd73 allow setting soname without suffix or prefix Aisha Tammy 2020-11-02 13:04:53 +00:00
  • e5f8c2bf8a typo fix Martin Kroeker 2020-11-01 22:25:43 +01:00
  • 6baf8af658 Disable EXPRECISION for the combination of DYNAMIC_CORE and GENERIC target Martin Kroeker 2020-11-01 22:11:48 +01:00
  • 40a93c232b Disable EXPRECISION for DYNAMIC_ARCH in combination with TARGET=GENERIC Martin Kroeker 2020-11-01 21:58:26 +01:00
  • fab952bee4 Merge pull request #2962 from brada4/develop Martin Kroeker 2020-11-01 14:24:40 +01:00
  • 1cf04a6f0e Merge pull request #2963 from martin-frbg/issue2959 Martin Kroeker 2020-11-01 09:14:54 +01:00
  • dd7a9cc5bf POWER10: Change dgemm unroll factors Rajalakshmi Srinivasaraghavan 2020-10-31 18:28:57 -05:00
  • 7f26be4802 Reunify BUFFERSIZE across arm64 platforms to avoid segfaults in DYNAMIC_ARCH Martin Kroeker 2020-11-01 00:00:43 +01:00
  • 9fab65e90a add openbsd gfortran User User-User 2020-11-01 00:38:08 +02:00
  • 9efc3f0815 Merge pull request #109 from xianyi/develop Martin Kroeker 2020-10-31 22:33:52 +01:00
  • aa21cb5217 Merge pull request #2960 from thrasibule/avx2_detection Martin Kroeker 2020-10-31 20:24:21 +01:00
  • 1f564d729b fix avx2 detection Guillaume Horel 2020-10-31 10:00:48 -04:00
  • 9349dcd206 Merge pull request #2956 from RajalakshmiSR/caxpy_p10 Martin Kroeker 2020-10-30 08:54:10 +01:00
  • b435491885 Optimize caxpy for POWER10 Rajalakshmi Srinivasaraghavan 2020-10-29 14:57:51 -05:00
  • 9a058f2451 Merge pull request #2940 from Qiyu8/optimize-benchmark Martin Kroeker 2020-10-29 20:28:37 +01:00
  • 074927a7d0 Merge pull request #2954 from Guobing-Chen/BF16_gemv_support Martin Kroeker 2020-10-29 09:22:33 +01:00
  • 60b22e3462 Merge pull request #2955 from Guobing-Chen/Fix_cooperlake_build_issue Martin Kroeker 2020-10-29 09:22:07 +01:00
  • c5e62dad69 Fix cooperlake compile issue Chen, Guobing 2020-10-29 03:37:51 +08:00
  • a7b1f9b1bb Implementation of BF16 based gemv Chen, Guobing 2020-10-28 08:49:12 +08:00
  • 67f39ad813 Merge pull request #2939 from thrasibule/Makefile_cleanup Martin Kroeker 2020-10-28 09:38:40 +01:00
  • 6e13a7e99e Merge pull request #2951 from martin-frbg/cleanup_make Martin Kroeker 2020-10-28 09:37:56 +01:00
  • 2207a16235 Merge pull request #2952 from martin-frbg/issue2931 Martin Kroeker 2020-10-28 09:37:32 +01:00
  • 5d643929dd Merge pull request #2948 from martin-frbg/issue2947 Martin Kroeker 2020-10-28 09:37:09 +01:00
  • e8cbf0fc50 Output predefined HAVE_ entries to Makefile.conf for ARM with specified TARGET Martin Kroeker 2020-10-27 23:01:19 +01:00
  • b937d78a6d Try to read cpu information from /sys/devices/system/cpu/cpu0 if HWCAP_CPUID fails Martin Kroeker 2020-10-27 17:51:32 +01:00
  • e2f9005db8 Merge pull request #2950 from RajalakshmiSR/saxpy Martin Kroeker 2020-10-27 00:02:18 +01:00
  • 6a1f3e40af Remove debug printout of object list Martin Kroeker 2020-10-26 21:37:04 +01:00
  • 878b6d1f41 Remove spurious expr in flang version check Martin Kroeker 2020-10-26 21:35:40 +01:00
  • c24ba8b1dd Optimize saxpy for POWER10 Rajalakshmi Srinivasaraghavan 2020-10-26 13:24:59 -05:00
  • f917c26e83 Refractoring remaining benchmark cases. Qiyu8 2020-10-26 10:25:05 +08:00
  • 76203e2120 Merge pull request #2946 from martin-frbg/issue2945 Martin Kroeker 2020-10-26 00:43:44 +01:00
  • eec517af0e Expressly enable neon for use with intrinsics if available Martin Kroeker 2020-10-26 00:21:56 +01:00
  • fd7da56965 Move definitions that are neither needed nor supported on SUNOS Martin Kroeker 2020-10-25 12:01:50 +01:00
  • 2f9fc9be30 Update version to 0.3.12.dev Martin Kroeker 2020-10-24 23:29:05 +02:00
  • 81fcfd5ed3 Update version to 0.3.12.dev Martin Kroeker 2020-10-24 23:28:29 +02:00
  • addf7593ae Merge pull request #2944 from xianyi/release-0.3.0 Martin Kroeker 2020-10-24 13:10:51 +02:00
  • c5f280a7f0 Fix typos v0.3.12 Martin Kroeker 2020-10-24 13:03:28 +02:00
  • 6e3a05f2c9 Merge pull request #2943 from xianyi/develop Martin Kroeker 2020-10-24 12:52:59 +02:00
  • 89db73569b Update Changelog with 0.3.12 changes Martin Kroeker 2020-10-24 12:50:04 +02:00
  • e1c18e4eeb Update version to 0.3.12 for release Martin Kroeker 2020-10-24 12:15:33 +02:00
  • 26f658c9d2 Update version to 0.3.12 for release Martin Kroeker 2020-10-24 12:14:45 +02:00
  • dc35477317 Merge pull request #2942 from martin-frbg/makebuildtypes Martin Kroeker 2020-10-24 09:26:50 +02:00
  • 365f28787c Comment out BUILD_SINGLE etc. and add a short explanation Martin Kroeker 2020-10-23 23:32:06 +02:00
  • 2f2e9ddb65 Merge pull request #2941 from martin-frbg/exportsfix Martin Kroeker 2020-10-23 20:47:35 +02:00
  • 0d140e61ac Fix wrong grouping of dcombssq Martin Kroeker 2020-10-23 15:53:40 +02:00
  • 4c45cd6294 fix missing split of sladiv1/dladiv/ilaenv2stage by build type Martin Kroeker 2020-10-23 15:31:25 +02:00
  • 680f744abf Merge pull request #108 from xianyi/develop Martin Kroeker 2020-10-23 15:29:48 +02:00
  • 6f9460f0f6 Merge pull request #2937 from martin-frbg/pwr-buffersz Martin Kroeker 2020-10-23 07:15:32 +02:00
  • dd6ebdfdab Refactor the performance measurement system Qiyu8 2020-10-23 10:32:03 +08:00
  • 1917a4e7b8 reuse variables defined in Makefile.system Guillaume Horel 2020-10-22 22:00:00 -04:00
  • 6c970fa998 Merge pull request #2938 from martin-frbg/2934-3 Martin Kroeker 2020-10-23 00:19:49 +02:00
  • b23cb05231 Fix twisted spelling that broke the gfortran version test again Martin Kroeker 2020-10-23 00:18:29 +02:00
  • 1d4c96fa0c Increase BUFFERSIZE further Martin Kroeker 2020-10-23 00:12:06 +02:00
  • 34c3c407ef label always_inline function as inline to silence a gcc warning Martin Kroeker 2020-10-22 22:14:26 +02:00
  • 3f84a9ca15 Merge pull request #2936 from martin-frbg/issue2934-2 Martin Kroeker 2020-10-22 22:08:46 +02:00
  • 7e265c50bf Merge pull request #2935 from martin-frbg/lapack458 Martin Kroeker 2020-10-22 19:25:58 +02:00
  • ee90f30384 Increase BUFFERSIZE for POWER8-10 and use same value for POWER6 Martin Kroeker 2020-10-22 18:47:07 +02:00
  • 2e48d560ba Fix compiler version check Martin Kroeker 2020-10-22 16:23:29 +02:00
  • ab7f466467 Merge pull request #106 from xianyi/develop Martin Kroeker 2020-10-22 16:21:09 +02:00
  • f95031204e Fix macro used in argument conversion (LAPACK PR 458) Martin Kroeker 2020-10-22 16:19:26 +02:00
  • 909068facf Merge pull request #2932 from RajalakshmiSR/copyp10 Martin Kroeker 2020-10-22 00:29:46 +02:00
  • 5b7438fdde Merge pull request #2934 from thrasibule/improve_version_check Martin Kroeker 2020-10-22 00:29:02 +02:00
  • 47696b43e9 actually check that version is greater than 4.7 Guillaume Horel 2020-10-21 16:42:37 -04:00
  • ad745c0bae Optimize scopy/ccopy for POWER10 Rajalakshmi Srinivasaraghavan 2020-10-21 09:53:45 -05:00
  • 17c46bf06a Merge pull request #2930 from ismail/fix-no-return Martin Kroeker 2020-10-21 11:43:01 +02:00
  • 28242096cd Merge pull request #2928 from martin-frbg/issue2917 Martin Kroeker 2020-10-21 10:11:02 +02:00
  • 4a1d00f589 Fix build with -Werror=return-type dgemm_tcopy_16_skylakex.c CNAME function should return an int, add a return 0 similar to other files. İsmail Dönmez 2020-10-21 08:43:39 +02:00
  • 00813363be Enable -mavx2 for flang as well Martin Kroeker 2020-10-20 23:56:30 +02:00
  • 336e35469a Merge pull request #105 from xianyi/develop Martin Kroeker 2020-10-20 23:48:53 +02:00
  • 29668458f7 Merge pull request #2925 from martin-frbg/issue2911-2 Martin Kroeker 2020-10-20 11:27:36 +02:00
  • ee83e29046 Merge pull request #2926 from bartoldeman/vzeroupper-clobber-all Martin Kroeker 2020-10-20 09:24:47 +02:00
  • 1a0f57c8f0 Fix missing backquotes Martin Kroeker 2020-10-20 08:37:53 +02:00
  • b073d759d0 x86_64: clobber all xmm registers after vzeroupper Bart Oldeman 2020-10-20 02:16:47 +00:00
  • eddc65c7b7 Add POWER10 support flag (unconditionally for now) Martin Kroeker 2020-10-20 01:09:49 +02:00
  • bb8c3f6861 Add ld/binutils version check for POWER10 support Martin Kroeker 2020-10-20 01:04:20 +02:00
  • ff65952e46 Move HAVE_P10_SUPPORT to the build system Martin Kroeker 2020-10-20 00:55:41 +02:00
  • 6208c9899e Merge pull request #104 from xianyi/develop Martin Kroeker 2020-10-20 00:52:08 +02:00
  • 8e20ab21c8 Merge pull request #2924 from martin-frbg/issue2920 Martin Kroeker 2020-10-19 23:33:45 +02:00
  • dc6e44c3f8 Merge pull request #2916 from martin-frbg/issue2911 Martin Kroeker 2020-10-19 23:33:31 +02:00
  • 4ad33c46b0 Add back symbols that got dropped when splitting by type Martin Kroeker 2020-10-19 20:37:52 +02:00
  • fe2a922ada Add POWER10 compiler options to CCOMMON_OPT rather than COMMON_OPT Martin Kroeker 2020-10-19 17:43:53 +02:00
  • 9cac379655 Merge pull request #103 from xianyi/develop Martin Kroeker 2020-10-19 15:56:20 +02:00
  • a61c086408 Fix spurious trailing whitespace in comment Martin Kroeker 2020-10-19 09:12:12 +02:00
  • 5b9ebe4f8a Merge pull request #2919 from isuruf/export Martin Kroeker 2020-10-19 08:14:27 +02:00
  • 7eddaf0d6f Remove -mmma again (reduntant with cpu=power10) and add override statements Martin Kroeker 2020-10-19 08:11:22 +02:00
  • 14b1d33933 Fix exporting some lapack and cblas Isuru Fernando 2020-10-18 21:42:32 -05:00
  • 77669b019d Merge pull request #2915 from bartoldeman/no-empty_sgemm_direct_skylakex Martin Kroeker 2020-10-19 00:09:54 +02:00
  • 5e8ddc9001 Merge pull request #2913 from martin-frbg/issue2910 Martin Kroeker 2020-10-18 23:04:56 +02:00
  • 03e781b766 sgemm_direct_skylakex: fix 75eeb26 regression. Bart Oldeman 2020-10-18 19:50:38 +00:00
  • f1a4071d8c Clean up STACKSIZE redefinition Martin Kroeker 2020-10-18 19:41:43 +02:00
  • 97cf10062f Clean up STACKSIZE redefinition Martin Kroeker 2020-10-18 19:39:18 +02:00
  • 17e288e18d Clean up STACKSIZE redefinition Martin Kroeker 2020-10-18 19:37:04 +02:00
  • c1422f3e46 Clean up STACKSIZE redefinition Martin Kroeker 2020-10-18 19:31:01 +02:00