Commit Graph

  • d9894f45d3 Define sbgemm_r to fix DYNAMIC_ARCH builds Martin Kroeker 2022-02-25 10:04:00 +01:00
  • 522f809825 Merge pull request #3542 from martin-frbg/issue3540 Martin Kroeker 2022-02-24 00:00:00 +01:00
  • d50287fa5b Merge pull request #3544 from giordano/mg/gcc6 Martin Kroeker 2022-02-23 23:57:57 +01:00
  • abbc947edb Fix compilation of Skylake AVX512 kernels with GCC 6 Mosè Giordano 2022-02-23 22:51:59 +00:00
  • f2f0e1287b Merge pull request #3541 from martin-frbg/issue3530 Martin Kroeker 2022-02-23 23:13:53 +01:00
  • c62f8e2c01 Prevent compiler attempts to use k0 as mask register Martin Kroeker 2022-02-23 20:12:20 +01:00
  • 80eb581c83 Fix non-portable u_int64_t Martin Kroeker 2022-02-23 20:10:59 +01:00
  • 73ffabe6ba Guard uses of _mm512_reduce_add_p? Martin Kroeker 2022-02-23 20:06:14 +01:00
  • 5ad66f0e96 Merge pull request #3537 from xianyi/release-0.3.0 Martin Kroeker 2022-02-21 06:57:27 +01:00
  • 0b678b19dc Update version to 0.3.20 (tag: v0.3.20) Martin Kroeker 2022-02-20 22:35:05 +01:00
  • 15ff556862 Merge pull request #3536 from xianyi/develop Martin Kroeker 2022-02-20 22:33:59 +01:00
  • 1564b632ad Merge branch 'release-0.3.0' into develop Martin Kroeker 2022-02-20 22:33:45 +01:00
  • dec53e0ca2 Update version to 0.3.20 Martin Kroeker 2022-02-20 22:30:50 +01:00
  • c3f8de7923 Merge pull request #3535 from martin-frbg/0320changes Martin Kroeker 2022-02-20 22:21:02 +01:00
  • c352ac0ae3 Update with 0.3.20 changes Martin Kroeker 2022-02-20 22:16:04 +01:00
  • 77433af83e Merge pull request #3532 from martin-frbg/issue3528-2 Martin Kroeker 2022-02-11 11:44:32 +01:00
  • db7a03dd4c keep flang-classic on MacOS from trying to create an executable instead of a library Martin Kroeker 2022-02-10 23:04:45 +01:00
  • 0e04710099 filter out libflangmain as well Martin Kroeker 2022-02-10 23:03:05 +01:00
  • dc80925c92 Merge pull request #3531 from martin-frbg/issue2973 Martin Kroeker 2022-02-10 14:16:08 +01:00
  • e2bf3f31a6 Add .NOTPARALLEL: as a workaround for builds on DFS Martin Kroeker 2022-02-09 22:09:25 +01:00
  • 92d243fee3 Merge pull request #3527 from martin-frbg/issue3490 Martin Kroeker 2022-02-07 08:14:11 +01:00
  • fa3e9f25e6 Support AVX512-enabled Alder Lake Martin Kroeker 2022-02-07 00:00:56 +01:00
  • f7e8f9ec57 Support AVX512-enabled AlderLake Martin Kroeker 2022-02-07 00:00:15 +01:00
  • 7656aba00e Merge pull request #3493 from martin-frbg/casts+cleanup Martin Kroeker 2022-02-06 23:55:06 +01:00
  • aec32e5bd4 Update azure-pipelines.yml Martin Kroeker 2022-02-05 22:39:03 +01:00
  • 3007ca6371 Merge pull request #3524 from martin-frbg/lapack646 Martin Kroeker 2022-02-03 22:31:23 +01:00
  • a3eea3e127 Fix input argument check (LAPACK PR 646) Martin Kroeker 2022-02-03 11:43:17 +01:00
  • b212577e50 Merge pull request #3521 from martin-frbg/issue3520 Martin Kroeker 2022-01-28 13:39:36 +01:00
  • 63483ba0ff Merge pull request #3522 from martin-frbg/issue3517 Martin Kroeker 2022-01-28 10:36:57 +01:00
  • d2b5fbf80f Exclude some complex (LAPACK) functions when NO_LAPACK is set Martin Kroeker 2022-01-27 22:02:08 +01:00
  • 7f0b11fbc1 Exclude some complex drivers when NO_LAPACK is set Martin Kroeker 2022-01-27 22:00:39 +01:00
  • addc2a7aaa Add proper defaults for IMIN/IMAX Martin Kroeker 2022-01-27 19:56:32 +01:00
  • 204e021515 Merge pull request #3518 from martin-frbg/elbrus Martin Kroeker 2022-01-25 20:57:59 +01:00
  • b0d39349f9 Merge pull request #3516 from mmuetzel/no-fortran Martin Kroeker 2022-01-25 20:57:38 +01:00
  • 5d24f3d210 Update CONTRIBUTORS.md Martin Kroeker 2022-01-22 19:09:00 +01:00
  • 66a15e15a8 Update CONTRIBUTORS.md Martin Kroeker 2022-01-22 19:02:57 +01:00
  • 299d4d70a3 Add default KERNEL file for Elbrus E2K arch Martin Kroeker 2022-01-22 18:59:36 +01:00
  • 3492bea602 Create Makefile Martin Kroeker 2022-01-22 18:57:28 +01:00
  • 898cf5faf3 Add Elbrus e2k architecture support Martin Kroeker 2022-01-22 18:55:10 +01:00
  • bc93f468ef Add Elbrus E2000 architecture as generic x86_64 compatible Martin Kroeker 2022-01-22 18:53:38 +01:00
  • 1937b4e435 Add Elbrus e2k architecture detection Martin Kroeker 2022-01-22 18:27:38 +01:00
  • 00f44bfff7 cmake: Check if Fortran compiler is usable before enabling it. Markus Mützel 2022-01-21 13:27:17 +01:00
  • c1c0d5ce1d Merge pull request #3492 from binebrank/arm_sve_zgemm Martin Kroeker 2022-01-18 21:36:33 +01:00
  • 19d435b1b3 update armv8sve + contributors Bine Brank 2022-01-18 08:28:31 +01:00
  • f158d59087 adapt CMake Bine Brank 2022-01-17 22:36:48 +01:00
  • 8ac2c1daf0 Merge pull request #3514 from martin-frbg/issue3513 Martin Kroeker 2022-01-17 19:22:18 +01:00
  • 40003f8edb Fix pivot offset calculation for negative incx Martin Kroeker 2022-01-17 00:11:18 +01:00
  • 57e2a72f40 Fix pivot offset calculation for negative incx Martin Kroeker 2022-01-17 00:10:21 +01:00
  • 3b6293f5a0 Fix offset calculation for negative incx Martin Kroeker 2022-01-17 00:09:14 +01:00
  • afa0cece5c Fix pivot offset calculation for negative incx Martin Kroeker 2022-01-17 00:08:20 +01:00
  • eca2f50b48 Fix pivot offset calculation for negative incx Martin Kroeker 2022-01-17 00:07:33 +01:00
  • 0e9e951306 Fix pivot offset calculation for negative incx Martin Kroeker 2022-01-17 00:06:41 +01:00
  • 1b49ef8dcf Fix pivot index for negative increments Martin Kroeker 2022-01-17 00:05:33 +01:00
  • b6a445cfd8 adapt Makefile for SVE trsm Bine Brank 2022-01-16 21:40:56 +01:00
  • 0fb6cc07bf fix ztrsm lt/ut copy Bine Brank 2022-01-16 21:39:57 +01:00
  • f1315288a8 add sve ztrsm Bine Brank 2022-01-15 22:27:25 +01:00
  • aaa2b1a861 fix sve dtrsm kernels Bine Brank 2022-01-15 21:02:14 +01:00
  • 8071e179f1 add remaining sve trsm copy kernels Bine Brank 2022-01-11 21:16:38 +01:00
  • f87468ac91 trsm_lncopy_sve Bine Brank 2022-01-10 21:45:37 +01:00
  • e8939b3d30 sve trsmRN and trsmRT Bine Brank 2022-01-10 20:42:20 +01:00
  • 5188aede5d Merge pull request #3511 from martin-frbg/cmakeutils Martin Kroeker 2022-01-10 09:12:52 +01:00
  • a9e297e476 Fix handling of ifdef/ifndef Martin Kroeker 2022-01-09 23:31:59 +01:00
  • 098672b51b add trsm_kernel_LT_sve Bine Brank 2022-01-09 20:11:47 +01:00
  • be7e55880c sve trsm_kernel_LN Bine Brank 2022-01-09 19:40:04 +01:00
  • 499ae5e8f7 Merge pull request #3510 from martin-frbg/issue3505 Martin Kroeker 2022-01-09 14:50:51 +01:00
  • b6b024232d Merge pull request #3508 from snadampal/v1_n2 Martin Kroeker 2022-01-09 14:50:26 +01:00
  • 2573ccfb2e make DYNAMIC_ARCH option available to getarch_2nd/param.h Martin Kroeker 2022-01-08 23:50:34 +01:00
  • f1ac59f200 Forward DYNAMIC_ARCH option to Makefile.prebuild Martin Kroeker 2022-01-08 23:48:58 +01:00
  • 15d4b37913 SkylakeX: match parameters to dgemm kernels for dyn/non-dyn Martin Kroeker 2022-01-08 23:48:13 +01:00
  • 19c8f615dc OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics Sunita Nadampalli 2022-01-07 00:28:17 +00:00
  • cbcea149f0 update contributors Bine Brank 2022-01-06 10:29:35 +01:00
  • bb33446b40 fix makefile.L3 Bine Brank 2022-01-06 10:26:11 +01:00
  • f33543d029 combine zchemm into single file Bine Brank 2022-01-05 14:42:37 +01:00
  • 0c91d043ae adapt CMake for SVE Bine Brank 2022-01-05 14:36:39 +01:00
  • 39ab219704 sve copy functions for cgemm chemm zsymm Bine Brank 2022-01-05 09:12:22 +01:00
  • 18102ae8c3 add cgemm ctrmm sve kernels Bine Brank 2022-01-05 09:09:18 +01:00
  • 87537b8c55 modify sve zgemmcopy kernels Bine Brank 2022-01-05 09:07:28 +01:00
  • d30157d891 update configuration of kernels for A64FX and ARMV8SVE Bine Brank 2022-01-05 09:00:54 +01:00
  • 07fa6fa3b1 configure Makefile for sve Bine Brank 2022-01-05 08:57:51 +01:00
  • 2e2c02b762 fix sve ztrmm kernel Bine Brank 2022-01-04 14:42:07 +01:00
  • 68c414d3a6 ztrmm sve copy functions Bine Brank 2022-01-04 14:40:59 +01:00
  • ce329ab686 add sve zhemm copy routines Bine Brank 2022-01-03 15:56:05 +01:00
  • 0140373802 add sve ztrmm Bine Brank 2022-01-02 19:15:33 +01:00
  • ecf034b250 Merge pull request #3502 from jgillis/develop Martin Kroeker 2022-01-01 12:12:32 +01:00
  • f8b1ca5039 Merge pull request #3504 from martin-frbg/issue3503 Martin Kroeker 2022-01-01 11:43:17 +01:00
  • b329e45288 Guard against omp_get_num_places returning zero Martin Kroeker 2022-01-01 00:46:23 +01:00
  • f7b6912868 ztrmm sve copy kernels Bine Brank 2021-12-30 21:00:16 +01:00
  • ea3db69faa Fix cmake crosscompilation for core2 target jgillis 2021-12-29 22:50:20 +01:00
  • 40b14e4957 fix zgemm kernel Bine Brank 2021-12-29 11:42:04 +01:00
  • ee823b6ed9 Merge pull request #3500 from martin-frbg/osx_dyn_xerbla Martin Kroeker 2021-12-28 22:54:27 +01:00
  • 6cae44d4f7 Ensure that the right xerbla gets included in OSX DYNAMIC_ARCH builds Martin Kroeker 2021-12-28 19:06:55 +01:00
  • a06b4aff52 Merge pull request #3496 from yuanhec/develop Martin Kroeker 2021-12-28 18:51:56 +01:00
  • 9d455b1b09 Merge remote-tracking branch 'upstream/develop' into develop yuanhecai 2021-12-27 09:50:57 +08:00
  • 6ec4aab875 zgemm sve copy routines Bine Brank 2021-12-26 17:05:46 +01:00
  • 878064f394 sve zgemm kernel Bine Brank 2021-12-26 08:44:05 +01:00
  • 683a7548bf added macros for sve zgemm kernels Bine Brank 2021-12-25 11:46:41 +01:00
  • 7b146e590c fix function typecast Martin Kroeker 2021-12-24 20:01:52 +01:00
  • e9a0e52201 fix function typecast Martin Kroeker 2021-12-24 20:00:50 +01:00
  • 2db0b2e445 Fixed MSA enabled optimization on Loongson-3A4000 yuanhecai 2021-12-23 20:04:27 +08:00
  • 253670383f Merge pull request #3491 from gxw-loongson/develop Martin Kroeker 2021-12-22 08:34:12 +01:00