Commit Graph

  • 9809931eb4 clean up unused variables and unreachable statements Martin Kroeker 2021-12-21 18:53:55 +01:00
  • 6b407a16cb fix function typecasts Martin Kroeker 2021-12-21 18:51:28 +01:00
  • aecb4a5e8d fix function typecasts Martin Kroeker 2021-12-21 18:50:22 +01:00
  • c49d46f25f fix function typecast Martin Kroeker 2021-12-21 18:49:18 +01:00
  • 64365c919e fix function typecasts Martin Kroeker 2021-12-21 18:47:35 +01:00
  • d1ee6ff73f fix function typecasts Martin Kroeker 2021-12-21 18:45:28 +01:00
  • 07fe5b19a4 typecast function pointers Martin Kroeker 2021-12-21 12:31:54 +01:00
  • e3c9947c0f prepare kernel for sve zgemm Bine Brank 2021-12-21 11:19:27 +01:00
  • 8d9b9c6b2a loongarch64: Optimize dgemm_kernel gxw 2021-12-21 09:22:59 +08:00
  • 8cec83bdfb Update version to 0.3.19.dev Martin Kroeker 2021-12-19 21:22:19 +01:00
  • 894fde9bfe Update version to 0.3.19.dev Martin Kroeker 2021-12-19 21:21:47 +01:00
  • d1c6270c52 Merge pull request #3489 from xianyi/release-0.3.0 Martin Kroeker 2021-12-19 21:21:13 +01:00
  • 2480e5046e Update version to 0.3.19 (tag: v0.3.19) Martin Kroeker 2021-12-19 20:55:57 +01:00
  • 488911486a Merge pull request #3488 from xianyi/develop Martin Kroeker 2021-12-19 20:54:49 +01:00
  • 54a0c0bce3 Merge branch 'release-0.3.0' into develop Martin Kroeker 2021-12-19 16:35:07 +01:00
  • 6025daca63 Update version to 0.3.19 Martin Kroeker 2021-12-19 16:32:04 +01:00
  • e545614cd0 Merge pull request #3487 from martin-frbg/0319changes Martin Kroeker 2021-12-19 16:30:47 +01:00
  • b6001a2ee3 Update with 0.3.19 changes Martin Kroeker 2021-12-19 14:34:14 +01:00
  • 9c8d1e013f Merge pull request #3486 from martin-frbg/nvhpc Martin Kroeker 2021-12-18 23:09:30 +01:00
  • ed430cd963 Update -tp option for recent nvfortran on x86_64 Martin Kroeker 2021-12-18 21:56:26 +01:00
  • b3f4b8c95a Merge pull request #3485 from martin-frbg/issue3453 Martin Kroeker 2021-12-17 11:08:36 +01:00
  • 6ed52576f8 Add feature-based fallback for unknown x86_64 cpus Martin Kroeker 2021-12-16 22:02:49 +01:00
  • 126ad48991 Merge pull request #3484 from martin-frbg/issue3481 Martin Kroeker 2021-12-16 21:50:28 +01:00
  • f67a0620a3 Merge pull request #3480 from wzgpeter/develop Martin Kroeker 2021-12-16 21:50:06 +01:00
  • 449fb7d849 Merge pull request #3478 from ffontaine/develop Martin Kroeker 2021-12-16 21:49:19 +01:00
  • 7a7fbb11c3 define "unlikely" on non-cygwin too Martin Kroeker 2021-12-16 17:28:28 +01:00
  • b31349c22a Open up delayed (re)init to non-Cygwin OS as well Martin Kroeker 2021-12-16 16:58:12 +01:00
  • 4d61e453cc Merge pull request #3483 from martin-frbg/issue3482 Martin Kroeker 2021-12-16 11:54:20 +01:00
  • f3b51ec608 move brace inside the ifdef block Martin Kroeker 2021-12-16 09:37:58 +01:00
  • 92b7b949dd fix bug in zscal function Wu Zhigang 2021-12-15 00:22:19 -08:00
  • a0cc119f26 Makefile: also consider -O, -Og and -Os when stripping flags Thomas De Schampheleire 2021-12-14 23:36:16 +01:00
  • c8d05aa7a5 Move the threads overflow flag under the protection of the local blas lock (#3476) Martin Kroeker 2021-12-13 08:34:52 +01:00
  • b0a590f4fe Merge pull request #3475 from wjc404/optimize-A53-dgemm Martin Kroeker 2021-12-12 19:09:08 +01:00
  • f4d1f0333b Merge pull request #3474 from rafaelcfsousa/rafael/cmake_power Martin Kroeker 2021-12-12 19:08:27 +01:00
  • b610d2de37 optimize cgemm on ARM cortex A53 & cortex A55 Jia-Chen 2021-12-12 17:22:52 +08:00
  • 697e2752d7 Merge pull request #3464 from binebrank/arm_sve_sgemm Martin Kroeker 2021-12-11 20:35:22 +01:00
  • a8f62a347b fix UNROLL_MN and add to targets for SVE Bine Brank 2021-12-11 16:37:23 +01:00
  • 774267fdac adjust Makefile.L3 for SVE Bine Brank 2021-12-11 16:35:08 +01:00
  • d38110a5ce Use CMake variables instead of as Rafael Cardoso Fernandes Sousa 2021-12-10 17:35:28 -06:00
  • 23a7561353 Fix error cmake (small kernels) Rafael Cardoso Fernandes Sousa 2021-12-09 09:57:39 -06:00
  • 214fbcee15 Fix cmake for power Rafael Cardoso Fernandes Sousa 2021-12-09 08:28:17 -06:00
  • f7f7fea0dc Merge pull request #3472 from kavanabhat/p10_aixas_p8 Martin Kroeker 2021-12-09 07:28:57 +01:00
  • 2241068c26 Merge pull request #3469 from martin-frbg/issue2986 Martin Kroeker 2021-12-08 22:19:32 +01:00
  • 3e9a52869c Fix ar path in ARMV7 Darwin NDK build on Azure (#3473) Martin Kroeker 2021-12-08 22:18:44 +01:00
  • eee3381cbe Fallback for Power kernels kavanabhat 2021-12-08 03:52:23 -06:00
  • 5378046abd roll back DGEMM kernels to 4x8 when compiling for DYNAMIC_ARCH Martin Kroeker 2021-12-06 19:43:54 +01:00
  • dd1f645371 switch DGEMM unroll parameters for SkylakeX if DYNAMIC_ARCH Martin Kroeker 2021-12-06 19:42:51 +01:00
  • a1fea1fe2a sgemm v2x8 SVE kernel Bine Brank 2021-12-05 18:47:29 +01:00
  • 2ae73a2b34 Merge pull request #3468 from martin-frbg/issue3467 Martin Kroeker 2021-12-05 15:52:44 +01:00
  • 8d11278e28 Fix hardcoded library name Martin Kroeker 2021-12-05 14:38:41 +01:00
  • abe1ce3434 strmm sve v1x8 kernel Bine Brank 2021-12-05 14:03:08 +01:00
  • ea09355eae Fix DYNAMIC_ARCH builds with CMAKE on OSX and add corresponding test to Azure CI (#3409) Martin Kroeker 2021-12-04 22:24:02 +01:00
  • 54d321d742 Merge pull request #3466 from rafaelcfsousa/rafael/small_matrix_p10 Martin Kroeker 2021-12-03 12:12:20 +01:00
  • c248442df4 Merge pull request #3465 from kavanabhat/develop Martin Kroeker 2021-12-03 12:11:43 +01:00
  • 1470b7e4de Delete test_zhemv.c Martin Kroeker 2021-12-03 11:41:53 +01:00
  • 0882db30a2 Merge pull request #3455 from cenewcombe/develop Martin Kroeker 2021-12-03 10:01:20 +01:00
  • 9a45b5123f Update Makefile.system kavanabhat 2021-12-02 13:29:38 +05:30
  • 84125e4035 Merge pull request #1 from kavanabhat/as_check_fix kavanabhat 2021-12-01 20:30:43 +05:30
  • 7b5b93037d Fix truncated assembler checks kavanabhat 2021-12-01 19:30:40 +05:30
  • 0de36f7b5c trmm sve copy fucntions for single precision Bine Brank 2021-11-29 21:25:05 +01:00
  • c78fdcc80d [POWER] Add support for SMALL_MATRIX_OPT Rafael Cardoso Fernandes Sousa 2021-11-16 14:47:41 -06:00
  • 86ae89bf33 add sgemm kernel and copy functions for sgemm and ssymm Bine Brank 2021-11-28 18:12:47 +01:00
  • 454edd741c Merge pull request #3425 from binebrank/arm_sve_dgemm Martin Kroeker 2021-11-26 16:14:55 +01:00
  • bcfbdc81b2 Merge pull request #3459 from rafaelcfsousa/fix_cmake Martin Kroeker 2021-11-26 15:19:24 +01:00
  • 7c6370cbfd Merge pull request #3462 from martin-frbg/azure-alpine2 Martin Kroeker 2021-11-26 13:40:23 +01:00
  • fbfc8b1b83 Update alpine-chroot-install again Martin Kroeker 2021-11-26 13:39:49 +01:00
  • ca65a4e91d update CONTRIBUTORS.md Bine Brank 2021-11-26 13:11:19 +01:00
  • 1af73ce38e Adapt CMake for SVE Bine Brank 2021-11-26 10:35:01 +01:00
  • e7fca060db Merge pull request #3457 from wjc404/optimize-A53-dgemm Martin Kroeker 2021-11-26 10:30:47 +01:00
  • bc4c98de26 Merge pull request #3456 from martin-frbg/issue3444 Martin Kroeker 2021-11-26 10:29:28 +01:00
  • c3b1e55bdc AzureCI: Fetch alpine-chroot-install from master to get key updates (#3460) Martin Kroeker 2021-11-26 09:38:41 +01:00
  • 5c1cd5e0c2 MOD: add comments to a53 zgemm kernel Jia-Chen 2021-11-25 22:48:48 +08:00
  • d5c9353f1b Modify the order that cmake set the KERNEL variables (generic now is fallback) Rafael Cardoso Fernandes Sousa 2021-11-24 20:07:20 -06:00
  • fb891f33da Fix the cmake parser to identify more patterns Rafael Cardoso Fernandes Sousa 2021-11-24 14:07:28 -06:00
  • 9f59b19fcd MOD: optimize zgemm on cortex-A53/cortex-A55 Jia-Chen 2021-11-24 21:51:45 +08:00
  • f4da23dcb6 reduced dgemm_unroll_m to work with 128-bit sve Bine Brank 2021-11-23 21:18:08 +01:00
  • 531a28b6a0 removed unused code (compiler warnings) Bine Brank 2021-11-22 10:12:34 +01:00
  • 9b9cb90bb1 modify Makefile for SVE copy Bine Brank 2021-11-22 09:54:20 +01:00
  • 9388f05a3c configure SVE Makefile Bine Brank 2021-11-21 18:33:43 +01:00
  • b58d4f31ab some clean-up & commentary Bine Brank 2021-11-21 14:56:27 +01:00
  • 52a3f004a0 Fix unintended reversion of recent CortexA53 changes Martin Kroeker 2021-11-20 23:54:48 +01:00
  • a3cd36acff Add CMAKE support for cross-compiling to MIPS32 Martin Kroeker 2021-11-20 17:34:28 +01:00
  • b7df500106 Add generic mips32 target Martin Kroeker 2021-11-20 17:31:51 +01:00
  • 19ccef5fb1 Add generic MIPS32 target Martin Kroeker 2021-11-20 17:31:11 +01:00
  • e6ed4be02e symm SVE copy rutines Bine Brank 2021-11-20 16:35:29 +01:00
  • feeb8283a5 Fix unsafe read during final iteration of zsymv_L_sse2.S Caroline Newcombe 2021-11-19 14:29:32 -06:00
  • ec4daf420f Merge pull request #3451 from wjc404/optimize-A53-dgemm Martin Kroeker 2021-11-18 18:17:27 +01:00
  • 302f22693a MOD: optimize normal DGEMM on ARMV8 cortex-A53 & cortex-A55 Jia-Chen 2021-11-18 21:14:43 +08:00
  • 7b825531a6 Merge pull request #3450 from mmuetzel/suffix-nofortran Martin Kroeker 2021-11-16 23:58:09 +01:00
  • de2ed66596 cmake: Set SUFFIX64 also for NOFORTRAN Markus Mützel 2021-11-15 08:53:52 +01:00
  • 3c7eed0e53 add remaining trmm copy rutines for SVE Bine Brank 2021-11-14 16:00:10 +01:00
  • 8f6c8d1a9e Merge pull request #3449 from martin-frbg/mips_msa Martin Kroeker 2021-11-14 12:01:53 +01:00
  • 46947efb83 Ignore compiler support for MIPS MSA if the cpu lacks this capability Martin Kroeker 2021-11-13 23:32:26 +01:00
  • a569fa1540 MIPS P5600 and 24KC,1004K cpus do not support MSA Martin Kroeker 2021-11-13 23:26:48 +01:00
  • d6194d6a0c get MSA capability from feature flags Martin Kroeker 2021-11-13 23:25:34 +01:00
  • 7d996b1c36 dtrmm_utcopy sve function Bine Brank 2021-11-13 18:48:53 +01:00
  • 5b7a3c0e1b Merge pull request #3447 from martin-frbg/issue3446 Martin Kroeker 2021-11-11 09:29:36 +01:00
  • 9cc0098ce2 Fix potentially wrong HOSTARCH definition in cross-compilation Martin Kroeker 2021-11-10 22:27:14 +01:00
  • ab7917910d add v2x8 kernel + fix sve dtrmm Bine Brank 2021-11-07 20:37:51 +01:00
  • 2d7ca63e21 Merge pull request #3443 from martin-frbg/issue3441 Martin Kroeker 2021-11-05 12:23:47 +01:00