Commit Graph

  • 919c221cec deploy: 453b9e4886 (gh-pages) martin-frbg 2024-10-31 16:47:30 +00:00
  • 453b9e4886 Merge pull request #4961 from h-vetinari/flang (develop) Martin Kroeker 2024-10-31 17:46:55 +01:00
  • d3272e51eb explicitly link to OpenMP H. Vetinari 2024-07-01 09:05:00 +11:00
  • c634114c8a Merge pull request #4960 from martin-frbg/gemmtr Martin Kroeker 2024-10-30 18:37:26 +01:00
  • 598bd21991 Merge pull request #4958 from XiWeiGu/x86_64_opt_somatcopy_ct_avx Martin Kroeker 2024-10-30 16:04:00 +01:00
  • c9d1a3b912 Merge pull request #4954 from XiWeiGu/la64_update_gh_actions Martin Kroeker 2024-10-30 14:35:57 +01:00
  • c3e7d08fb4 Copy GEMMT to its new name GEMMTR Martin Kroeker 2024-10-30 12:56:16 +01:00
  • 0cf656fd3e Add copies of GEMMT under its new name GEMMTR Martin Kroeker 2024-10-30 12:55:14 +01:00
  • 2edf548203 deploy: 24b5ccaf4b martin-frbg 2024-10-30 11:49:09 +00:00
  • 24b5ccaf4b Merge pull request #4202 from TiborGY/inlines_pt2 Martin Kroeker 2024-10-30 12:48:35 +01:00
  • 73c6a28073 x86_64: opt somatcopy_ct with AVX gxw 2024-10-29 06:31:58 +00:00
  • f66e6d32c2 Merge pull request #4953 from NickelWenzel/fix_trtrs_return_types Martin Kroeker 2024-10-25 23:29:24 +02:00
  • a8bb105ed6 Merge pull request #4848 from haampie/fix/cmake-min-version Martin Kroeker 2024-10-25 20:59:13 +02:00
  • 0e6a2cc93c bump the minimum_required version instead Martin Kroeker 2024-10-25 16:47:52 +02:00
  • 815cb24944 remove unused INLINE macro definitions TGY 2023-08-16 06:06:00 +02:00
  • 2c65e504bd deploy: ac736820d7 martin-frbg 2024-10-25 11:44:27 +00:00
  • ac736820d7 Merge pull request #4955 from cdaley/optimize_gemv_forwarding Martin Kroeker 2024-10-25 13:43:54 +02:00
  • 8f595382c4 gh-actions: Test LoongArch64 with gcc14 from Ubuntu 24.04 gxw 2024-10-25 03:12:15 +00:00
  • cb48505251 optimize gemv forwarding on ARM64 systems Chris Daley 2024-10-24 21:05:26 -07:00
  • 79f4bbd4cd fix: return types of *trtrs routines nickel 2024-10-24 11:20:02 +02:00
  • 6405318ea8 deploy: 72461f1c8c martin-frbg 2024-10-23 14:40:35 +00:00
  • 72461f1c8c Merge pull request #4950 from ayappanec/fix-aix-build Martin Kroeker 2024-10-23 16:40:02 +02:00
  • 020cce1068 Fix build issues with gcc compiler as well Ayappan Perumal 2024-10-23 04:24:06 -05:00
  • b6ec73e77c Fix AIX build Ayappan Perumal 2024-10-21 07:38:03 -05:00
  • 97749d4d8a deploy: 8a0cd5fcef martin-frbg 2024-10-20 19:53:28 +00:00
  • 8a0cd5fcef Merge pull request #4949 from martin-frbg/mingw32-14.2 Martin Kroeker 2024-10-20 21:52:57 +02:00
  • 4dba6ce6ea work around mingw32-gfortran 14.2 miscompiling CBLAS1 tests Martin Kroeker 2024-10-20 20:25:06 +02:00
  • a93ec74e95 Merge pull request #4948 from martin-frbg/fixhavesve Martin Kroeker 2024-10-18 20:00:42 +02:00
  • c4bb4e74fc NeoverseN2 has SVE too Martin Kroeker 2024-10-18 14:50:55 +02:00
  • 86720778ef write HAVE_SVE to config where applicable Martin Kroeker 2024-10-18 14:14:43 +02:00
  • 286161c23b deploy: 016bdb9b0b martin-frbg 2024-10-18 12:03:36 +00:00
  • 016bdb9b0b Merge pull request #4946 from XiWeiGu/la64_omatcopy_lasx Martin Kroeker 2024-10-18 14:03:06 +02:00
  • ffaa5765a4 Bench: Add omatcopy gxw 2024-10-17 12:32:54 +00:00
  • a93897276b Merge pull request #4943 from martin-frbg/update_readme Martin Kroeker 2024-10-17 21:13:48 +02:00
  • 3fc1225dd6 Merge branch 'OpenMathLib:develop' into update_readme Martin Kroeker 2024-10-17 21:08:58 +02:00
  • 33078d11e4 stress importance of TARGET setting in DYNAMIC_ARCH builds Martin Kroeker 2024-10-17 21:07:49 +02:00
  • 0cb3240a11 deploy: 15a57598f5 martin-frbg 2024-10-17 17:21:41 +00:00
  • 15a57598f5 Merge pull request #4944 from ChipKerchner/vectorizeBF16GEMV Martin Kroeker 2024-10-17 19:21:07 +02:00
  • ab71a1edf2 Better VSX. Chip Kerchner 2024-10-17 08:25:02 -05:00
  • bb31bbef52 LoongArch64: Opt somatcopy_ct with LASX gxw 2024-10-17 11:45:13 +00:00
  • b37129341b LoongArch64: Opt somatcopy_cn with LASX gxw 2024-10-17 11:27:55 +00:00
  • acf6cab304 LoongArch64: Opt somatcopy_rn with LASX gxw 2024-10-17 09:50:02 +00:00
  • 15edb441bf LoongArch64: Opt somatcopy_rt with LASX gxw 2024-10-14 17:36:56 +08:00
  • 457d1c6972 remove unused CI badges, wiki->docs, xianyi->OpenMathLib Martin Kroeker 2024-10-17 10:33:08 +02:00
  • a23de0a334 deploy: 6a60eb1a02 martin-frbg 2024-10-16 07:39:01 +00:00
  • 6a60eb1a02 Merge pull request #4924 from XiWeiGu/la64_readme Martin Kroeker 2024-10-16 09:38:18 +02:00
  • 8483a71169 Merge pull request #4937 from martin-frbg/lapack1064 Martin Kroeker 2024-10-14 21:52:41 +02:00
  • 22628f1a69 Fix leading dimension for B (Reference-LAPACK PR 1064) Martin Kroeker 2024-10-14 18:59:03 +02:00
  • 27ed6da331 Fix leading dimension for B (Reference-LAPACK PR 1064) Martin Kroeker 2024-10-14 18:57:50 +02:00
  • 7018c1b001 Fix leading dimension for B (Reference-LAPACK PR 1064) Martin Kroeker 2024-10-14 18:56:44 +02:00
  • a659f40fe1 Fix leading dimension for B (Reference-LAPACK PR 1064) Martin Kroeker 2024-10-14 18:53:30 +02:00
  • 191a33a916 deploy: c979c1d948 martin-frbg 2024-10-14 06:13:58 +00:00
  • c979c1d948 Merge pull request #4936 from martin-frbg/fixmips64generic Martin Kroeker 2024-10-14 08:13:27 +02:00
  • a47b3c8867 Fix unroll parameter selection for MIPS64_GENERIC Martin Kroeker 2024-10-13 22:54:34 +02:00
  • 2391dc1c0f Merge branch 'vectorizeBF16GEMV' of github.ibm.com:PowerAppLibs/OpenBLAS into vectorizeBF16GEMV Chip Kerchner 2024-10-13 13:48:33 -05:00
  • 36bd3eeddf Vectorize BF16 GEMV (VSX & MMA). Use GEMM_GEMV_FORWARD_BF16 (for Power). Chip Kerchner 2024-10-13 13:46:11 -05:00
  • f8e113f27b Replace types with include file. Chip Kerchner 2024-10-13 10:55:03 -05:00
  • a53a197934 Merge remote-tracking branch 'origin/develop' into vectorizeBF16GEMV Chip Kerchner 2024-10-12 15:15:17 -05:00
  • 2302328a78 deploy: 3184b7f209 martin-frbg 2024-10-12 15:20:14 +00:00
  • 3184b7f209 Merge pull request #4933 from ChipKerchner/thread_sbgemv Martin Kroeker 2024-10-12 17:19:41 +02:00
  • 0082240044 Merge branch 'thread_sbgemv' into vectorizeBF16GEMV Chip Kerchner 2024-10-11 16:13:59 -05:00
  • 1d51ca5798 Change multi-threading logic for SBGEMV to be the same as SGEMV. Chip Kerchner 2024-10-11 16:08:48 -05:00
  • c8f53b85ce Merge remote-tracking branch 'origin/develop' into vectorizeBF16GEMV Chip Kerchner 2024-10-11 11:10:20 -05:00
  • bd6f77e3ce deploy: 18a23c23f7 martin-frbg 2024-10-11 06:54:36 +00:00
  • 18a23c23f7 Merge pull request #4929 from martin-frbg/issue4905 Martin Kroeker 2024-10-11 08:54:02 +02:00
  • 5a79446bdb Merge pull request #4918 from HaoZeke/testFixes Martin Kroeker 2024-10-10 21:53:18 +02:00
  • 7ba6591ff2 Merge branch 'OpenMathLib:develop' into issue4905 Martin Kroeker 2024-10-10 21:50:38 +02:00
  • 550bc77832 Fix expectation values for CblasRowMajor order Martin Kroeker 2024-10-10 20:39:29 +02:00
  • 48ad5bea0f deploy: e0ad20f72b martin-frbg 2024-10-10 14:18:38 +00:00
  • e0ad20f72b Merge pull request #4932 from martin-frbg/cirrusosxndk Martin Kroeker 2024-10-10 16:18:07 +02:00
  • e4bc5e4718 remove stray quote Martin Kroeker 2024-10-10 11:02:56 +02:00
  • b89fb9632f Update Android NDK install path for M1/armv7 crossbuild Martin Kroeker 2024-10-10 10:19:11 +02:00
  • e52d9b4cf1 Merge pull request #4928 from austinpagan/czgemm_in_c Martin Kroeker 2024-10-09 20:26:21 +02:00
  • dbd83762f9 Merge pull request #4926 from NickelWenzel/fix_arm64_windows_and_uwp Martin Kroeker 2024-10-09 19:48:16 +02:00
  • 9762464718 Fix CBLAS interface filling in the wrong triangle for Row-Major Martin Kroeker 2024-10-09 18:06:39 +02:00
  • 0b7fb5c791 CGEMM & ZGEMM using C code. Gordon Fossum 2024-10-09 09:42:23 -05:00
  • bee123e8e3 fix: add missing NO_AFFINITY checks NickelWenzel 2024-10-09 16:36:40 +02:00
  • 1d2af7ab1a deploy: 7ac5b9011f martin-frbg 2024-10-09 14:19:22 +00:00
  • 7ac5b9011f Merge pull request #4923 from martin-frbg/zen5 Martin Kroeker 2024-10-09 16:18:47 +02:00
  • 3ab8b1408e LoongArch64: Update README.md gxw 2024-10-08 21:08:09 +08:00
  • 2c3b87a082 Add preliminary cpu autodetection for Zen5/5c Martin Kroeker 2024-10-08 23:07:42 +02:00
  • 73c1882129 Merge pull request #4922 from martin-frbg/issue4904-2 Martin Kroeker 2024-10-07 13:24:14 +02:00
  • 522ee5f0ea deploy: bc0691a556 martin-frbg 2024-10-07 06:26:38 +00:00
  • bc0691a556 Merge pull request #4920 from martin-frbg/issue4917 Martin Kroeker 2024-10-07 08:26:03 +02:00
  • b0346e72f4 update names of loongarch64 targets for cross-compilation Martin Kroeker 2024-10-06 22:48:33 +02:00
  • 9c707dc6b9 Update dynamic arch list to new target scheme Martin Kroeker 2024-10-06 22:46:03 +02:00
  • 9783dd07ab Rename KERNEL.LOONGSONGENERIC to KERNEL.LA64_GENERIC Martin Kroeker 2024-10-06 22:43:11 +02:00
  • dda8b0427a deploy: 0dfe42d62a martin-frbg 2024-10-06 20:29:58 +00:00
  • 0dfe42d62a Merge pull request #4919 from martin-frbg/issue4916-2 Martin Kroeker 2024-10-06 22:29:28 +02:00
  • d6bb8dcfd1 Common code. Chip Kerchner 2024-10-06 14:13:43 -05:00
  • 8a1710dd0d don't apply switch_ratio to tail of loop Martin Kroeker 2024-10-06 20:03:32 +02:00
  • c9e92348a6 Handle inf/nan if dummy2 flag is set Martin Kroeker 2024-10-06 19:57:17 +02:00
  • d9f368dfe6 TST: Signal abort for ctest failures correctly Rohit Goswami 2024-07-29 03:51:21 +00:00
  • 722e4ae07a MAINT: Explicitly replace instead of unknown Rohit Goswami 2024-07-30 15:24:23 +00:00
  • a6b7751881 BUG: Allow tests to be run multiple times Rohit Goswami 2024-07-30 15:14:05 +00:00
  • 9ac0fb0111 Merge branch 'develop' into vectorizeBF16GEMV Chip Kerchner 2024-10-04 06:49:53 -05:00
  • 624e9d110e Merge pull request #4916 from martin-frbg/issue4901 Martin Kroeker 2024-10-03 23:25:45 +02:00
  • d714013ab9 change sgemm kernel to 4x4 as the 16x4 altivec goes out of bounds Martin Kroeker 2024-10-03 22:04:20 +02:00
  • 7c4f3638fd switch PPCG4 SGEMM kernel to 4x4 Martin Kroeker 2024-10-03 22:00:15 +02:00
  • 915a6d6e44 Add casting. Chip Kerchner 2024-10-03 14:08:21 -05:00