Commit Graph

  • 3f46d0c79a Merge pull request #4381 from darshanp4/issue_4323 Martin Kroeker 2023-12-19 16:53:53 +01:00
  • 60e66725e4 Use numeric labels to allow repeated inlining Chris Sidebottom 2023-12-19 13:11:06 +00:00
  • 7a4fef4f60 Tweak SVE dot kernel Chris Sidebottom 2023-12-15 12:50:48 +00:00
  • dab0da8243 Update GEMM param for NEOVERSEV1 Darshan Patel 2023-12-19 13:56:55 +05:30
  • 3b520a56a9 Merge pull request #4378 from martin-frbg/issue3871 Martin Kroeker 2023-12-15 21:58:56 +01:00
  • 563daadc92 Merge pull request #4379 from barracuda156/ppc970 Martin Kroeker 2023-12-15 20:03:44 +01:00
  • 8c143331b0 PPC970: drop -mcpu=970 which seems to produce faulty code barracuda156 2023-12-15 22:55:52 +08:00
  • d2f1594bca Merge pull request #4368 from martin-frbg/issue4073 Martin Kroeker 2023-12-15 14:49:52 +01:00
  • 544cb86300 Mention C906V instruction set limitation and update DYNAMIC_ARCH lists Martin Kroeker 2023-12-15 14:03:59 +01:00
  • 8793601e86 Merge pull request #4375 from martin-frbg/issue4352 Martin Kroeker 2023-12-15 13:35:18 +01:00
  • f06b535566 Use C kernel for dgemv_t due to limitations of the old assembly one Martin Kroeker 2023-12-15 09:58:44 +01:00
  • 293131d6b9 Merge pull request #4370 from barracuda156/unbreak_powerpc Martin Kroeker 2023-12-14 10:30:03 +01:00
  • 981e315b30 cc.cmake: use -force_cpusubtype_ALL for Darwin PPC barracuda156 2023-12-14 12:01:31 +08:00
  • d9653af018 KERNEL.PPC970, KERNEL.PPCG4: unbreak CMake parsing barracuda156 2023-12-13 19:23:50 +08:00
  • 302ca7edc7 Merge pull request #4371 from barracuda156/970 Martin Kroeker 2023-12-13 14:32:37 +01:00
  • a8d3619f65 cc.cmake: add optflags for G5 and G4 kernels barracuda156 2023-12-13 19:42:56 +08:00
  • aa46f1e4e7 revert addition of MSVC-compatible complex (moved to lapacke_config.h) Martin Kroeker 2023-12-12 23:07:48 +01:00
  • dcdc351272 Add MSVC-compatible complex types Martin Kroeker 2023-12-12 23:06:22 +01:00
  • 55a0718f72 Merge pull request #4369 from ChipKerchner/power10Copies Martin Kroeker 2023-12-12 18:49:21 +01:00
  • 93747fb377 Merge remote-tracking branch 'origin/develop' into power10Copies Chip-Kerchner 2023-12-12 09:32:49 -06:00
  • dcf6999c4e remove extraneous endif Martin Kroeker 2023-12-12 11:27:17 +01:00
  • 6bd7c54af5 introduce MT_TRACE to clean up SMP_DEBUG code Mark Seminatore 2023-12-11 15:13:04 -08:00
  • 330101e0b3 Add complex type definitions for MSVC Martin Kroeker 2023-12-11 21:52:00 +01:00
  • d9f1478068 Merge pull request #4367 from barracuda156/unbreak_powerpc Martin Kroeker 2023-12-11 21:38:32 +01:00
  • 9dbc8129b3 cpuid_power.c: add CPU_SUBTYPE_POWERPC_7400 case barracuda156 2023-12-11 21:09:06 +08:00
  • c732f275a2 system_check.cmake: fix arch detection for Darwin PowerPC barracuda156 2023-12-11 21:05:31 +08:00
  • e60fb0f397 Merge pull request #4359 from mseminatore/win_perf Martin Kroeker 2023-12-09 23:40:26 +01:00
  • efa9515a23 Merge branch 'OpenMathLib:develop' into win_perf Mark Seminatore 2023-12-09 10:09:49 -08:00
  • 4e738e561a Replace two vector loads with one vector pair load and fix endianess of stores. Chip-Kerchner 2023-12-08 12:36:08 -06:00
  • 1332f8a822 Merge pull request #4159 from OMaghiarIMG/risc-v-tail-policy Martin Kroeker 2023-12-08 10:25:41 +01:00
  • edac80d7e8 some cleanup, dynamically scale threads, add missing WIN_CASE defn Mark Seminatore 2023-12-07 14:59:27 -08:00
  • 2d316c2920 Merge pull request #4125 from OMaghiarIMG/risc-v Martin Kroeker 2023-12-07 14:50:58 +01:00
  • 5b09833b1c Merge pull request #4019 from uniontech-lilinjie/develop Martin Kroeker 2023-12-07 14:46:17 +01:00
  • 3193aa9c7e Merge pull request #4362 from yinshiyou/la-dev Martin Kroeker 2023-12-07 09:15:15 +01:00
  • d32f38fb37 loongarch64: Add optimizations for nrm2. yancheng 2023-12-07 13:15:55 +08:00
  • f9b468990e loongarch64: Add optimizations for rot. yancheng 2023-12-07 13:12:29 +08:00
  • c80e7e27d1 loongarch64: Add optimizations for sum and asum. yancheng 2023-12-07 13:08:03 +08:00
  • d4c96a35a8 loongarch64: Add optimizations for axpy and axpby. yancheng 2023-12-07 13:02:03 +08:00
  • 360acc0a41 loongarch64: Add optimizations for swap. yancheng 2023-12-07 12:57:05 +08:00
  • 174c25766b loongarch64: Add optimizations for copy. yancheng 2023-12-07 12:15:46 +08:00
  • 49829b2b7d loongarch64: Add optimizations for iamin. yancheng 2023-12-07 12:11:30 +08:00
  • be83f5e4e0 loongarch64: Add optimizations for iamax. yancheng 2023-12-07 12:07:30 +08:00
  • e3fb2b5afa loongarch64: Add optimizations for imin. yancheng 2023-12-07 12:01:05 +08:00
  • e46b48e372 loongarch64: Add optimizations for imax. yancheng 2023-12-07 11:56:41 +08:00
  • 702fc1d56d loongarch64: Add optimization for min. yancheng 2023-12-07 11:51:19 +08:00
  • 346b384d1c loongarch64: Add optimization for max. yancheng 2023-12-07 11:30:02 +08:00
  • ff2ecc6cda loongarch64: Add optimization for amin. yancheng 2023-12-07 11:08:09 +08:00
  • 265b5f2e80 loongarch64: Add optimizations for amax. yancheng 2023-12-07 10:57:13 +08:00
  • 993ede7c70 loongarch64: Add optimizations for scal. yancheng 2023-11-27 11:30:34 +08:00
  • 4ebf814b42 fix bug failing to mark task as finished. Mark Seminatore 2023-12-05 23:28:37 -08:00
  • 5f51811728 try at new threading model Mark Seminatore 2023-12-05 22:43:36 -08:00
  • a8cb611157 Merge pull request #4358 from martin-frbg/lapack954 Martin Kroeker 2023-12-05 22:20:15 +01:00
  • 589f2b6466 Fix search phrase used to count successful tests (Reference-LAPACK PR 954) Martin Kroeker 2023-12-05 20:10:20 +01:00
  • 6aa5f53e26 Merge pull request #4357 from martin-frbg/lapack953 Martin Kroeker 2023-12-05 20:03:21 +01:00
  • effb7af2a2 Fix memory leak (Reference-LAPACK PR 953) Martin Kroeker 2023-12-05 17:55:38 +01:00
  • 5915a69734 Merge pull request #4356 from martin-frbg/lapack736-2 Martin Kroeker 2023-12-05 17:48:42 +01:00
  • 226a14c549 Restore library path adjustments Martin Kroeker 2023-12-05 15:50:06 +01:00
  • c5fa318add Add tests for DMD (Reference-LAPACK PR 736) Martin Kroeker 2023-12-05 15:45:59 +01:00
  • fa03e5497a Add tests for the DMD functions (Reference-LAPACK PR 736) Martin Kroeker 2023-12-05 15:43:28 +01:00
  • a53a79e059 Add tests for the DMD functions (Reference-LAPACK PR 736) Martin Kroeker 2023-12-05 15:41:39 +01:00
  • e3039fa7f6 Merge pull request #4351 from catap/cmake-old-macos Martin Kroeker 2023-12-05 14:40:18 +01:00
  • 4a12cf53ec [RISC-V] Improve RVV kernel generator LMUL usage Octavian Maghiar 2023-12-04 11:13:35 +00:00
  • e4586e81b8 [RISC-V] Add RISC-V Vector 128-bit target Octavian Maghiar 2023-12-04 11:02:18 +00:00
  • 2381132ada Darwin < 20: always write xerbla.c.o into archive Erik Bråthen Solem 2023-12-03 19:13:53 +01:00
  • 89fa51d495 Revert 42b5e08 ("Allow weak linking on old macOS") Erik Bråthen Solem 2023-12-03 19:06:49 +01:00
  • 08fde5ebd2 Use 64bit build on CMAKE_SYSTEM_PROCESSOR=i386 on Darwin Kirill A. Korinsky 2023-11-30 21:24:58 +00:00
  • 39bf8ece20 Merge pull request #4340 from yinshiyou/la-dev Martin Kroeker 2023-11-29 08:22:25 +01:00
  • 42b5e081d8 Merge pull request #4348 from catap/macos-undefinded-dynamic-lookup Martin Kroeker 2023-11-28 22:14:53 +01:00
  • a1562e4bae Allow weak linking on old macOS Kirill A. Korinsky 2023-11-28 14:04:01 +00:00
  • c4a622db9e Merge pull request #4346 from martin-frbg/issue4343 Martin Kroeker 2023-11-28 14:01:14 +01:00
  • 9fe07d82fd loongarch: Add LSX optimization for dot. Shiyou Yin 2023-11-24 17:57:14 +08:00
  • 13b8c44b44 loongarch: Add optimization for dsdot kernel. Shiyou Yin 2023-11-24 16:40:32 +08:00
  • 3def6a8143 loongarch: Add LASX optimization for dot. Shiyou Yin 2023-11-15 17:24:33 +08:00
  • 1310a0931b loongarch: Refine build control for loongarch64. Shiyou Yin 2023-11-15 16:54:06 +08:00
  • ff92e6e707 Fix installation location of lapacke_mangling header Martin Kroeker 2023-11-28 12:53:35 +01:00
  • b7a28f5e42 Merge pull request #4344 from catap/macos-always-use-ar Martin Kroeker 2023-11-28 12:39:45 +01:00
  • 9beee55167 Enable overstep of too long args without DYNAMIC_ARCH Kirill A. Korinsky 2023-11-27 16:54:49 +00:00
  • 01c7010543 cmake/openblas.pc.in: fixed version and URL Kirill A. Korinsky 2023-11-27 14:28:08 +00:00
  • fc66ecd25a Merge pull request #4339 from martin-frbg/lapack-3-12-0 Martin Kroeker 2023-11-25 23:54:05 +01:00
  • 08be9004f8 Update version number and copyright date to Reference-LAPACK 3.12.0 Martin Kroeker 2023-11-25 18:57:17 +01:00
  • 578f0f9590 Update version number to 3.12.0 Martin Kroeker 2023-11-25 18:53:16 +01:00
  • 3d9e20f614 Update version to 3.12.0 Martin Kroeker 2023-11-25 18:51:54 +01:00
  • f7351e493c Update Reference-LAPACK docs to 3.12.0 Martin Kroeker 2023-11-25 18:49:34 +01:00
  • be8661ba40 Merge pull request #4338 from martin-frbg/lapack941 Martin Kroeker 2023-11-25 18:41:25 +01:00
  • ca5a87ff1d Small documentation fix for Truncated QR With Pivoting (Reference-LAPACK PR 941) Martin Kroeker 2023-11-25 15:31:18 +01:00
  • f745f02f35 benchmark: Fix missing colons in outputs of ./strsv.goto Shiyou Yin 2023-11-24 14:51:37 +08:00
  • 97d3c9b827 Merge pull request #4336 from martin-frbg/fix4322 Martin Kroeker 2023-11-22 22:44:21 +01:00
  • c883abf838 Revert unintentional change to linking rule from PR 4322 Martin Kroeker 2023-11-22 22:41:53 +01:00
  • 8138999cd0 Merge pull request #4333 from codeworm96/update_dynamic_core_readme Martin Kroeker 2023-11-21 13:50:37 +01:00
  • a938e48fa2 Merge pull request #4334 from RajalakshmiSR/Makefile_power Martin Kroeker 2023-11-21 10:24:25 +01:00
  • 47da601a2d POWER: Fixing Makefile error Rajalakshmi Srinivasaraghavan 2023-11-20 17:24:22 -06:00
  • 54be8f4d67 Update the list of default dynamic targets for x86_64 in the README to be consistent with the Makefile Yuning Zhang 2023-11-20 13:28:25 -08:00
  • d526c4306f Merge pull request #4329 from isuruf/sbgemm Martin Kroeker 2023-11-20 15:30:02 +01:00
  • 2ea65bacd0 Merge pull request #4330 from bartoldeman/asum-init-mask Martin Kroeker 2023-11-20 05:38:39 +01:00
  • c34e2cf380 Use _mm_set1_epi{32,64x} to init mask in x86-64 [cz]asum Bart Oldeman 2023-11-19 21:21:23 +00:00
  • 864c65b526 Merge pull request #4328 from martin-frbg/4239-3 Martin Kroeker 2023-11-19 22:06:17 +01:00
  • 6b2651ece3 Fix building test_sbgemm Isuru Fernando 2023-11-19 02:57:13 -06:00
  • 22aa401656 Temporarily disable the AVX512 CASUM/ZASUM microkernels for any version of NVIDIA HPC (#4327) Martin Kroeker 2023-11-19 00:04:31 +01:00
  • 47b03fd4b4 Copy XCode15-specific workaround to Fortran flags to fix build of tests Martin Kroeker 2023-11-18 23:45:02 +01:00
  • df4cd7e82c Merge pull request #4326 from bartoldeman/fix-casum-backup-kernel Martin Kroeker 2023-11-18 19:06:06 +01:00