Commit Graph

  • 326d394a0f Add get_num_procs implementation for AIX Martin Kroeker 2018-10-31 18:38:22 +01:00
  • 6af8e35a24 Merge pull request #1837 from embray/set-num-thread-after-fork Martin Kroeker 2018-10-30 12:41:24 +01:00
  • 38cf5d9364 ensure that threading has been initialized in the first place before calling openblas_set_num_threads Erik M. Bray 2018-10-28 21:16:52 +00:00
  • 8a43baacb2 Merge pull request #1836 from martin-frbg/zen2core Martin Kroeker 2018-10-28 20:00:01 +01:00
  • 64ca44873b Fix detection of Ryzen2 (missing CORE_ZEN) Martin Kroeker 2018-10-28 18:36:55 +01:00
  • 2d8064174c register push/pop command change fengrl 2018-10-26 17:55:15 +08:00
  • 76a66eaac8 Merge pull request #1829 from ashwinyes/develop_aarch64_dynamic_arch_support Martin Kroeker 2018-10-23 18:14:28 +02:00
  • 2992e3886a disable threading in C/ZSWAP copying from S/DSWAP Andrew 2018-10-22 23:21:49 +03:00
  • d5aeff636f ARM64: Enable DYNAMIC_ARCH Ashwin Sekhar T K 2018-10-18 05:15:45 -07:00
  • af2837c392 ARM64: Remove #define ARMV8 for THUNDERX Ashwin Sekhar T K 2018-10-22 01:49:16 -07:00
  • e7b66cd36e ARM64: Fix DYNAMIC_ARCH compilation for cores which dont use GEMM3M Ashwin Sekhar T K 2018-10-18 05:13:02 -07:00
  • d50abc8903 ARM64: Move parameters from parameter.c to param.h Ashwin Sekhar T K 2018-10-18 05:02:23 -07:00
  • 351a0c777c ARM64: Remove XGENE1 references Ashwin Sekhar T K 2018-10-18 04:51:24 -07:00
  • e3c262e5cf Merge pull request #1825 from brada4/hemv Martin Kroeker 2018-10-21 20:34:05 +02:00
  • a293bdcd5e re-arrange new code for readability Andrew 2018-10-20 21:37:53 +03:00
  • c7bbf9c987 Attempt to tame _hemv threading #1820 Andrew 2018-10-20 11:13:29 +03:00
  • 898a8dcaba init Andrew 2018-10-20 10:55:04 +03:00
  • 71c6deed60 Merge pull request #1821 from ashwinyes/develop_aarch64_armv8neonkernels Martin Kroeker 2018-10-18 08:13:05 +02:00
  • 21f46a1cf2 ARM64: Use THUNDERX2T99 Neon Kernels for ARMV8 Ashwin Sekhar T K 2018-10-17 08:11:27 -07:00
  • caf339412f ARM64: Remove dependency of THUNDERX2T99 Makefile on CORTEXA57 Makefile Ashwin Sekhar T K 2018-10-17 08:02:40 -07:00
  • 8001fdcd2a ARM64: Remove dependency of THUNDERX Makefile on ARMV8 Makefile Ashwin Sekhar T K 2018-10-17 08:02:16 -07:00
  • 162e312832 ARM64: Remove dependency of CORTEXA57 Makefile on ARMV8 Makefile Ashwin Sekhar T K 2018-10-17 08:01:45 -07:00
  • c3d93caa8d ARM64: Remove dependency of XGENE1 Makefile on ARMV8 Makefile Ashwin Sekhar T K 2018-10-17 08:01:27 -07:00
  • a71923514f Merge pull request #1815 from fenrus75/sgemm_beta_fix Martin Kroeker 2018-10-14 19:57:34 +02:00
  • 55b244ca0d enable the SGEMM/SKX C based kernel Arjan van de Ven 2018-10-12 09:30:35 +00:00
  • 2263d3906c Merge pull request #1812 from martin-frbg/issue1806-2 Martin Kroeker 2018-10-11 21:51:31 +02:00
  • 81c9985c3a Use KERNEL_DEFINITIONS rather than COMMON_OPTS to pass -march=skylake-avx512 Martin Kroeker 2018-10-11 11:03:27 +02:00
  • 56ebc7b53e Merge pull request #1808 from martin-frbg/issue1806 Martin Kroeker 2018-10-11 07:48:08 +02:00
  • c5f88f5a57 Merge pull request #1807 from xianyi/revert-1798-cmake-avx512 Martin Kroeker 2018-10-11 07:47:53 +02:00
  • 8a11ec19d1 Syntax fix Martin Kroeker 2018-10-10 23:47:35 +02:00
  • fa53b903db Add -march=skylake-avx512 to CFLAGS when the target is Skylake Martin Kroeker 2018-10-10 19:22:01 +02:00
  • 84bcdf9c66 Revert "Add -march=skylake-avx512 when required" (branch: revert-1798-cmake-avx512) Martin Kroeker 2018-10-10 19:15:32 +02:00
  • 8f7e986184 Merge pull request #1802 from martin-frbg/issue1801 Martin Kroeker 2018-10-10 08:52:53 +02:00
  • d0e83666ad Merge pull request #1804 from fenrus75/sgemm Martin Kroeker 2018-10-10 08:50:44 +02:00
  • d4bad73834 Add a C+intrinsics version of the SGEMM/skylakex kernel Arjan van de Ven 2018-10-10 01:49:22 +00:00
  • 065763adde Merge pull request #1800 from fengrl/patch-1 Martin Kroeker 2018-10-09 10:56:37 +02:00
  • 210b03b543 Merge pull request #1792 from martin-frbg/cmakesuffix Martin Kroeker 2018-10-09 10:34:52 +02:00
  • 6234a32656 Use cygwin compilation workaround for avx512 on msys2/mingw64 as well Martin Kroeker 2018-10-09 10:31:59 +02:00
  • c0d7cd3dac Merge pull request #1799 from martin-frbg/issue1796 Martin Kroeker 2018-10-09 08:20:52 +02:00
  • 667f0cc1cb Merge pull request #1793 from fenrus75/ncopy Martin Kroeker 2018-10-09 08:19:14 +02:00
  • d4c8853a02 Update common_mips64.h fengrl 2018-10-09 11:20:16 +08:00
  • d3d58f8ee5 Catch conflicting usage of ARCH in at least some BSD environments Martin Kroeker 2018-10-08 22:29:35 +02:00
  • 697dc1baf8 Use override for ARCH in make.inc Martin Kroeker 2018-10-08 22:26:59 +02:00
  • a9b51b8448 Merge pull request #1798 from martin-frbg/cmake-avx512 Martin Kroeker 2018-10-08 21:15:17 +02:00
  • eba394c711 Add -march=skylake-avx512 when required Martin Kroeker 2018-10-08 19:18:12 +02:00
  • 582c589727 dgemm/skylakex: replace discrete mul/add with fma Arjan van de Ven 2018-10-06 23:13:26 +00:00
  • adbf6afa25 Add vector optimizations for ncopy as well for dgemm/skylakex Arjan van de Ven 2018-10-06 21:18:12 +00:00
  • 32bec8afbb add a skylakex optimized dgemm beta function Arjan van de Ven 2018-10-06 16:36:26 +00:00
  • 6e2c494556 Merge pull request #1791 from dev-zero/develop Martin Kroeker 2018-10-06 16:29:29 +02:00
  • 20c5d668fe dgemm/avx512 simplify and speed up the 4x4 kernel Arjan van de Ven 2018-10-06 14:12:32 +00:00
  • 6d43c51ccf undo slow dgemm/skylake microoptimization Arjan van de Ven 2018-10-06 14:00:37 +00:00
  • d74dc39b0f Add optimized *copy versions for skylakex Arjan van de Ven 2018-10-06 13:47:20 +00:00
  • 41951da6d4 Merge pull request #6 from xianyi/develop Martin Kroeker 2018-10-06 14:36:36 +02:00
  • 474f7e9583 Add SYMBOLPREFIX and -SUFFIX options and improve help output Martin Kroeker 2018-10-06 14:28:04 +02:00
  • 79ea839b63 fix parallel build issues with APFS/HFS+/ext2/3 in netlib-lapack Tiziano Müller 2018-10-06 14:10:02 +02:00
  • f7f97c6148 Merge pull request #1789 from brada4/develop Martin Kroeker 2018-10-05 20:42:37 +02:00
  • 6f22e1cfb8 Merge pull request #1788 from fenrus75/avx512-8x16 Martin Kroeker 2018-10-05 20:40:38 +02:00
  • 66b43affbc Add a 24x8 kernel to the skylakex dgemm implementation Arjan van de Ven 2018-10-05 13:22:21 +00:00
  • 1938819c25 skylake dgemm: Add a 16x8 kernel Arjan van de Ven 2018-10-05 11:49:43 +00:00
  • bda3dbe2eb update travis alpine chroot with avx512 intrinsics headers Andrew 2018-10-05 15:47:55 +03:00
  • c3e0f0eb38 update travis alpine chroot with avx512 intrinsics headers Andrew 2018-10-05 15:41:52 +03:00
  • a980953bd7 Merge pull request #1785 from brada4/develop Martin Kroeker 2018-10-05 08:25:38 +02:00
  • 78c99d5231 Merge pull request #1784 from fenrus75/dgemm-avx512 Martin Kroeker 2018-10-05 08:03:27 +02:00
  • b7496c3638 Function name needs to be CNAME, set from outside to allow suffixing for dynamic_arch Martin Kroeker 2018-10-04 19:14:59 +02:00
  • 95f4e87579 Merge pull request #1787 from jeromerobert/develop Martin Kroeker 2018-10-04 18:41:47 +02:00
  • b095f2fad6 Fix unknown type name __WAIT_STATUS on RHEL5 Jerome Robert 2018-10-04 12:27:44 +02:00
  • 02ef20a1e4 Merge pull request #1786 from martin-frbg/immintrin Martin Kroeker 2018-10-04 09:07:09 +02:00
  • 4c3643ed7f Check availability of immintrin.h in the AVX512 compatibility test Martin Kroeker 2018-10-04 07:36:49 +02:00
  • 591cca7cb0 Check availability of immintrin.h in the AVX512 compatibility test Martin Kroeker 2018-10-04 07:35:30 +02:00
  • 3439158dea address #1782 2nd loop Andrew 2018-10-03 21:20:50 +02:00
  • 45fe8cb0c5 Create a AVX512 enabled version of DGEMM Arjan van de Ven 2018-10-03 14:45:25 +00:00
  • 544b069e85 Merge pull request #1780 from martin-frbg/issue1774-2 Martin Kroeker 2018-09-29 09:27:47 +02:00
  • 9b2a7ad40d Convert fldmia/fstmia instructions to UAL syntax for clang7 Martin Kroeker 2018-09-28 23:05:15 +02:00
  • 10ce70701a Merge pull request #1778 from fengrl/develop Martin Kroeker 2018-09-26 11:14:58 +02:00
  • 6fc85a6359 test_axpy work error on LOONGSON3A platform #1777 fengruilin 2018-09-26 15:14:04 +08:00
  • 831c661386 Merge pull request #1775 from martin-frbg/issue1774 Martin Kroeker 2018-09-25 18:58:39 +02:00
  • 7e5df34e6a Convert fldmia/fstmia instructions to UAL syntax for clang7 Martin Kroeker 2018-09-25 09:41:58 +02:00
  • 4f45040b89 Merge pull request #1773 from martin-frbg/issue1767 Martin Kroeker 2018-09-23 23:25:15 +02:00
  • 28aa94bf4b Include thread numbers in failure message from blas_thread_init Martin Kroeker 2018-09-22 14:00:15 +02:00
  • 56e7c68810 Merge pull request #1771 from staticfloat/sf/ldflags Martin Kroeker 2018-09-22 13:11:39 +02:00
  • cf6df9464c Document the stub status of the QUAD_PRECiSION code (#1772) Martin Kroeker 2018-09-22 12:31:37 +02:00
  • 6f77af2eef Add $(LDFLAGS) to $(CC) and $(FC) invocations within exports/Makefile Elliot Saba 2018-09-21 09:19:51 +00:00
  • 4d183e5567 Merge pull request #1765 from martin-frbg/issue1761 Martin Kroeker 2018-09-19 22:02:21 +02:00
  • 34d55fd165 Merge pull request #1764 from yurivict/64-suffix Martin Kroeker 2018-09-19 18:16:38 +02:00
  • b991570210 Merge pull request #1762 from martin-frbg/issue1710-2 Martin Kroeker 2018-09-19 18:16:21 +02:00
  • 288aeea8a2 Fix default settings - USE_TLS and USE_SIMPLE_THREADED_LEVEL3 should both be off Martin Kroeker 2018-09-19 18:08:31 +02:00
  • 1ad1e79062 Catch inadvertent USE_TLS=0 declaration Martin Kroeker 2018-09-19 18:03:43 +02:00
  • b402626509 Do not use the new TLS code for non-threaded builds even if USE_TLS is set Martin Kroeker 2018-09-16 12:43:36 +02:00
  • ec0cac1669 Merge pull request #4 from xianyi/develop Martin Kroeker 2018-09-16 12:36:49 +02:00
  • 2349e15149 Allow to install the 'interfare64' version concurrently with the regular version Yuri 2018-09-15 19:59:17 -07:00
  • f3c262156e Add an explicit cast to silence a warning Martin Kroeker 2018-09-13 14:24:29 +02:00
  • 30f5a69ab8 Add explicit cast to silence a warning Martin Kroeker 2018-09-13 14:23:31 +02:00
  • fd081a91e4 Merge pull request #1759 from martin-frbg/lapack283 Martin Kroeker 2018-09-11 13:52:09 +02:00
  • 094f8c3b57 remove unused variable ldb_t Martin Kroeker 2018-09-11 10:53:47 +02:00
  • 5cf090f516 remove unused variable ldb_t Martin Kroeker 2018-09-11 10:52:30 +02:00
  • 58363542e7 remove unused variable ldb_t Martin Kroeker 2018-09-11 10:51:17 +02:00
  • 3abc22a5bf Merge pull request #1757 from brada4/develop Martin Kroeker 2018-09-09 22:55:15 +02:00
  • 1e531701b7 fix small typo Andrew 2018-09-09 16:52:25 +02:00
  • 5d42b6ea04 Merge pull request #1756 from martin-frbg/issue1754 Martin Kroeker 2018-09-07 11:02:18 +02:00
  • ba4f433321 Merge pull request #1749 from martin-frbg/issue1531 Martin Kroeker 2018-09-07 11:02:01 +02:00