Commit Graph

  • 5f8f0583d4 Merge branch 'develop' into fc-1847 Martin Kroeker 2018-11-07 08:47:52 +0100
  • 974a6a30f2 Merge pull request #1858 from brada4/buff-1847 Martin Kroeker 2018-11-07 08:46:55 +0100
  • 9531d0e175 lets fit it in one 4k page Andrew 2018-11-06 17:51:24 +0000
  • 40cce0e353 handle cmake too Andrew 2018-11-06 09:45:49 +0000
  • 3fd41313fc add low bound for number of buffers Andrew 2018-11-06 09:40:13 +0000
  • a931afe269 init Andrew 2018-11-06 09:39:05 +0000
  • 7d3502b500 Add -frecursive gfortran option by default Andrew 2018-11-06 08:20:55 +0000
  • 066f8065d1 init Andrew 2018-11-06 08:19:08 +0000
  • fb5b2177ca [Arm64) Revert A53 detection as A57 Renato Golin 2018-11-05 11:30:12 +0000
  • 96942c6c75 Revert change from #532 due to unsafe use of static buffer Martin Kroeker 2018-11-03 16:25:40 +0100
  • f1c02273cb Merge pull request #1846 from fenrus75/threadsize Martin Kroeker 2018-11-02 13:18:01 +0100
  • 661035477c Merge pull request #1850 from martin-frbg/issue1811 Martin Kroeker 2018-11-02 09:50:51 +0100
  • aa7e47aa0a Merge pull request #1849 from martin-frbg/aix_install2 Martin Kroeker 2018-11-01 20:39:16 +0100
  • 9c177d270b Restore Android/ARMv7 build fix from #778 Martin Kroeker 2018-11-01 18:50:25 +0100
  • b025523197 Use installbsd on AIX Martin Kroeker 2018-11-01 18:26:08 +0100
  • 5b50bd36f7 Merge pull request #1845 from martin-frbg/aix_install Martin Kroeker 2018-11-01 09:53:10 +0100
  • 5b708e5eb1 sgemm/dgemm: add a way for an arch kernel to specify prefered sizes Arjan van de Ven 2018-11-01 01:43:20 +0000
  • dcc5d6291e skylakex: Make the sgemm/dgemm beta code robust for a N=0 or M=0 case Arjan van de Ven 2018-11-01 01:42:09 +0000
  • 7b5aea52bb Accomodate AIX install, which has different syntax Martin Kroeker 2018-10-31 21:50:34 +0100
  • f5595d0262 Merge pull request #1843 from martin-frbg/aix_numprocs Martin Kroeker 2018-10-31 21:25:15 +0100
  • 326d394a0f Add get_num_procs implementation for AIX Martin Kroeker 2018-10-31 18:38:22 +0100
  • e0c6b4df93 also remove varied sys/time.h reincarnations where EPOCH consts are ifn-re-defined later Andrew 2018-10-30 13:21:47 +0000
  • 6af8e35a24 Merge pull request #1837 from embray/set-num-thread-after-fork Martin Kroeker 2018-10-30 12:41:24 +0100
  • bffcbaca4c clean includes duplicating #include "common.h" Andrew 2018-10-30 11:25:46 +0000
  • d5b3b96cbf init Andrew 2018-10-30 11:17:38 +0000
  • 38cf5d9364 ensure that threading has been initialized in the first place before calling openblas_set_num_threads Erik M. Bray 2018-10-28 21:16:52 +0000
  • 8a43baacb2 Merge pull request #1836 from martin-frbg/zen2core Martin Kroeker 2018-10-28 20:00:01 +0100
  • 64ca44873b Fix detection of Ryzen2 (missing CORE_ZEN) Martin Kroeker 2018-10-28 18:36:55 +0100
  • 2d8064174c register push/pop command change fengrl 2018-10-26 17:55:15 +0800
  • ea252c711b Merge branch 'develop' of https://github.com/fengrl/OpenBLAS into develop fengruilin 2018-10-26 17:10:12 +0800
  • 3754c5f012 roll back fengruilin 2018-10-26 17:07:35 +0800
  • 90b51b70e0 loongson can use blas_lock fengruilin 2018-10-26 16:51:42 +0800
  • 34617742ec register should be push/pull with sdc1/ldc1 on mips64 fengruilin 2018-10-26 16:43:07 +0800
  • 76a66eaac8 Merge pull request #1829 from ashwinyes/develop_aarch64_dynamic_arch_support Martin Kroeker 2018-10-23 18:14:28 +0200
  • 2992e3886a disable threading in C/ZSWAP copying from S/DSWAP Andrew 2018-10-22 23:21:49 +0300
  • d5aeff636f ARM64: Enable DYNAMIC_ARCH Ashwin Sekhar T K 2018-10-18 05:15:45 -0700
  • af2837c392 ARM64: Remove #define ARMV8 for THUNDERX Ashwin Sekhar T K 2018-10-22 01:49:16 -0700
  • e7b66cd36e ARM64: Fix DYNAMIC_ARCH compilation for cores which dont use GEMM3M Ashwin Sekhar T K 2018-10-18 05:13:02 -0700
  • d50abc8903 ARM64: Move parameters from parameter.c to param.h Ashwin Sekhar T K 2018-10-18 05:02:23 -0700
  • 351a0c777c ARM64: Remove XGENE1 references Ashwin Sekhar T K 2018-10-18 04:51:24 -0700
  • 2cba58f7eb register push to stack modify, a bug fengruilin 2018-10-22 10:48:09 +0800
  • e3c262e5cf Merge pull request #1825 from brada4/hemv Martin Kroeker 2018-10-21 20:34:05 +0200
  • f74805609e Point out possible parallelism issues TiborGY 2018-10-21 13:36:47 +0200
  • a293bdcd5e re-arrange new code for readability Andrew 2018-10-20 21:37:53 +0300
  • c7bbf9c987 Attempt to tame _hemv threading #1820 Andrew 2018-10-20 11:13:29 +0300
  • 898a8dcaba init Andrew 2018-10-20 10:55:04 +0300
  • 71c6deed60 Merge pull request #1821 from ashwinyes/develop_aarch64_armv8neonkernels Martin Kroeker 2018-10-18 08:13:05 +0200
  • 21f46a1cf2 ARM64: Use THUNDERX2T99 Neon Kernels for ARMV8 Ashwin Sekhar T K 2018-10-17 08:11:27 -0700
  • caf339412f ARM64: Remove dependency of THUNDERX2T99 Makefile on CORTEXA57 Makefile Ashwin Sekhar T K 2018-10-17 08:02:40 -0700
  • 8001fdcd2a ARM64: Remove dependency of THUNDERX Makefile on ARMV8 Makefile Ashwin Sekhar T K 2018-10-17 08:02:16 -0700
  • 162e312832 ARM64: Remove dependency of CORTEXA57 Makefile on ARMV8 Makefile Ashwin Sekhar T K 2018-10-17 08:01:45 -0700
  • c3d93caa8d ARM64: Remove dependency of XGENE1 Makefile on ARMV8 Makefile Ashwin Sekhar T K 2018-10-17 08:01:27 -0700
  • a71923514f Merge pull request #1815 from fenrus75/sgemm_beta_fix Martin Kroeker 2018-10-14 19:57:34 +0200
  • 973ac24d27 remove dead assignments Andrew 2018-10-14 00:26:03 +0300
  • b9e504b5f8 clean last unused variable warning in C LAPACK Andrew 2018-10-12 19:13:00 +0300
  • d631ada332 init Andrew 2018-10-12 19:01:40 +0300
  • 55b244ca0d enable the SGEMM/SKX C based kernel Arjan van de Ven 2018-10-12 09:30:35 +0000
  • abd5c6c85f init Andrew 2018-10-12 00:15:04 +0300
  • d4afd59fb1 remove surplus locking code , only enabled w x86, disabled or never enabled on all others Andrew 2018-10-11 23:29:34 +0300
  • d4bfd11db7 init Andrew 2018-10-11 23:20:30 +0300
  • acc9953137 permit perl in PATH different from /usr/bin/perl Andrew 2018-10-11 23:12:34 +0300
  • 9b94960cc1 init Andrew 2018-10-11 23:11:15 +0300
  • da3566292a Merge 65dcb85450 into 2263d3906c Andrew 2018-10-11 19:52:39 +0000
  • 2263d3906c Merge pull request #1812 from martin-frbg/issue1806-2 Martin Kroeker 2018-10-11 21:51:31 +0200
  • 81c9985c3a Use KERNEL_DEFINITIONS rather than COMMON_OPTS to pass -march=skylake-avx512 Martin Kroeker 2018-10-11 11:03:27 +0200
  • 56ebc7b53e Merge pull request #1808 from martin-frbg/issue1806 Martin Kroeker 2018-10-11 07:48:08 +0200
  • c5f88f5a57 Merge pull request #1807 from xianyi/revert-1798-cmake-avx512 Martin Kroeker 2018-10-11 07:47:53 +0200
  • 8a11ec19d1 Syntax fix Martin Kroeker 2018-10-10 23:47:35 +0200
  • 65dcb85450 remove #1782 'first loop' Andrew 2018-10-10 21:00:38 +0300
  • fa53b903db Add -march=skylake-avx512 to CFLAGS when the target is Skylake Martin Kroeker 2018-10-10 19:22:01 +0200
  • 84bcdf9c66 Revert "Add -march=skylake-avx512 when required" revert-1798-cmake-avx512 Martin Kroeker 2018-10-10 19:15:32 +0200
  • 8f7e986184 Merge pull request #1802 from martin-frbg/issue1801 Martin Kroeker 2018-10-10 08:52:53 +0200
  • d0e83666ad Merge pull request #1804 from fenrus75/sgemm Martin Kroeker 2018-10-10 08:50:44 +0200
  • d4bad73834 Add a C+intrinsics version of the SGEMM/skylakex kernel Arjan van de Ven 2018-10-10 01:49:22 +0000
  • 065763adde Merge pull request #1800 from fengrl/patch-1 Martin Kroeker 2018-10-09 10:56:37 +0200
  • 210b03b543 Merge pull request #1792 from martin-frbg/cmakesuffix Martin Kroeker 2018-10-09 10:34:52 +0200
  • 6234a32656 Use cygwin compilation workaround for avx512 on msys2/mingw64 as well Martin Kroeker 2018-10-09 10:31:59 +0200
  • c0d7cd3dac Merge pull request #1799 from martin-frbg/issue1796 Martin Kroeker 2018-10-09 08:20:52 +0200
  • 667f0cc1cb Merge pull request #1793 from fenrus75/ncopy Martin Kroeker 2018-10-09 08:19:14 +0200
  • d4c8853a02 Update common_mips64.h fengrl 2018-10-09 11:20:16 +0800
  • d3d58f8ee5 Catch conflicting usage of ARCH in at least some BSD environments Martin Kroeker 2018-10-08 22:29:35 +0200
  • 697dc1baf8 Use override for ARCH in make.inc Martin Kroeker 2018-10-08 22:26:59 +0200
  • a9b51b8448 Merge pull request #1798 from martin-frbg/cmake-avx512 Martin Kroeker 2018-10-08 21:15:17 +0200
  • eba394c711 Add -march=skylake-avx512 when required Martin Kroeker 2018-10-08 19:18:12 +0200
  • 6b0c7c6d06 optimize thread lock on mips64 fengruilin 2018-10-08 16:06:43 +0800
  • 582c589727 dgemm/skylakex: replace discrete mul/add with fma Arjan van de Ven 2018-10-06 23:13:26 +0000
  • adbf6afa25 Add vector optimizations for ncopy as well for dgemm/skylakex Arjan van de Ven 2018-10-06 21:18:12 +0000
  • 32bec8afbb add a skylakex optimized dgemm beta function Arjan van de Ven 2018-10-06 16:36:26 +0000
  • 6e2c494556 Merge pull request #1791 from dev-zero/develop Martin Kroeker 2018-10-06 16:29:29 +0200
  • 20c5d668fe dgemm/avx512 simplify and speed up the 4x4 kernel Arjan van de Ven 2018-10-06 14:12:32 +0000
  • 6d43c51ccf undo slow dgemm/skylake microoptimization Arjan van de Ven 2018-10-06 14:00:37 +0000
  • d74dc39b0f Add optimized *copy versions for skylakex Arjan van de Ven 2018-10-06 13:47:20 +0000
  • 41951da6d4 Merge pull request #6 from xianyi/develop Martin Kroeker 2018-10-06 14:36:36 +0200
  • 474f7e9583 Add SYMBOLPREFIX and -SUFFIX options and improve help output Martin Kroeker 2018-10-06 14:28:04 +0200
  • 79ea839b63 fix parallel build issues with APFS/HFS+/ext2/3 in netlib-lapack Tiziano Müller 2018-10-06 14:10:02 +0200
  • f7f97c6148 Merge pull request #1789 from brada4/develop Martin Kroeker 2018-10-05 20:42:37 +0200
  • 6f22e1cfb8 Merge pull request #1788 from fenrus75/avx512-8x16 Martin Kroeker 2018-10-05 20:40:38 +0200
  • 66b43affbc Add a 24x8 kernel to the skylakex dgemm implementation Arjan van de Ven 2018-10-05 13:22:21 +0000
  • 1938819c25 skylake dgemm: Add a 16x8 kernel Arjan van de Ven 2018-10-05 11:49:43 +0000
  • bda3dbe2eb update travis alpine chroot with avx512 intrinsics headers Andrew 2018-10-05 15:47:55 +0300