Commit Graph

  • cacacc8007 Add an AVX512 enabled DSCAL function Arjan van de Ven 2018-08-11 17:14:57 +0000
  • 1a00ef3d27 Merge pull request #1725 from fenrus75/axpy Martin Kroeker 2018-08-11 11:01:20 +0200
  • 4c0d832ec3 Merge pull request #1724 from fenrus75/sdot Martin Kroeker 2018-08-11 11:00:56 +0200
  • fc33cbc7bb Merge pull request #1728 from martin-frbg/changelog Martin Kroeker 2018-08-10 13:24:36 +0200
  • c52a831ae4 Add changes from the 0.3.x releases Martin Kroeker 2018-08-10 13:23:47 +0200
  • a72952f6e5 Allow overriding USE_COMPILER_TLS (formerly HAS_COMPILER_TLS). Craig Donner 2018-08-10 09:10:29 +0100
  • 2e99873ff7 Add a AVX512 enabled SAXPY/DAXPY functions Arjan van de Ven 2018-08-10 02:58:32 +0000
  • 00abaa865b Add an AVX512 enabled SDOT function Arjan van de Ven 2018-08-10 02:31:48 +0000
  • 33043f563f Disable scal to benchmark zgemv separately by default maamountki 2018-08-10 01:54:18 +0300
  • b30b82ce46 Merge b1cc69e7a8 into 66da7677bd Arjan van de Ven 2018-08-09 14:30:05 +0000
  • 9b56e815e5 Merge 732abce9f1 into 66da7677bd Arjan van de Ven 2018-08-09 14:28:34 +0000
  • 66da7677bd Merge pull request #1721 from fenrus75/ddot2 Martin Kroeker 2018-08-09 15:39:06 +0200
  • 7932ff3ea9 Add an AVX512 enabled DDOT function Arjan van de Ven 2018-08-08 02:59:11 +0000
  • 732abce9f1 Use intrinsics instead of inline asm Arjan van de Ven 2018-08-05 14:45:54 +0000
  • 4fb9f3b7a5 use named arguments in the inline asm Arjan van de Ven 2018-08-05 14:22:38 +0000
  • 62f4c69708 Merge pull request #1717 from martin-frbg/issue1708 Martin Kroeker 2018-08-06 22:05:47 +0200
  • 453bfa7e71 [ZARCH] Restore detect() function maamountki 2018-08-06 20:03:49 +0300
  • 23229011db [ZARCH] Z14 support, BLAS 1/2 single precision implementations, Some missing double precision implementations, Gemv optimization maamountki 2018-08-06 18:20:40 +0300
  • 73478664d4 Add workaround for avx512 compilations on Cygwin Martin Kroeker 2018-08-06 16:40:32 +0200
  • ee955757f9 Merge pull request #1715 from stevengj/patch-1 Martin Kroeker 2018-08-05 22:48:44 +0200
  • b1cc69e7a8 Convert dscal_haswell to intrinsics and add AVX512 support Arjan van de Ven 2018-08-05 19:19:49 +0000
  • 93aa18b1a8 daxpy_haswell: Change to C+instrinsics + AVX512 to mimic the change to saxpy_haswell Arjan van de Ven 2018-08-05 18:29:34 +0000
  • 7af8a5445d saxpy_haswell: Go to a more compact intrinsics notation Arjan van de Ven 2018-08-05 18:28:47 +0000
  • 850b73dbb9 saxpy_haswell: Add AVX512 support Arjan van de Ven 2018-08-05 17:50:16 +0000
  • 06ea72f5a5 write saxpy_haswell kernel using C intrinsics and don't disallow inlining Arjan van de Ven 2018-08-05 17:43:40 +0000
  • d86604687f saxpy_haswell: Use named arguments in inline asm Arjan van de Ven 2018-08-05 17:16:14 +0000
  • ef30a7239c sdot_haswell: similar to ddot: turn into intrinsics based C code that supports AVX512 Arjan van de Ven 2018-08-05 16:38:19 +0000
  • 21c6220d63 fix typo in dsymv avx512 code path Arjan van de Ven 2018-08-05 15:16:48 +0000
  • 34d63df4b3 Add AVX512 support to DDOT Arjan van de Ven 2018-08-05 15:16:20 +0000
  • ae38fa55c3 Use intrinsics instead of inline asm Arjan van de Ven 2018-08-05 14:45:54 +0000
  • 847bbd6f4c use named arguments in the inline asm Arjan van de Ven 2018-08-05 14:22:38 +0000
  • 48610a4524 fix blasabs for windows Steven G. Johnson 2018-08-05 08:18:51 -0400
  • 9c29524f50 various code cleanups and comments Arjan van de Ven 2018-08-05 02:44:40 +0000
  • f2810beafb Add AVX512 support to dsymv_L_microk_haswell-2.c Arjan van de Ven 2018-08-04 23:56:06 +0000
  • c202e06297 Write dsymv_kernel_4x4 for Haswell using intrinsics Arjan van de Ven 2018-08-04 23:35:36 +0000
  • 4a553e8678 Merge pull request #1713 from martin-frbg/issue1710 Martin Kroeker 2018-08-04 23:51:31 +0200
  • e788102c10 Merge pull request #1709 from stevengj/patch-1 Martin Kroeker 2018-08-04 23:51:10 +0200
  • 0faba28adb dsymv_L haswell: use symbol names for inline asm Arjan van de Ven 2018-08-04 21:25:53 +0000
  • df31ec064e Add AVX512 support to the dgemv_n_microk_haswell-4.c kernel Arjan van de Ven 2018-08-04 20:48:59 +0000
  • 165f00c159 fabs -> fabsl Martin Kroeker 2018-08-04 20:14:51 +0200
  • 40c068a875 Introduce blasabs() to switch between abs() and labs() for INTERFACE64 Martin Kroeker 2018-08-04 20:07:59 +0200
  • 933896a1d0 Use blasabs to switch between abs and labs as needed for INTERFACE64 Martin Kroeker 2018-08-04 20:06:49 +0200
  • e52d01cfe7 Also make the kernel_4x2 use intrinsics for readability and consistency Arjan van de Ven 2018-08-04 17:53:55 +0000
  • 4a8ae8b8aa replace the hasell dgemv_kernel_4x4 kernel with a the same code written in intrinsics Arjan van de Ven 2018-08-04 17:25:54 +0000
  • 350531e76a dgemv_n_microk_haswell: Use symbolic names for asm inputs to make the code more readable Arjan van de Ven 2018-08-04 14:44:04 +0000
  • a4e321400b fabs -> fabsl Steven G. Johnson 2018-08-03 13:00:10 -0400
  • 9e65430504 Merge pull request #1703 from wsttiger/cmake_fix Martin Kroeker 2018-08-02 23:48:42 +0200
  • 2cfa86b406 Merge pull request #1707 from extrowerk/haiku_support Martin Kroeker 2018-08-02 22:27:00 +0200
  • 2a9a9389ef Added target_include_directories() Scott Thornton 2018-08-02 14:58:52 -0500
  • 6463bffd59 Haiku supporting patches Zoltán Mizsei 2018-08-02 20:49:14 +0200
  • 8ef7d4fb54 Merge pull request #1706 from oon3m0oo/develop Martin Kroeker 2018-08-02 18:53:34 +0200
  • 6400868e55 Fix #1705 where we incorrectly calculate page locations. Craig Donner 2018-08-02 16:21:19 +0100
  • 8ebf541e97 Set EXPORT_NAME to match OpenBLASConfig.cmake Scott Thornton 2018-07-30 15:18:29 -0500
  • b03ae3f4dc Set version to 0.3.3.dev Martin Kroeker 2018-07-30 08:23:13 +0200
  • 2cc8fb0ad2 Set version to 0.3.3.dev Martin Kroeker 2018-07-30 08:22:38 +0200
  • e8a68ef261 Merge pull request #1702 from xianyi/develop v0.3.2 Martin Kroeker 2018-07-30 07:25:01 +0200
  • 64826a0d7d Merge branch 'release-0.3.0' into develop Martin Kroeker 2018-07-29 22:37:09 +0200
  • 25f2d25cfe Merge pull request #1697 from martin-frbg/issue1696 Martin Kroeker 2018-07-25 19:55:29 +0200
  • 73131fa30a Do not treat WIndows UWB builds as cross-compiling Martin Kroeker 2018-07-24 17:46:33 +0200
  • 66fcdd5be8 Merge pull request #1695 from martin-frbg/issue1692 Martin Kroeker 2018-07-22 16:34:09 +0200
  • 43ac839c16 Unset memory table entry, not just the temporary pointer to it on shutdown Martin Kroeker 2018-07-22 09:19:19 +0200
  • 7ba5936ecd Merge pull request #1688 from martin-frbg/issue1673 Martin Kroeker 2018-07-19 19:03:45 +0200
  • b14f44d2ad Temporarily disable special handling of OPENMP thread memory allocation Martin Kroeker 2018-07-19 08:57:56 +0200
  • e71d70ba87 Merge pull request #1681 from martin-frbg/issue1671 Martin Kroeker 2018-07-16 22:47:05 +0200
  • d671870f5f Merge pull request #1684 from martin-frbg/issue1672 Martin Kroeker 2018-07-16 22:46:49 +0200
  • 4e103c822c typo fix Martin Kroeker 2018-07-16 12:56:39 +0200
  • d2142760e0 Fix precision problem in DSDOT Martin Kroeker 2018-07-15 17:11:40 +0200
  • 2fbfc64da8 Use C kernels for default c/zAXPY, xROT, c/zSWAP Martin Kroeker 2018-07-15 17:09:55 +0200
  • 5e937b6022 Merge a0bd542648 into 36aea5ce2d Martin Kroeker 2018-07-15 14:40:16 +0000
  • a0bd542648 Map c/zAXPY, c/zSWAP and xROT to the mips C kernels Martin Kroeker 2018-07-15 13:06:46 +0200
  • 35902bfe1f Fix lack of precision in DSDOT by promoting arguments Martin Kroeker 2018-07-15 13:02:26 +0200
  • 8d5b33b6be Add cpu identification via mfpvr call for the BSDs Martin Kroeker 2018-07-12 23:39:00 +0200
  • 36aea5ce2d Merge pull request #1680 from martin-frbg/snprint Martin Kroeker 2018-07-12 14:05:13 +0200
  • 1309711e24 Fix declaration of snprintf for older MSVC Martin Kroeker 2018-07-12 11:47:52 +0200
  • 571e9de2ac Fix definition of snprintf for MSVC Martin Kroeker 2018-07-12 11:42:25 +0200
  • 448ed15115 Merge pull request #1678 from martin-frbg/issue1677 Martin Kroeker 2018-07-12 09:21:34 +0200
  • 045fb5ea2c Define snprintf for older versions of MSVC Martin Kroeker 2018-07-12 07:30:58 +0200
  • bdb29242a3 Merge ba586c3d16 into 4dd70d98d7 oon3m0oo 2018-07-04 07:02:39 +0000
  • 4dd70d98d7 Merge pull request #1667 from xianyi/revert-1642-develop Martin Kroeker 2018-07-04 08:27:21 +0200
  • 504310eeb9 Merge pull request #1665 from martin-frbg/cpuid-ryzen2 Martin Kroeker 2018-07-04 08:19:40 +0200
  • ea1f39518f Merge pull request #1663 from martin-frbg/issue1641 Martin Kroeker 2018-07-04 08:19:11 +0200
  • 5f2a3c05cd Revert "Rewrite &= -> = and simplify the initial blocking phase." revert-1642-develop Martin Kroeker 2018-07-03 21:42:28 +0200
  • d0ec4325cf Add cpuid for AMD Ryzen 2 Martin Kroeker 2018-07-03 21:03:24 +0200
  • 3f73e8b8cf Add cpuid for AMD Ryzen 2 Martin Kroeker 2018-07-03 21:01:35 +0200
  • a83f01e0ee Merge pull request #1662 from martin-frbg/cmake-avx512 Martin Kroeker 2018-07-03 17:40:09 +0200
  • a49203b48c Double MAX_ALLOCATING_THREADS to fix segfaults with Go and Octave Martin Kroeker 2018-07-03 17:35:54 +0200
  • ba586c3d16 Ensure that the gotoblas lookup table is always initialized. Craig Donner 2018-07-03 12:06:54 +0100
  • b74aef2816 Add -march=skylake-avx512 to AVX512 compile check and suppress its output Martin Kroeker 2018-07-03 14:41:44 +0200
  • a9fa805007 Merge pull request #1660 from martin-frbg/issue1659 Martin Kroeker 2018-07-02 17:48:19 +0200
  • 9d15a3bd16 Fix typo that broke compilation with DYNAMIC_ARCH and NO_AVX2 Martin Kroeker 2018-07-02 14:40:41 +0200
  • c6aec89d10 Merge pull request #1657 from martin-frbg/release-0.3.0 v0.3.1 Martin Kroeker 2018-07-01 12:03:07 +0200
  • bbf2124970 set version number to 0.3.2.dev Martin Kroeker 2018-07-01 12:01:51 +0200
  • 1392eba488 set version number to 0.3.2.dev Martin Kroeker 2018-07-01 12:01:16 +0200
  • e6d7711199 remove dev suffix from version number Martin Kroeker 2018-07-01 11:59:47 +0200
  • 7a914347c5 remove dev suffix from version number Martin Kroeker 2018-07-01 11:58:57 +0200
  • 61659f8765 Merge pull request #1648 from martin-frbg/nofort Martin Kroeker 2018-07-01 11:56:40 +0200
  • 3a8f0a6a1f Merge pull request #1656 from xianyi/develop Martin Kroeker 2018-07-01 11:55:21 +0200
  • 3d3c19717c Merge pull request #1655 from martin-frbg/issue1641 Martin Kroeker 2018-07-01 08:41:22 +0200
  • 24e344038d Merge pull request #1654 from martin-frbg/avx512check Martin Kroeker 2018-07-01 01:17:03 +0200
  • 4e9c34018e Fix apparent off-by-one error in calculation of MAX_ALLOCATING_THREADS Martin Kroeker 2018-06-30 23:57:50 +0200