Commit Graph

  • a49203b48c Double MAX_ALLOCATING_THREADS to fix segfaults with Go and Octave Martin Kroeker 2018-07-03 17:35:54 +02:00
  • b74aef2816 Add -march=skylake-avx512 to AVX512 compile check and suppress its output Martin Kroeker 2018-07-03 14:41:44 +02:00
  • a9fa805007 Merge pull request #1660 from martin-frbg/issue1659 Martin Kroeker 2018-07-02 17:48:19 +02:00
  • 9d15a3bd16 Fix typo that broke compilation with DYNAMIC_ARCH and NO_AVX2 Martin Kroeker 2018-07-02 14:40:41 +02:00
  • c6aec89d10 Merge pull request #1657 from martin-frbg/release-0.3.0 (tag: v0.3.1) Martin Kroeker 2018-07-01 12:03:07 +02:00
  • bbf2124970 set version number to 0.3.2.dev Martin Kroeker 2018-07-01 12:01:51 +02:00
  • 1392eba488 set version number to 0.3.2.dev Martin Kroeker 2018-07-01 12:01:16 +02:00
  • e6d7711199 remove dev suffix from version number Martin Kroeker 2018-07-01 11:59:47 +02:00
  • 7a914347c5 remove dev suffix from version number Martin Kroeker 2018-07-01 11:58:57 +02:00
  • 61659f8765 Merge pull request #1648 from martin-frbg/nofort Martin Kroeker 2018-07-01 11:56:40 +02:00
  • 3a8f0a6a1f Merge pull request #1656 from xianyi/develop Martin Kroeker 2018-07-01 11:55:21 +02:00
  • 3d3c19717c Merge pull request #1655 from martin-frbg/issue1641 Martin Kroeker 2018-07-01 08:41:22 +02:00
  • 24e344038d Merge pull request #1654 from martin-frbg/avx512check Martin Kroeker 2018-07-01 01:17:03 +02:00
  • 4e9c34018e Fix apparent off-by-one error in calculation of MAX_ALLOCATING_THREADS Martin Kroeker 2018-06-30 23:57:50 +02:00
  • f5243e8e1f Add compiler option to avx512 test and hide test output Martin Kroeker 2018-06-30 23:47:44 +02:00
  • ba8388cee0 Merge pull request #1651 from martin-frbg/avx512-nodgemm Martin Kroeker 2018-06-30 17:48:03 +02:00
  • 6e54b0a027 Disable the 16x2 DTRMM kernel on SkylakeX as well Martin Kroeker 2018-06-30 17:31:06 +02:00
  • 40c8cbc3bf Merge pull request #1650 from martin-frbg/avx512-nodgemm Martin Kroeker 2018-06-30 13:05:46 +02:00
  • d3c9eb4c7d Merge pull request #1639 from martin-frbg/dyn_list Martin Kroeker 2018-06-30 13:05:30 +02:00
  • f0a8dc2eec Disable the AVX512 DGEMM kernel for now Martin Kroeker 2018-06-30 11:34:48 +02:00
  • cc92257ea6 Update Makefile Martin Kroeker 2018-06-27 00:09:21 +02:00
  • 2aba1b1658 Merge branch 'develop' into nofort Martin Kroeker 2018-06-27 00:07:32 +02:00
  • 8396e9e777 Handle NOFORTRAN=0 Martin Kroeker 2018-06-27 00:00:27 +02:00
  • bfad307ed7 Merge pull request #1647 from martin-frbg/armv7-dot Martin Kroeker 2018-06-26 22:27:30 +02:00
  • b83e4c60c7 Remove premature exit for INC_X or INC_Y zero Martin Kroeker 2018-06-26 20:46:42 +02:00
  • e344db269b Remove premature exit for INC_X or INC_Y zero Martin Kroeker 2018-06-26 20:45:57 +02:00
  • 545b82efd3 Remove premature exit for INC_X or INC_Y zero Martin Kroeker 2018-06-26 20:45:00 +02:00
  • e322a951fe Remove premature exit for INC_X or INC_Y zero Martin Kroeker 2018-06-26 20:44:13 +02:00
  • ff2f171036 Merge pull request #1644 from martin-frbg/revert-filterout Martin Kroeker 2018-06-26 10:15:15 +02:00
  • 092175cfec Revert changes to NOFORTRAN handling from 952541e Martin Kroeker 2018-06-26 08:09:52 +02:00
  • 750162a05f Try gradual fallback for cores not in the dynamic core list Martin Kroeker 2018-06-25 21:02:31 +02:00
  • e6d93f20f1 Merge pull request #2 from martin-frbg/develop Martin Kroeker 2018-06-25 20:48:10 +02:00
  • c38c65eb65 Merge pull request #1 from xianyi/develop Martin Kroeker 2018-06-25 20:45:56 +02:00
  • ce3651516f Merge pull request #1642 from oon3m0oo/develop Martin Kroeker 2018-06-25 19:23:40 +02:00
  • 0144068537 Rewrite &= -> = and simplify the initial blocking phase. Craig Donner 2018-06-25 13:53:11 +01:00
  • 1833a67071 Add support for a user-defined list of dynamic targets Martin Kroeker 2018-06-23 19:42:15 +02:00
  • 0b2b83d9ed Add support for a user-defined list of dynamic targets Martin Kroeker 2018-06-23 19:41:32 +02:00
  • 62cf769aa6 Merge pull request #1638 from martin-frbg/issue1637 Martin Kroeker 2018-06-23 15:01:02 +02:00
  • eb71d61c7c Expose CBLAS interface to BLAS extensions iXamin Martin Kroeker 2018-06-23 13:31:09 +02:00
  • 9cf22b7d91 Build cblas_iXamin interfaces Martin Kroeker 2018-06-23 13:27:30 +02:00
  • cc66743b66 Merge pull request #1634 from oon3m0oo/develop Martin Kroeker 2018-06-21 21:01:03 +02:00
  • 2aa0a5804e Use BLAS rather than CBLAS in test_fork.c (#1626) oon3m0oo 2018-06-21 17:47:45 +01:00
  • 28c28ed275 Fix data races reported by TSAN. Craig Donner 2018-06-21 11:13:57 +01:00
  • a399d00425 Further improvements to memory.c. (#1625) oon3m0oo 2018-06-20 21:04:03 +01:00
  • f66b9c8826 Merge pull request #1630 from martin-frbg/x86-march Martin Kroeker 2018-06-20 21:51:57 +02:00
  • 2946c46024 Merge pull request #1631 from oon3m0oo/stack Martin Kroeker 2018-06-20 21:51:38 +02:00
  • 05978528c3 Avoid declaring arrays of size 0 when making large stack allocations. Craig Donner 2018-06-20 17:03:18 +01:00
  • ef6f0b645e Merge pull request #1629 from martin-frbg/issue1628 Martin Kroeker 2018-06-20 16:41:13 +02:00
  • 0c5b7b400b Add -march=skylake-avx512 to flags if target is skylake x Martin Kroeker 2018-06-20 15:16:19 +02:00
  • 952541e840 Need to use filter-out to handle NOFORTRAN not set Martin Kroeker 2018-06-20 13:20:30 +02:00
  • 9369d3e6e5 Modify NOFORTRAN tests to always check the value; fix rewriting of NO_FORTRAN Martin Kroeker 2018-06-19 23:28:06 +02:00
  • 10b70c904d Handle erroneous user settings NOFORTRAN=0 and NO_FORTRAN Martin Kroeker 2018-06-19 20:53:19 +02:00
  • 6a5ab083b7 Handle special case of gfortran+clang+OpenMP Martin Kroeker 2018-06-19 20:47:33 +02:00
  • 1f9e4f3193 Handle special case of gfortran+clang+OpenMP Martin Kroeker 2018-06-19 20:46:36 +02:00
  • 5a6a2bed9a Merge pull request #1623 from fenrus75/fast-thread Martin Kroeker 2018-06-18 09:02:40 +02:00
  • 2d8cc7193a Support upcoming Intel Cannon Lake CPUs as Skylake X (#1621) Martin Kroeker 2018-06-17 23:38:14 +02:00
  • 2ddc96c9e5 make WMB / MB safer on x86-64 Arjan van de Ven 2018-06-17 18:06:24 +00:00
  • 7e39ffe113 On x86-64, make MB/WMB compiler barriers Arjan van de Ven 2018-06-17 17:53:15 +00:00
  • 73de17664d Add missing barriers in gemm scheduler Arjan van de Ven 2018-06-17 17:50:43 +00:00
  • 6eb4b9ae7c Tune HASWELL SWITCH_RATIO as well Arjan van de Ven 2018-06-17 17:05:04 +00:00
  • 5c6f008365 Tune param.h for SkylakeX Arjan van de Ven 2018-06-17 15:47:50 +00:00
  • d148ec4ea1 Don't use _Atomic for jobs sometimes... Arjan van de Ven 2018-06-17 15:39:15 +00:00
  • 9e162146a9 Only initialize the part of the jobs array that will get used Arjan van de Ven 2018-06-17 15:32:03 +00:00
  • 47bf0dba8f Add build-time option for OMP scheduler; document MULTITHREAD_THRESHOLD range (#1620) Martin Kroeker 2018-06-15 11:25:05 +02:00
  • 12603b7dbb Merge pull request #1618 from oon3m0oo/less_locking Martin Kroeker 2018-06-15 00:10:29 +02:00
  • bf40f806ef Remove the need for most locking in memory.c. Craig Donner 2018-06-14 12:18:04 +01:00
  • ed682a4a0c Merge pull request #1619 from martin-frbg/issue1580 Martin Kroeker 2018-06-14 17:48:51 +02:00
  • fcb77ab129 Update OSX deployment target to 10.8 Martin Kroeker 2018-06-14 16:57:58 +02:00
  • 26e1cfb653 Merge pull request #1607 from martin-frbg/dynarch Martin Kroeker 2018-06-14 16:52:55 +02:00
  • c628c6fa59 Merge pull request #1612 from oon3m0oo/cpus Martin Kroeker 2018-06-14 16:51:31 +02:00
  • 67d81ab49d Merge pull request #1609 from martin-frbg/issue1529 Martin Kroeker 2018-06-12 23:00:24 +02:00
  • 2f957947a6 Merge pull request #1613 from xianyi/revert-1600-noyield Martin Kroeker 2018-06-11 17:14:49 +02:00
  • de8fff671d Revert "Use usleep instead of sched_yield by default" (branch: revert-1600-noyield) Martin Kroeker 2018-06-11 17:05:27 +02:00
  • 6f71c0fce4 Return a somewhat sane default value for L2 cache size if cpuid retur… (#1611) Martin Kroeker 2018-06-11 13:26:19 +02:00
  • c2545b0fd6 Fixed a few more unnecessary calls to num_cpu_avail. Craig Donner 2018-06-11 10:13:09 +01:00
  • e65f451409 include CMakePackageConfigHelpers Martin Kroeker 2018-06-10 15:09:43 +02:00
  • 02634b549b Add template for OpenBLASConfig.cmake Martin Kroeker 2018-06-10 09:25:46 +02:00
  • 0bea6bb9e7 Create OpenBLASConfig.cmake from cmake as well Martin Kroeker 2018-06-10 09:24:37 +02:00
  • 3313e4b946 Merge pull request #1608 from martin-frbg/issue874 Martin Kroeker 2018-06-09 19:57:33 +02:00
  • e9cd11768c Enable parallel make on MS Windows by default Martin Kroeker 2018-06-09 17:54:36 +02:00
  • 63f7395fb4 Move some DYNAMIC_ARCH targets to new DYNAMIC_OLDER option Martin Kroeker 2018-06-09 16:31:38 +02:00
  • 1cbd8f3ae4 Move some DYNAMIC_ARCH targets to new DYNAMIC_OLDER option Martin Kroeker 2018-06-09 16:30:46 +02:00
  • 6c2d90ba77 Move some DYNAMIC_ARCH targets to new DYNAMIC_OLDER option Martin Kroeker 2018-06-09 16:29:17 +02:00
  • 0297b3211a Merge pull request #1605 from oon3m0oo/develop Martin Kroeker 2018-06-09 12:42:34 +02:00
  • 66316b9f4c Improve performance of GEMM for small matrices when SMP is defined. Craig Donner 2018-06-07 14:54:42 +01:00
  • 6adc4b7b36 Merge pull request #1601 from martin-frbg/zaxpy Martin Kroeker 2018-06-07 14:09:58 +02:00
  • 2ade0ef085 Merge pull request #1600 from martin-frbg/noyield Martin Kroeker 2018-06-07 12:42:00 +02:00
  • e8880c1699 Use a single thread for small input size Martin Kroeker 2018-06-07 10:26:55 +02:00
  • ed7c4a043b Use usleep instead of sched_yield by default Martin Kroeker 2018-06-07 10:18:26 +02:00
  • cf234a0561 Merge pull request #1589 from fenrus75/skylakex Martin Kroeker 2018-06-06 22:07:09 +02:00
  • ae2a33128b Merge pull request #1599 from martin-frbg/c_check_avx512 Martin Kroeker 2018-06-06 18:42:42 +02:00
  • e4718b1fee Better AVX512 test case Martin Kroeker 2018-06-06 16:51:30 +02:00
  • 9b87b64262 Improve AVX512 testcase Martin Kroeker 2018-06-06 16:49:00 +02:00
  • 0218b884c1 Merge pull request #1598 from martin-frbg/issue1593-2 Martin Kroeker 2018-06-06 12:48:26 +02:00
  • 83da278093 Update common.h Martin Kroeker 2018-06-06 09:27:49 +02:00
  • 358d4df2bd Merge branch 'develop' into issue1593-2 Martin Kroeker 2018-06-06 09:21:41 +02:00
  • 06d43760e4 Restore _Atomic define before stdatomic.h for old gcc Martin Kroeker 2018-06-06 09:18:10 +02:00
  • a4af8861ff Merge pull request #1597 from martin-frbg/cmake-avx512 Martin Kroeker 2018-06-06 07:22:20 +02:00
  • 7fb62aed7e Check build system support for AVX512 instructions Martin Kroeker 2018-06-05 23:29:33 +02:00
  • f6021c798d Re-enable QUIET_MAKE Martin Kroeker 2018-06-05 19:09:38 +02:00