Commit Graph

3048 Commits

Author SHA1 Message Date
Martin Kroeker
5f2a3c05cd Revert "Rewrite &= -> = and simplify the initial blocking phase." 2018-07-03 21:42:28 +02:00
Martin Kroeker
a83f01e0ee Merge pull request #1662 from martin-frbg/cmake-avx512
Add -march=skylake-avx512 to AVX512 compile check and suppress its ou…
2018-07-03 17:40:09 +02:00
Martin Kroeker
b74aef2816 Add -march=skylake-avx512 to AVX512 compile check and suppress its output 2018-07-03 14:41:44 +02:00
Martin Kroeker
a9fa805007 Merge pull request #1660 from martin-frbg/issue1659
Fix typo that broke compilation with DYNAMIC_ARCH and NO_AVX2
2018-07-02 17:48:19 +02:00
Martin Kroeker
9d15a3bd16 Fix typo that broke compilation with DYNAMIC_ARCH and NO_AVX2
fixes 1659
2018-07-02 14:40:41 +02:00
Martin Kroeker
bbf2124970 set version number to 0.3.2.dev 2018-07-01 12:01:51 +02:00
Martin Kroeker
1392eba488 set version number to 0.3.2.dev 2018-07-01 12:01:16 +02:00
Martin Kroeker
61659f8765 Merge pull request #1648 from martin-frbg/nofort
Handle NOFORTRAN=0
2018-07-01 11:56:40 +02:00
Martin Kroeker
3d3c19717c Merge pull request #1655 from martin-frbg/issue1641
Fix apparent off-by-one error in calculation of MAX_ALLOCATING_THREADS
2018-07-01 08:41:22 +02:00
Martin Kroeker
24e344038d Merge pull request #1654 from martin-frbg/avx512check
Add compiler option to avx512 test and hide test output
2018-07-01 01:17:03 +02:00
Martin Kroeker
4e9c34018e Fix apparent off-by-one error in calculation of MAX_ALLOCATING_THREADS
fixes #1641
2018-06-30 23:57:50 +02:00
Martin Kroeker
f5243e8e1f Add compiler option to avx512 test and hide test output 2018-06-30 23:47:44 +02:00
Martin Kroeker
ba8388cee0 Merge pull request #1651 from martin-frbg/avx512-nodgemm
Disable the 16x2 DTRMM kernel on SkylakeX as well
2018-06-30 17:48:03 +02:00
Martin Kroeker
6e54b0a027 Disable the 16x2 DTRMM kernel on SkylakeX as well 2018-06-30 17:31:06 +02:00
Martin Kroeker
40c8cbc3bf Merge pull request #1650 from martin-frbg/avx512-nodgemm
Disable the AVX512 DGEMM kernel for now
2018-06-30 13:05:46 +02:00
Martin Kroeker
d3c9eb4c7d Merge pull request #1639 from martin-frbg/dyn_list
Add DYNAMIC_LIST option for user-defined list of dynamic targets
2018-06-30 13:05:30 +02:00
Martin Kroeker
f0a8dc2eec Disable the AVX512 DGEMM kernel for now
due to #1643
2018-06-30 11:34:48 +02:00
Martin Kroeker
cc92257ea6 Update Makefile 2018-06-27 00:09:21 +02:00
Martin Kroeker
2aba1b1658 Merge branch 'develop' into nofort 2018-06-27 00:07:32 +02:00
Martin Kroeker
8396e9e777 Handle NOFORTRAN=0 2018-06-27 00:00:27 +02:00
Martin Kroeker
bfad307ed7 Merge pull request #1647 from martin-frbg/armv7-dot
Remove premature exits from ARMV7 xdot codes
2018-06-26 22:27:30 +02:00
Martin Kroeker
b83e4c60c7 Remove premature exit for INC_X or INC_Y zero 2018-06-26 20:46:42 +02:00
Martin Kroeker
e344db269b Remove premature exit for INC_X or INC_Y zero 2018-06-26 20:45:57 +02:00
Martin Kroeker
545b82efd3 Remove premature exit for INC_X or INC_Y zero 2018-06-26 20:45:00 +02:00
Martin Kroeker
e322a951fe Remove premature exit for INC_X or INC_Y zero 2018-06-26 20:44:13 +02:00
Martin Kroeker
ff2f171036 Merge pull request #1644 from martin-frbg/revert-filterout
Revert changes to NOFORTRAN handling in Makefile
2018-06-26 10:15:15 +02:00
Martin Kroeker
092175cfec Revert changes to NOFORTRAN handling from 952541e 2018-06-26 08:09:52 +02:00
Martin Kroeker
750162a05f Try gradual fallback for cores not in the dynamic core list 2018-06-25 21:02:31 +02:00
Martin Kroeker
e6d93f20f1 Merge pull request #2 from martin-frbg/develop
merge develop
2018-06-25 20:48:10 +02:00
Martin Kroeker
c38c65eb65 Merge pull request #1 from xianyi/develop
Merge xianyi:develop into develop
2018-06-25 20:45:56 +02:00
Martin Kroeker
ce3651516f Merge pull request #1642 from oon3m0oo/develop
Rewrite &= -> = and simplify the initial blocking phase.
2018-06-25 19:23:40 +02:00
Craig Donner
0144068537 Rewrite &= -> = and simplify the initial blocking phase. 2018-06-25 15:08:55 +01:00
Martin Kroeker
1833a67071 Add support for a user-defined list of dynamic targets 2018-06-23 19:42:15 +02:00
Martin Kroeker
0b2b83d9ed Add support for a user-defined list of dynamic targets 2018-06-23 19:41:32 +02:00
Martin Kroeker
62cf769aa6 Merge pull request #1638 from martin-frbg/issue1637
Expose the CBLAS interface to the IxAMIN functions and have make build it
2018-06-23 15:01:02 +02:00
Martin Kroeker
eb71d61c7c Expose CBLAS interface to BLAS extensions iXamin 2018-06-23 13:31:09 +02:00
Martin Kroeker
9cf22b7d91 Build cblas_iXamin interfaces 2018-06-23 13:27:30 +02:00
Martin Kroeker
cc66743b66 Merge pull request #1634 from oon3m0oo/develop
Fix data races reported by TSAN.
2018-06-21 21:01:03 +02:00
oon3m0oo
2aa0a5804e Use BLAS rather than CBLAS in test_fork.c (#1626)
This is handy for people not using lapack.
2018-06-21 18:47:45 +02:00
Craig Donner
28c28ed275 Fix data races reported by TSAN. 2018-06-21 16:41:02 +01:00
oon3m0oo
a399d00425 Further improvements to memory.c. (#1625)
- Compiler TLS is now used only used when the compiler supports it
- If compiler TLS is unsupported, we use platform-specific TLS
- Only one variable (an index) is now in TLS
- We only access TLS once per alloc, and never when freeing
- Allocation / release info is now stored within the allocation itself, by
  over-allocating; this saves having external structures do the bookkeeping, and
  reduces some of the redundant data that was being stored (such as addresses)
- We never hit the alloc lock when not using SMP or when using OpenMP (that was
  my fault)
- Now that there are fewer tracking structures I think this is a bit easier to
  read than before
2018-06-20 22:04:03 +02:00
Martin Kroeker
f66b9c8826 Merge pull request #1630 from martin-frbg/x86-march
Add -march=skylake-avx512 to flags if target is skylake x
2018-06-20 21:51:57 +02:00
Martin Kroeker
2946c46024 Merge pull request #1631 from oon3m0oo/stack
Avoid declaring arrays of size 0 when making large stack allocations.
2018-06-20 21:51:38 +02:00
Craig Donner
05978528c3 Avoid declaring arrays of size 0 when making large stack allocations. 2018-06-20 17:03:18 +01:00
Martin Kroeker
ef6f0b645e Merge pull request #1629 from martin-frbg/issue1628
Make gfortran link libomp for clang in the tests; avoid two typical gotchas with NOFORTRAN
2018-06-20 16:41:13 +02:00
Martin Kroeker
0c5b7b400b Add -march=skylake-avx512 to flags if target is skylake x 2018-06-20 15:16:19 +02:00
Martin Kroeker
952541e840 Need to use filter-out to handle NOFORTRAN not set 2018-06-20 13:20:30 +02:00
Martin Kroeker
9369d3e6e5 Modify NOFORTRAN tests to always check the value; fix rewriting of NO_FORTRAN 2018-06-19 23:28:06 +02:00
Martin Kroeker
10b70c904d Handle erroneous user settings NOFORTRAN=0 and NO_FORTRAN 2018-06-19 20:53:19 +02:00
Martin Kroeker
6a5ab083b7 Handle special case of gfortran+clang+OpenMP 2018-06-19 20:47:33 +02:00