Martin Kroeker
9d15a3bd16
Fix typo that broke compilation with DYNAMIC_ARCH and NO_AVX2
...
fixes #1659
2018-07-02 14:40:41 +02:00
Martin Kroeker
c6aec89d10
Merge pull request #1657 from martin-frbg/release-0.3.0
...
Release 0.3.1
2018-07-01 12:03:07 +02:00
Martin Kroeker
bbf2124970
set version number to 0.3.2.dev
2018-07-01 12:01:51 +02:00
Martin Kroeker
1392eba488
set version number to 0.3.2.dev
2018-07-01 12:01:16 +02:00
Martin Kroeker
e6d7711199
remove dev suffix from version number
2018-07-01 11:59:47 +02:00
Martin Kroeker
7a914347c5
remove dev suffix from version number
2018-07-01 11:58:57 +02:00
Martin Kroeker
61659f8765
Merge pull request #1648 from martin-frbg/nofort
...
Handle NOFORTRAN=0
2018-07-01 11:56:40 +02:00
Martin Kroeker
3a8f0a6a1f
Merge pull request #1656 from xianyi/develop
...
Update the 0.3 branch from develop
2018-07-01 11:55:21 +02:00
Martin Kroeker
3d3c19717c
Merge pull request #1655 from martin-frbg/issue1641
...
Fix apparent off-by-one error in calculation of MAX_ALLOCATING_THREADS
2018-07-01 08:41:22 +02:00
Martin Kroeker
24e344038d
Merge pull request #1654 from martin-frbg/avx512check
...
Add compiler option to avx512 test and hide test output
2018-07-01 01:17:03 +02:00
Martin Kroeker
4e9c34018e
Fix apparent off-by-one error in calculation of MAX_ALLOCATING_THREADS
...
fixes #1641
2018-06-30 23:57:50 +02:00
Martin Kroeker
f5243e8e1f
Add compiler option to avx512 test and hide test output
2018-06-30 23:47:44 +02:00
Martin Kroeker
ba8388cee0
Merge pull request #1651 from martin-frbg/avx512-nodgemm
...
Disable the 16x2 DTRMM kernel on SkylakeX as well
2018-06-30 17:48:03 +02:00
Martin Kroeker
6e54b0a027
Disable the 16x2 DTRMM kernel on SkylakeX as well
2018-06-30 17:31:06 +02:00
Martin Kroeker
40c8cbc3bf
Merge pull request #1650 from martin-frbg/avx512-nodgemm
...
Disable the AVX512 DGEMM kernel for now
2018-06-30 13:05:46 +02:00
Martin Kroeker
d3c9eb4c7d
Merge pull request #1639 from martin-frbg/dyn_list
...
Add DYNAMIC_LIST option for user-defined list of dynamic targets
2018-06-30 13:05:30 +02:00
Martin Kroeker
f0a8dc2eec
Disable the AVX512 DGEMM kernel for now
...
due to #1643
2018-06-30 11:34:48 +02:00
Martin Kroeker
cc92257ea6
Update Makefile
2018-06-27 00:09:21 +02:00
Martin Kroeker
2aba1b1658
Merge branch 'develop' into nofort
2018-06-27 00:07:32 +02:00
Martin Kroeker
8396e9e777
Handle NOFORTRAN=0
2018-06-27 00:00:27 +02:00
Martin Kroeker
bfad307ed7
Merge pull request #1647 from martin-frbg/armv7-dot
...
Remove premature exits from ARMV7 xdot codes
2018-06-26 22:27:30 +02:00
Martin Kroeker
b83e4c60c7
Remove premature exit for INC_X or INC_Y zero
2018-06-26 20:46:42 +02:00
Martin Kroeker
e344db269b
Remove premature exit for INC_X or INC_Y zero
2018-06-26 20:45:57 +02:00
Martin Kroeker
545b82efd3
Remove premature exit for INC_X or INC_Y zero
2018-06-26 20:45:00 +02:00
Martin Kroeker
e322a951fe
Remove premature exit for INC_X or INC_Y zero
2018-06-26 20:44:13 +02:00
Martin Kroeker
ff2f171036
Merge pull request #1644 from martin-frbg/revert-filterout
...
Revert changes to NOFORTRAN handling in Makefile
2018-06-26 10:15:15 +02:00
Martin Kroeker
092175cfec
Revert changes to NOFORTRAN handling from 952541e
2018-06-26 08:09:52 +02:00
Martin Kroeker
750162a05f
Try gradual fallback for cores not in the dynamic core list
2018-06-25 21:02:31 +02:00
Martin Kroeker
e6d93f20f1
Merge pull request #2 from martin-frbg/develop
...
merge develop
2018-06-25 20:48:10 +02:00
Martin Kroeker
c38c65eb65
Merge pull request #1 from xianyi/develop
...
Merge xianyi:develop into develop
2018-06-25 20:45:56 +02:00
Martin Kroeker
ce3651516f
Merge pull request #1642 from oon3m0oo/develop
...
Rewrite &= -> = and simplify the initial blocking phase.
2018-06-25 19:23:40 +02:00
Craig Donner
0144068537
Rewrite &= -> = and simplify the initial blocking phase.
2018-06-25 15:08:55 +01:00
Martin Kroeker
1833a67071
Add support for a user-defined list of dynamic targets
2018-06-23 19:42:15 +02:00
Martin Kroeker
0b2b83d9ed
Add support for a user-defined list of dynamic targets
2018-06-23 19:41:32 +02:00
Martin Kroeker
62cf769aa6
Merge pull request #1638 from martin-frbg/issue1637
...
Expose the CBLAS interface to the IxAMIN functions and have make build it
2018-06-23 15:01:02 +02:00
Martin Kroeker
eb71d61c7c
Expose CBLAS interface to BLAS extensions iXamin
2018-06-23 13:31:09 +02:00
Martin Kroeker
9cf22b7d91
Build cblas_iXamin interfaces
2018-06-23 13:27:30 +02:00
Martin Kroeker
cc66743b66
Merge pull request #1634 from oon3m0oo/develop
...
Fix data races reported by TSAN.
2018-06-21 21:01:03 +02:00
oon3m0oo
2aa0a5804e
Use BLAS rather than CBLAS in test_fork.c (#1626)
...
This is handy for people not using lapack.
2018-06-21 18:47:45 +02:00
Craig Donner
28c28ed275
Fix data races reported by TSAN.
2018-06-21 16:41:02 +01:00
oon3m0oo
a399d00425
Further improvements to memory.c. (#1625)
...
- Compiler TLS is now used only when the compiler supports it
- If compiler TLS is unsupported, we use platform-specific TLS
- Only one variable (an index) is now in TLS
- We only access TLS once per alloc, and never when freeing
- Allocation / release info is now stored within the allocation itself, by
  over-allocating; this saves having external structures do the bookkeeping, and
  reduces some of the redundant data that was being stored (such as addresses)
- We never hit the alloc lock when not using SMP or when using OpenMP (that was
  my fault)
- Now that there are fewer tracking structures I think this is a bit easier to
  read than before
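
The over-allocation idea in the list above can be illustrated with a minimal sketch. This is not the code from memory.c: the names `alloc_header`, `tracked_alloc`, `header_of`, and `tracked_free`, and the fields stored in the header, are hypothetical and chosen only for illustration. The point is that the bookkeeping lives in a small header in front of the buffer handed to the caller, so releasing it needs no external lookup table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping record stored in front of every user buffer. */
typedef struct {
    size_t size; /* bytes requested by the caller */
    int    slot; /* illustrative field, e.g. an index into some table */
} alloc_header;

/* Header size rounded up so the user pointer keeps maximal alignment. */
#define HEADER_BYTES                                                   \
    (((sizeof(alloc_header) + _Alignof(max_align_t) - 1) /             \
      _Alignof(max_align_t)) * _Alignof(max_align_t))

/* Over-allocate by HEADER_BYTES and return the address just past the header. */
static void *tracked_alloc(size_t size, int slot) {
    unsigned char *raw = malloc(HEADER_BYTES + size);
    if (raw == NULL) return NULL;
    alloc_header *h = (alloc_header *)raw;
    h->size = size;
    h->slot = slot;
    return raw + HEADER_BYTES;
}

/* Recover the header from a user pointer by stepping back HEADER_BYTES. */
static alloc_header *header_of(void *user) {
    assert(user != NULL);
    return (alloc_header *)((unsigned char *)user - HEADER_BYTES);
}

/* Free the whole block; the header tells us where it really starts. */
static void tracked_free(void *user) {
    if (user != NULL) free(header_of(user));
}
```

In this sketch, freeing only needs the user pointer itself, which mirrors the stated goal of avoiding external tracking structures.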
2018-06-20 22:04:03 +02:00
Martin Kroeker
f66b9c8826
Merge pull request #1630 from martin-frbg/x86-march
...
Add -march=skylake-avx512 to flags if target is skylake x
2018-06-20 21:51:57 +02:00
Martin Kroeker
2946c46024
Merge pull request #1631 from oon3m0oo/stack
...
Avoid declaring arrays of size 0 when making large stack allocations.
2018-06-20 21:51:38 +02:00
Craig Donner
05978528c3
Avoid declaring arrays of size 0 when making large stack allocations.
2018-06-20 17:03:18 +01:00
Martin Kroeker
ef6f0b645e
Merge pull request #1629 from martin-frbg/issue1628
...
Make gfortran link libomp for clang in the tests; avoid two typical gotchas with NOFORTRAN
2018-06-20 16:41:13 +02:00
Martin Kroeker
0c5b7b400b
Add -march=skylake-avx512 to flags if target is skylake x
2018-06-20 15:16:19 +02:00
Martin Kroeker
952541e840
Need to use filter-out to handle NOFORTRAN not set
2018-06-20 13:20:30 +02:00
Martin Kroeker
9369d3e6e5
Modify NOFORTRAN tests to always check the value; fix rewriting of NO_FORTRAN
2018-06-19 23:28:06 +02:00
Martin Kroeker
10b70c904d
Handle erroneous user settings NOFORTRAN=0 and NO_FORTRAN
2018-06-19 20:53:19 +02:00
Martin Kroeker
6a5ab083b7
Handle special case of gfortran+clang+OpenMP
2018-06-19 20:47:33 +02:00