Dumi Loghin
db17ce896f
replace ARCH with AR in lapack-netlib
2018-09-05 12:49:37 +08:00
Martin Kroeker
cbc46163bd
Merge pull request #1526 from jerryz123/upstream_riscv
...
Add support for RISC-V
2018-04-28 11:55:45 +02:00
Jerry Zhao
0ee395db35
Fixed TRMM and SYMM for RISCV
2018-04-18 18:03:32 -07:00
Jerry Zhao
c167a3d6f4
Added RISCV build
2018-04-16 14:08:31 -07:00
Martin Kroeker
0c4718c57a
Merge pull request #1523 from martin-frbg/utest_waith
...
Include sys/types.h for proper typedefs related to wait()
2018-04-15 13:09:30 +02:00
Martin Kroeker
f29389c7ac
Merge pull request #1520 from martin-frbg/cpucounts
...
Catch invalid cpu count returned by CPU_COUNT_S
2018-04-14 22:24:34 +02:00
Martin Kroeker
734d7c6a93
Include sys/types.h for proper typedefs related to wait()
...
Should fix #1519
2018-04-14 18:59:46 +02:00
Martin Kroeker
7c861605b2
Catch invalid cpu count returned by CPU_COUNT_S
...
mips32 was seen to return zero here, driving nthreads to zero with a subsequent floating-point exception (FPE) in blas_quickdivide.
2018-04-14 18:29:10 +02:00
Martin Kroeker
2ca0faf495
Merge pull request #1515 from martin-frbg/mipsdot
...
Correct precision of mips dsdot
2018-04-11 08:21:25 +02:00
Martin Kroeker
0fe434598b
Fix precision of mips dsdot
2018-04-10 23:30:59 +02:00
Martin Kroeker
15c437e092
Merge pull request #1512 from ararslan/aa/travis-macos-2
...
Add macOS to the Travis testing matrix: Take 2!
2018-04-07 23:31:26 +02:00
Alex Arslan
b966bd79d5
Add a BINARY=32 build to macOS
2018-04-07 12:29:57 -07:00
Alex Arslan
2e988dbf35
Add macOS to the Travis testing matrix
2018-04-07 10:56:34 -07:00
Martin Kroeker
be6090d396
Merge pull request #1511 from xianyi/revert-1510-aa/travis-macos
...
Revert "Add macOS to the Travis testing matrix"
2018-04-07 13:29:31 +02:00
Martin Kroeker
daae8fd197
Revert "Add macOS to the Travis testing matrix"
2018-04-07 13:27:24 +02:00
Martin Kroeker
a1fb7670f7
Merge pull request #1510 from ararslan/aa/travis-macos
...
Add macOS to the Travis testing matrix
2018-04-07 12:07:12 +02:00
Martin Kroeker
6c99c97489
Merge pull request #1509 from ararslan/aa/dragonfly
...
Add DragonFly to exports/Makefile
2018-04-07 12:06:57 +02:00
Alex Arslan
6a0930560e
Add macOS to the Travis testing matrix
2018-04-06 17:53:58 -07:00
Alex Arslan
24f8d5b624
Add DragonFly to exports/Makefile
...
Its exclusion was an oversight on my part.
2018-04-06 17:30:10 -07:00
Martin Kroeker
77b4dbd53b
Merge pull request #1506 from martin-frbg/issue1497
...
Fix thread races and infinite looping on systems with many cpus
2018-04-05 23:46:36 +02:00
Martin Kroeker
bc4c3bca01
Merge pull request #1507 from martin-frbg/threads_usage
...
Underline importance of NUM_THREADS setting for BUFFER allocation
2018-04-05 08:54:07 +02:00
Martin Kroeker
6b0a9d135c
Merge pull request #1508 from ararslan/aa/wording
...
Minor changes to wording and formatting in the README
2018-04-05 08:53:38 +02:00
Alex Arslan
137ccd9dd9
Minor changes to wording and formatting in the README
...
The wording in some places is not grammatically correct. This change
also provides minor adjustments to the Markdown formatting which provide
modest improvements to readability.
2018-04-04 14:30:32 -07:00
Martin Kroeker
84923dedb7
Merge pull request #1505 from ararslan/aa/compiler
...
Compile with cc rather than gcc whenever possible
2018-04-04 22:45:33 +02:00
Martin Kroeker
8ec28ff461
Remove unguarded use of _Atomic and fix tabbing
2018-04-04 22:40:30 +02:00
Martin Kroeker
ca8ca796d3
Underline importance of NUM_THREADS setting for BUFFER allocation
...
Following augray's suggestion from #1451, and incorporating ashwinyes' comments from #1141 on the importance of NUM_THREADS even for single-threaded builds.
2018-04-04 22:26:51 +02:00
Alex Arslan
8f811a9312
Reinstate macOS logic
2018-04-04 11:41:45 -07:00
Alex Arslan
36a17536ca
Compile with cc rather than gcc whenever possible
2018-04-04 11:26:54 -07:00
Martin Kroeker
bb9876db33
Fix thread races and infinite looping on systems with many cpus
...
On systems with more than 64 cpus, blas_quickdivide will sometimes return zero, which creates bogus workloads when used for the stride calculation. This then leads to threads spinning incessantly, waiting for a status change that never happens, as seen in #1497.
This patch also fixes several data races that were found by helgrind and/or tsan while debugging the issue.
2018-04-04 18:16:52 +02:00
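
The failure mode described in the commit above can be sketched in a few lines. This is a minimal illustration in Python, not the OpenBLAS code: `split_work`, its `divide` parameter, and `broken_divide` are hypothetical names, and the zero-returning divider only stands in for a misbehaving blas_quickdivide. The point is that a zero stride has to be caught before it is used to advance through the workload, otherwise the loop never makes progress.

```python
def split_work(n, nthreads, divide=lambda x, y: x // y):
    """Split n items into contiguous (start, end) ranges for at most nthreads workers.

    `divide` stands in for a fast integer-division helper (hypothetical here).
    """
    if nthreads < 1:
        nthreads = 1  # an invalid thread count falls back to a single worker
    ranges = []
    start = 0
    remaining_threads = nthreads
    while start < n:
        # Ceiling division of the remaining work among the remaining workers.
        width = divide(n - start + remaining_threads - 1, remaining_threads)
        if width < 1:
            # A zero stride would never advance `start`; give the rest to this worker.
            width = n - start
        width = min(width, n - start)
        ranges.append((start, start + width))
        start += width
        remaining_threads = max(1, remaining_threads - 1)
    return ranges


def broken_divide(x, y):
    # Assumption for illustration only: a divider that wrongly yields zero for large y.
    return 0 if y > 64 else x // y
```

With the default divider, `split_work(10, 3)` yields three ranges covering all ten items. With `broken_divide` and 128 workers, the guard collapses the work into a single range instead of looping forever.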
Martin Kroeker
d636b418af
Merge pull request #1504 from ararslan/aa/openbsd
...
Allow building on OpenBSD
2018-04-04 15:26:46 +02:00
Martin Kroeker
a460c92577
Merge pull request #1501 from martin-frbg/issue875
...
Add workaround for old gcc and clang versions
2018-04-04 15:26:21 +02:00
Alex Arslan
33f838393c
Add OpenBSD and DragonFly to community supported platforms
2018-04-03 16:42:01 -07:00
Alex Arslan
a41d241a0e
Add support for DragonFly BSD
2018-04-03 16:39:29 -07:00
Alex Arslan
8da6b6ae52
Allow building on OpenBSD
...
With this change, OpenBLAS builds and all tests pass on OpenBSD 6.2
using Clang. Tested on x86-64 only, with and without DYNAMIC_ARCH=1.
2018-04-02 10:48:22 -07:00
Martin Kroeker
01c4b82f04
Update memory.c
2018-03-31 22:32:06 +02:00
Martin Kroeker
93db123f7e
Update memory.c
2018-03-29 13:13:49 +02:00
Martin Kroeker
752fdb5dd8
Add workaround for old gcc and clang versions
...
Old gcc and clang versions do not handle constructor arguments. This finally fixes #875 as discussed there, using the Fedora patch.
2018-03-29 11:56:56 +02:00
Martin Kroeker
07ed01e97f
Merge pull request #1500 from martin-frbg/issue1474
...
Correct index variables used in MFlops calculation
2018-03-28 09:15:34 +02:00
Martin Kroeker
35c5a32309
Correct index variables used in MFlops calculation
...
Fixes #1474
2018-03-27 21:52:29 +02:00
Martin Kroeker
c7b55b6082
Merge pull request #1499 from quickwritereader/develop
...
Implemented missing vsx simd kernels for power8 blas1/2 double. z13 modifications
2018-03-27 21:43:23 +02:00
Martin Kroeker
840e01061f
Merge pull request #1491 from martin-frbg/ddot_mt
...
Add multithreading support for Haswell DDOT
2018-03-27 21:43:05 +02:00
QWR QWR
28ca97015d
power8: Added initial zgemv_(t|n), i(d|z)amax, i(d|z)amin, dgemv_t (transposed), zrot
...
z13: Improved zgemv_(t|n)_4, zscal, zaxpy
2018-03-27 14:54:41 +00:00
Martin Kroeker
73c5ca74fa
Merge pull request #1495 from martin-frbg/aff
...
Disable CPU affinity by default again
2018-03-19 18:03:25 +01:00
Martin Kroeker
e453555d97
Disable CPU affinity by default again
...
This setting must have been changed unintentionally by my PR #1214 (probably leftover from unrelated tests)
2018-03-19 18:02:23 +01:00
Martin Kroeker
6a6ffaff1e
Merge pull request #1494 from martin-frbg/x86_dsdot
...
Use generic/dot.c instead of the inferior arm/dot.c for x86 DSDOT
2018-03-17 15:26:47 +01:00
Martin Kroeker
28ac9ea5a6
Use generic/dot.c instead of the inferior arm/dot.c for x86 DSDOT
...
to resolve dsdot utest failure seen in #1492
2018-03-17 13:49:15 +01:00
Martin Kroeker
a55694dd5b
Declare dot_compute static to avoid conflicts in multiarch builds
2018-03-16 22:23:36 +01:00
Martin Kroeker
85a41e9cdb
Add multithreading support for Haswell DDOT
...
copied from ashwinyes' implementation in dot_thunderx2t99.c
2018-03-16 16:58:47 +01:00
Martin Kroeker
2c7392f07b
Merge pull request #1482 from martin-frbg/haswell_axpy
...
Re-enable DAXPY AVX microkernels for x86_64
2018-03-04 22:21:18 +01:00
Martin Kroeker
81215711a2
Re-enable DAXPY microkernels for x86_64
...
The inaccuracies seen in the original testcase for #1332 appear to be due to an artefact that amplifies the very small rounding differences between FMA and discrete multiply+add.
2018-03-04 19:37:03 +01:00
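
The rounding difference between a fused multiply-add and a separate multiply followed by an add, mentioned in the commit above, can be reproduced in isolation. The sketch below is in Python and emulates the fused operation with exact rational arithmetic, so that only one rounding happens at the end. The function names and input values are chosen for illustration and are not taken from the #1332 testcase or from the DAXPY kernels.

```python
from fractions import Fraction


def discrete_multiply_add(a, b, c):
    # Two roundings: the product is rounded to a double, then the sum is rounded.
    product = a * b
    return product + c


def fused_multiply_add(a, b, c):
    # One rounding: compute a*b + c exactly, then round once to the nearest double.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs picked so that the exact product 1 - 2**-60 rounds to 1.0 as a double.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

discrete = discrete_multiply_add(a, b, c)
fused = fused_multiply_add(a, b, c)
```

Here `discrete` is exactly `0.0`, while `fused` keeps the tiny residual `-(2.0 ** -60)`. Either value is a legitimate result; a testcase that then scales or compares such residuals can make the two code paths look far more different than they are.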