2871 Commits

Author SHA1 Message Date
Martin Kroeker
d94d7baf7e Add mips32r2 api target 2018-05-02 20:17:26 +02:00
Martin Kroeker
3af1b5c805 Make cpuid_mips compile again and add 1004K cpu 2018-05-02 20:12:25 +02:00
Martin Kroeker
88e224f4c0 Merge pull request #1542 from martin-frbg/quickdiv64
Avoid out-of-bounds accesses in blas_quickdivide on big X86 systems
2018-05-02 18:11:50 +02:00
Martin Kroeker
d0c0506588 Omit the divide table overflow check on small systems 2018-05-02 14:44:50 +02:00
Martin Kroeker
e93355e5e1 Omit the table overflow check when building for small systems 2018-05-02 14:43:08 +02:00
Martin Kroeker
c1eb06e102 Update common_x86_64.h 2018-04-29 14:40:12 +02:00
Martin Kroeker
8145ecd70b Avoid out-of-bounds reads from blas_quick_divide_table on big systems 2018-04-29 14:38:55 +02:00
Martin Kroeker
26ce518d46 Avoid out of bounds reads from blas_quick_divide_table on big systems
Should fix #1541
2018-04-29 14:34:33 +02:00
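
Note on the commit above: the fix can be pictured as a guard in front of a reciprocal-table lookup. The following is a minimal sketch, not the actual OpenBLAS code; the names `reciprocal_table`, `TABLE_SIZE`, `init_reciprocal_table` and `quick_divide` are hypothetical, and the table size of 65 entries is an assumption based on the "more than 64 cpus" remark elsewhere in this log. The multiply-and-shift result is exact only for small dividends (below 2^26 with this table), which is assumed to hold for thread and work counts.

```c
#include <stdint.h>

#define TABLE_SIZE 65  /* hypothetical: one entry per divisor 0..64 */

static uint32_t reciprocal_table[TABLE_SIZE];

/* Fill entries 2..64 with ceil(2^32 / y); entries 0 and 1 stay unused. */
static void init_reciprocal_table(void) {
    for (uint32_t y = 2; y < TABLE_SIZE; y++) {
        reciprocal_table[y] = (uint32_t)(((UINT64_C(1) << 32) + y - 1) / y);
    }
}

/* Divide x by y using the table where possible. */
static uint32_t quick_divide(uint32_t x, uint32_t y) {
    if (y <= 1) return x;               /* divisors 0 and 1 return x unchanged */
    if (y >= TABLE_SIZE) return x / y;  /* divisor beyond the table: plain division */
    /* Multiply by the stored reciprocal and keep the upper 32 bits. */
    return (uint32_t)(((uint64_t)x * reciprocal_table[y]) >> 32);
}
```

Without the `y >= TABLE_SIZE` branch, a divisor larger than the table would index past its end, which is the kind of out-of-bounds read the commit message describes.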
Martin Kroeker
1d27fa8507 Merge pull request #1539 from martin-frbg/ztrmv-1332
Disable multithreading in ztrmv
2018-04-27 23:10:21 +02:00
Martin Kroeker
802cf6b22d Merge pull request #1486 from martin-frbg/atomic
Use _Atomic instead of volatile for thread safety where C11 is supported
2018-04-27 23:09:57 +02:00
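
Note on the merge above: choosing `_Atomic` only where C11 is supported is typically done with a preprocessor guard. This is a minimal sketch of that idea, not the code from the pull request; the macro `SHARED_FLAG`, the variable `job_status` and both helper functions are hypothetical names.

```c
/* Use _Atomic when the compiler advertises C11 atomics,
   otherwise fall back to volatile as older code did. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && \
    !defined(__STDC_NO_ATOMICS__)
#define SHARED_FLAG _Atomic
#else
#define SHARED_FLAG volatile
#endif

/* Hypothetical status word shared between worker threads. */
static SHARED_FLAG int job_status = 0;

static void set_status(int s) { job_status = s; }
static int get_status(void) { return job_status; }
```

With `_Atomic`, plain loads and stores of `job_status` are sequentially consistent; `volatile` alone gives no such inter-thread ordering guarantee, which is presumably why the change prefers `_Atomic` where it is available.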
Martin Kroeker
954f1832de Merge pull request #1540 from martin-frbg/mips32-zasum
Fix typo in MIPS P5600 complex ASUM code selection
2018-04-25 23:23:00 +02:00
Martin Kroeker
941ad280a8 Fix typo in MIPS P5600 complex ASUM code selection 2018-04-25 22:50:10 +02:00
Martin Kroeker
a8ed428bab Disable multithreading in ztrmv
BLAS-Tester shows that the same problem exists as with DTRMV (issue #1332)
2018-04-25 22:35:46 +02:00
Martin Kroeker
1da365312a Merge pull request #1538 from martin-frbg/arm7utest
Fix handling of zero INCX, INCY in ArmV7 AXPY and ROT
2018-04-25 08:38:58 +02:00
Martin Kroeker
2d0929fa7c Move the test for zero incx,incy in ARMV7 ROT
to pass the related utest (see #1469)
2018-04-24 22:43:00 +02:00
Martin Kroeker
125343cc88 Drop test for zero incx,incy in armv7 AXPY
...to pass the related utest (see #1469)
2018-04-24 22:39:50 +02:00
Martin Kroeker
8a3b6fa108 Use generic zrot.c on ppc64/POWER6 to work around utest failure from … (#1535)
* Use generic C implementation of zrot on ppc64/POWER6 to work around utest failure from #1469
2018-04-23 19:05:49 +02:00
Martin Kroeker
78694f1b7e Merge pull request #1534 from xianyi/revert-1333-haswell32
Revert "Fix 32bit HASWELL builds"
2018-04-22 23:34:17 +02:00
Martin Kroeker
9c5518319a Revert "Fix 32bit HASWELL builds" 2018-04-22 20:20:04 +02:00
Martin Kroeker
86f49c529d Merge pull request #1532 from martin-frbg/utest-cblas
Do not try to build the fork utest when NO_CBLAS=1
2018-04-20 23:44:15 +02:00
Martin Kroeker
625c74a38f fork utest depends on CBLAS 2018-04-20 15:43:59 +02:00
Martin Kroeker
5fcaca6438 fork utest depends on CBLAS 2018-04-20 15:42:13 +02:00
Martin Kroeker
4fcdd24459 Merge pull request #1530 from ashwinyes/develop_20180419_Tx2AutoDetect
ARM64: Enable Auto Detection of ThunderX2T99
2018-04-19 14:10:57 +02:00
Ashwin Sekhar T K
68a3c4fca6 ARM64: Enable Auto Detection of ThunderX2T99 2018-04-19 09:05:25 +00:00
Martin Kroeker
0c4718c57a Merge pull request #1523 from martin-frbg/utest_waith
Include sys/types.h for proper typedefs related to wait()
2018-04-15 13:09:30 +02:00
Martin Kroeker
f29389c7ac Merge pull request #1520 from martin-frbg/cpucounts
Catch invalid cpu count returned by CPU_COUNT_S
2018-04-14 22:24:34 +02:00
Martin Kroeker
734d7c6a93 Include sys/types.h for proper typedefs related to wait()
Should fix #1519
2018-04-14 18:59:46 +02:00
Martin Kroeker
7c861605b2 Catch invalid cpu count returned by CPU_COUNT_S
mips32 was seen to return zero here, driving nthreads to zero with subsequent fpe in blas_quickdivide
2018-04-14 18:29:10 +02:00
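
Note on the commit above: the failure mode it describes (a reported count of zero driving the thread count to zero) can be avoided by clamping the reported value before it is used. A minimal sketch, assuming a hypothetical helper `sanitize_cpu_count` and a caller-supplied upper bound; the actual patch may validate the value differently.

```c
/* Clamp a CPU count reported by the OS into the range [1, max_cpus].
   A result of at least 1 keeps later divisions by the thread count safe. */
static int sanitize_cpu_count(int reported, int max_cpus) {
    if (reported < 1) return 1;               /* zero or negative: assume one CPU */
    if (reported > max_cpus) return max_cpus; /* cap at the configured maximum */
    return reported;
}
```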
Martin Kroeker
2ca0faf495 Merge pull request #1515 from martin-frbg/mipsdot
Correct precision of mips dsdot
2018-04-11 08:21:25 +02:00
Martin Kroeker
0fe434598b Fix precision of mips dsdot 2018-04-10 23:30:59 +02:00
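
Note on the commit above: DSDOT takes single-precision vectors and returns a double-precision dot product, so the accumulation is expected to happen in double precision. The log does not show what the MIPS kernel did wrong, so the following is only a reference sketch of the expected behaviour; the function name `dsdot_sketch` is hypothetical and only non-negative strides are handled.

```c
/* Dot product of two float vectors, accumulated in double precision.
   Assumes incx >= 0 and incy >= 0 for simplicity. */
static double dsdot_sketch(int n, const float *x, int incx,
                           const float *y, int incy) {
    double acc = 0.0;
    for (int i = 0; i < n; i++) {
        /* Widen each operand before multiplying so no float rounding occurs. */
        acc += (double)x[i * incx] * (double)y[i * incy];
    }
    return acc;
}
```

Accumulating in `float` instead would lose the final addend in a case such as 16777216 + 1, since 16777217 is not representable in single precision.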
Martin Kroeker
15c437e092 Merge pull request #1512 from ararslan/aa/travis-macos-2
Add macOS to the Travis testing matrix: Take 2!
2018-04-07 23:31:26 +02:00
Alex Arslan
b966bd79d5 Add a BINARY=32 build to macOS 2018-04-07 12:29:57 -07:00
Alex Arslan
2e988dbf35 Add macOS to the Travis testing matrix 2018-04-07 10:56:34 -07:00
Martin Kroeker
be6090d396 Merge pull request #1511 from xianyi/revert-1510-aa/travis-macos
Revert "Add macOS to the Travis testing matrix"
2018-04-07 13:29:31 +02:00
Martin Kroeker
daae8fd197 Revert "Add macOS to the Travis testing matrix" 2018-04-07 13:27:24 +02:00
Martin Kroeker
20c6c38e51 Merge branch 'develop' into atomic 2018-04-07 12:09:39 +02:00
Martin Kroeker
a1fb7670f7 Merge pull request #1510 from ararslan/aa/travis-macos
Add macOS to the Travis testing matrix
2018-04-07 12:07:12 +02:00
Martin Kroeker
6c99c97489 Merge pull request #1509 from ararslan/aa/dragonfly
Add DragonFly to exports/Makefile
2018-04-07 12:06:57 +02:00
Alex Arslan
6a0930560e Add macOS to the Travis testing matrix 2018-04-06 17:53:58 -07:00
Alex Arslan
24f8d5b624 Add DragonFly to exports/Makefile
Its exclusion was an oversight on my part.
2018-04-06 17:30:10 -07:00
Martin Kroeker
77b4dbd53b Merge pull request #1506 from martin-frbg/issue1497
Fix thread races and infinite looping on systems with many cpus
2018-04-05 23:46:36 +02:00
Martin Kroeker
bc4c3bca01 Merge pull request #1507 from martin-frbg/threads_usage
Underline importance of NUM_THREADS setting for BUFFER allocation
2018-04-05 08:54:07 +02:00
Martin Kroeker
6b0a9d135c Merge pull request #1508 from ararslan/aa/wording
Minor changes to wording and formatting in the README
2018-04-05 08:53:38 +02:00
Alex Arslan
137ccd9dd9 Minor changes to wording and formatting in the README
The wording in some places is not grammatically correct. This change
also provides minor adjustments to the Markdown formatting which provide
modest improvements to readability.
2018-04-04 14:30:32 -07:00
Martin Kroeker
84923dedb7 Merge pull request #1505 from ararslan/aa/compiler
Compile with cc rather than gcc whenever possible
2018-04-04 22:45:33 +02:00
Martin Kroeker
8ec28ff461 Remove unguarded use of _Atomic and fix tabbing 2018-04-04 22:40:30 +02:00
Martin Kroeker
ca8ca796d3 Underline importance of NUM_THREADS setting for BUFFER allocation
following augray's suggestion from #1451, and incorporating ashwinyes' comments from #1141 on the importance of NUM_THREADS even for single-threaded builds.
2018-04-04 22:26:51 +02:00
Alex Arslan
8f811a9312 Reinstate macOS logic 2018-04-04 11:41:45 -07:00
Alex Arslan
36a17536ca Compile with cc rather than gcc whenever possible 2018-04-04 11:26:54 -07:00
Martin Kroeker
bb9876db33 Fix thread races and infinite looping on systems with many cpus
On systems with more than 64 cpus, blas_quickdivide will sometimes return zero which creates bogus workloads when used for the stride calculation. This then leads to threads spinning incessantly waiting for a status change that never happens, as seen in #1497.
This patch also fixes several data races that were found by helgrind and/or tsan while debugging the issue.
2018-04-04 18:16:52 +02:00