On systems with more than 64 CPUs, blas_quickdivide sometimes returns zero, which creates bogus workloads when the result is used for the stride calculation. Threads then spin indefinitely, waiting for a status change that never happens, as seen in #1497. This patch also fixes several data races that were found by helgrind and/or tsan while debugging the issue.
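
The failure mode described above can be illustrated with a minimal sketch. This is not the OpenBLAS source: the table size, the function names (`init_table`, `quickdivide_unguarded`, `quickdivide_guarded`, `chunk_width`), and the choice to model an out-of-range divisor as a zero reciprocal are all assumptions made for illustration. The idea is that a division implemented through a fixed-size reciprocal table has nothing valid to look up once the divisor (the thread count) exceeds the table, so the per-thread width can come out as zero unless the code falls back to an ordinary division.

```c
#include <assert.h>

#define TABLE_MAX 64 /* hypothetical: largest divisor the table covers */

/* Hypothetical reciprocal table: entry y holds ceil(2^32 / y) for y = 2..64. */
unsigned int recip_table[TABLE_MAX + 1];

void init_table(void) {
    for (unsigned int y = 2; y <= TABLE_MAX; y++)
        recip_table[y] = (unsigned int)((0x100000000ULL + y - 1) / y);
}

/* Table-only division. A divisor beyond the table is modeled here as a
   zero reciprocal (an assumption), so the quotient collapses to zero. */
unsigned int quickdivide_unguarded(unsigned int x, unsigned int y) {
    if (y <= 1) return x;
    unsigned int r = (y <= TABLE_MAX) ? recip_table[y] : 0;
    return (unsigned int)(((unsigned long long)x * r) >> 32);
}

/* Guarded variant: fall back to a plain division when the divisor is
   outside the range the table covers. */
unsigned int quickdivide_guarded(unsigned int x, unsigned int y) {
    if (y <= 1) return x;
    if (y > TABLE_MAX) return x / y;
    return (unsigned int)(((unsigned long long)x * recip_table[y]) >> 32);
}

/* Hypothetical stride calculation: ceil(remaining / nthreads), computed
   with whichever divide function is passed in. */
unsigned int chunk_width(unsigned int remaining, unsigned int nthreads,
                         unsigned int (*divide)(unsigned int, unsigned int)) {
    assert(nthreads > 0);
    return divide(remaining + nthreads - 1, nthreads);
}
```

In this sketch, after calling `init_table()`, `chunk_width(1000, 96, quickdivide_unguarded)` yields a width of zero, which is the kind of empty workload that leaves a worker with nothing to complete, while the guarded variant yields a nonzero width for the same inputs.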
| Name |
|---|
| getf2 |
| getrf |
| getrs |
| laswp |
| lauu2 |
| lauum |
| potf2 |
| potrf |
| trti2 |
| trtri |
| CMakeLists.txt |
| Makefile |