On systems with more than 64 CPUs, blas_quickdivide can return zero, which produces bogus workloads when the result is used for the stride calculation. Threads then spin indefinitely, waiting for a status change that never happens, as seen in #1497. This patch also fixes several data races that Helgrind and/or TSan reported while debugging the issue.

| File |
|---|
| Makefile |
| getrf_parallel.c |
| getrf_parallel_omp.c |
| getrf_single.c |
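
The stride problem described in the commit message can be illustrated with a minimal sketch. This is not the patched OpenBLAS code: `fast_divide_sketch`, `safe_stride`, and `TABLE_MAX` are hypothetical names, and the stand-in divide simply returns zero for divisors above 64 to model the reported behaviour. The sketch only shows why a zero quotient must not be used as a stride, and one way to guard against it by falling back to ordinary integer division.

```c
#include <assert.h>

/* Hypothetical limit: the stand-in fast divide is only valid up to this divisor. */
#define TABLE_MAX 64u

/* Hypothetical stand-in for a table-based fast divide. For divisors above
   TABLE_MAX it returns 0, which models the failure described above. */
static unsigned int fast_divide_sketch(unsigned int x, unsigned int y) {
    if (y == 0u || y > TABLE_MAX) {
        return 0u;
    }
    return x / y;
}

/* Compute a per-thread stride (rounded up) that is never zero.
   A zero stride would hand out empty or overlapping work ranges, and a
   worker waiting on such a range might never see its status change. */
static unsigned int safe_stride(unsigned int n, unsigned int nthreads) {
    unsigned int width;

    if (nthreads == 0u) {
        nthreads = 1u;
    }

    /* Ceiling division via the fast path. */
    width = fast_divide_sketch(n + nthreads - 1u, nthreads);

    /* Fall back to plain division if the fast path produced zero. */
    if (width == 0u) {
        width = (n + nthreads - 1u) / nthreads;
    }

    /* Clamp so that callers can always advance by at least one element. */
    if (width == 0u) {
        width = 1u;
    }

    assert(width >= 1u);
    return width;
}
```

With these assumptions, `safe_stride(1000, 8)` takes the fast path, while `safe_stride(1000, 128)` falls back to plain division because the divisor exceeds the hypothetical table limit.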