On systems with more than 64 CPUs, blas_quickdivide sometimes returns zero, which creates bogus workloads when the result is used for the stride calculation. Threads then spin indefinitely, waiting for a status change that never happens, as seen in #1497. This patch also fixes several data races found by helgrind and/or tsan while debugging the issue. |
||
---|---|---|
Makefile | ||
getrf_parallel.c | ||
getrf_parallel_omp.c | ||
getrf_single.c |
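
The zero-stride failure described in the commit message can be illustrated with a small sketch. It is not the code from this patch: `quickdivide_sketch`, `TABLE_MAX`, and `chunk_width` are hypothetical names, and the stand-in simply returns zero for every divisor above 64, a simplification of the "sometimes returns zero" behaviour reported above. The guard shows one possible way to keep a per-thread width from collapsing to zero.

```c
#include <assert.h>

/* Hypothetical limit for the fast path; an assumption for illustration. */
#define TABLE_MAX 64u

/* Hypothetical stand-in for a table-based fast divide. It is only valid
   for divisors up to TABLE_MAX and models the failure by returning 0
   beyond that. This is NOT the real blas_quickdivide. */
static unsigned int quickdivide_sketch(unsigned int x, unsigned int y) {
    if (y <= 1u) return x;
    if (y > TABLE_MAX) return 0u;   /* out-of-range divisor: bogus result */
    return x / y;
}

/* Per-thread chunk width for n work items split over nthreads threads.
   If the fast divide yields 0 although there is work to do, fall back to
   a plain ceiling division so every chunk is non-empty. */
static unsigned int chunk_width(unsigned int n, unsigned int nthreads) {
    if (nthreads == 0u) nthreads = 1u;          /* avoid division by zero */
    unsigned int w = quickdivide_sketch(n + nthreads - 1u, nthreads);
    if (w == 0u && n > 0u) {
        w = (n + nthreads - 1u) / nthreads;     /* ceiling of n / nthreads */
    }
    assert(n == 0u || w > 0u);                  /* a zero width would never make progress */
    return w;
}
```

With a zero width, a loop that advances by the width never covers the work, so any thread waiting on the corresponding status flag would wait forever. In this sketch, `chunk_width(1000, 128)` falls back to the plain division instead of returning zero.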