OpenBLAS/lapack/getrf
Ali Saidi 208c7e7ca5 Use acq/rel semantics to pass flags/pointers in getrf_parallel.
The current implementation uses locks, but each lock protects a
critical section containing only one variable, so atomic reads/writes
with barriers can be used to achieve the same behavior.

As with the previous patch, pthread_mutex_lock isn't fair, so in a
tight loop the thread that last held the lock can keep reacquiring it
and starve another thread, even if that thread is about to write
the data that would stop the current thread from spinning.

On a 64-core Arm system this improves performance by 20x on sgesv.goto.
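
The pattern described above can be sketched with C11 atomics. This is a minimal illustration, not the code in getrf_parallel.c: the slot_t type, its fields, and the producer, consumer, and run_handoff functions are all hypothetical names chosen for the example. One thread publishes a plain value and then sets a flag with a release store; the other spins on the flag with acquire loads, so it never contends for a mutex and cannot be starved by an unfair lock.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical single-slot mailbox; NOT the data structure used by OpenBLAS. */
typedef struct {
    int payload;       /* plain data published by the producer */
    atomic_int ready;  /* flag: 0 = empty, 1 = payload is valid */
    int input;         /* value the producer will publish */
    int seen;          /* value the consumer observed */
} slot_t;

static void *producer(void *arg) {
    slot_t *s = arg;
    s->payload = s->input;
    /* Release store: the payload write above becomes visible to any
       thread that later observes ready == 1 with an acquire load. */
    atomic_store_explicit(&s->ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg) {
    slot_t *s = arg;
    /* Spin on the flag without taking a lock. */
    while (atomic_load_explicit(&s->ready, memory_order_acquire) == 0) {
        /* busy-wait */
    }
    s->seen = s->payload;  /* ordered after the acquire load */
    return NULL;
}

/* Runs one handoff and returns the value the consumer saw,
   or -1 if the consumer thread could not be created. */
int run_handoff(int value) {
    slot_t s = { .payload = 0, .input = value, .seen = -1 };
    pthread_t c, p;

    atomic_init(&s.ready, 0);
    if (pthread_create(&c, NULL, consumer, &s) != 0) {
        return -1;
    }
    if (pthread_create(&p, NULL, producer, &s) != 0) {
        producer(&s);  /* fall back to publishing from this thread */
    } else {
        pthread_join(p, NULL);
    }
    pthread_join(c, NULL);
    return s.seen;
}
```

Calling run_handoff(42) returns 42: the acquire load that sees the flag set also guarantees that the earlier payload write is visible. Each flag here guards exactly one variable, which is the situation the commit message describes as safe to convert from a mutex to an atomic with acquire/release ordering.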
2020-03-06 06:22:31 +00:00
..
Makefile Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
getrf_parallel.c Use acq/rel semantics to pass flags/pointers in getrf_parallel. 2020-03-06 06:22:31 +00:00
getrf_parallel_omp.c prepared lapack/getrf functions for UNROLL values, that are not a power of two 2017-01-09 12:57:26 +01:00
getrf_single.c LAPACK helpers in C that need care too 2018-01-02 14:38:50 +01:00
potrf_parallel.c Change _STDC_VERSION__ to __STDC_VERSION__ 2018-05-11 12:15:08 +08:00