OpenBLAS/lapack
Ali Saidi 208c7e7ca5 Use acq/rel semantics to pass flags/pointers in getrf_parallel.
The current implementation uses locks, but each lock's critical
section covers only a single variable, so atomic reads/writes
with barriers can be used to achieve the same behavior.
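
The idea of replacing a one-variable critical section with an acquire/release pair can be sketched in C11 as below. This is a minimal illustration and not the getrf_parallel code itself: `slot_t`, `producer`, `consume`, and `handoff_demo` are hypothetical names, and the sketch assumes a toolchain with `<stdatomic.h>` and POSIX threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for one hand-off slot; not the actual
   job structure used by getrf_parallel. */
typedef struct {
    int value;              /* input chosen by the caller                */
    int payload;            /* plain data written before publishing      */
    _Atomic(int *) ready;   /* NULL until the payload may safely be read */
} slot_t;

static void *producer(void *arg) {
    slot_t *s = arg;
    s->payload = s->value;  /* ordinary, non-atomic write */
    /* Release store: the payload write above is visible to any thread
       whose acquire load observes this non-NULL pointer. */
    atomic_store_explicit(&s->ready, &s->payload, memory_order_release);
    return NULL;
}

/* Spin until the pointer is published, then read through it. */
static int consume(slot_t *s) {
    int *p;
    while ((p = atomic_load_explicit(&s->ready, memory_order_acquire)) == NULL) {
        /* busy-wait; there is no mutex for the spinning thread to hold */
    }
    return *p;
}

/* Runs one hand-off and returns the value the consumer observed. */
int handoff_demo(int value) {
    slot_t s;
    s.value = value;
    s.payload = 0;
    atomic_init(&s.ready, (int *)NULL);

    pthread_t t;
    int rc = pthread_create(&t, NULL, producer, &s);
    assert(rc == 0);
    (void)rc;

    int seen = consume(&s);
    pthread_join(t, NULL);
    return seen;
}
```

Because the spinning reader never takes a lock, it cannot prevent the writer from publishing, which is the starvation scenario the commit message describes for an unfair mutex.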

As with the previous patch, pthread_mutex_lock isn't fair, so in a
tight loop the thread that holds the lock can keep reacquiring it,
starving another thread even if that thread is about to write
the data that would stop the current thread from spinning.

On a 64c Arm system this improves performance by 20x on sgesv.goto.
2020-03-06 06:22:31 +00:00
getf2 Refs #723. Avoid out of boundary for getf2. 2016-01-26 09:14:57 -06:00
getrf Use acq/rel semantics to pass flags/pointers in getrf_parallel. 2020-03-06 06:22:31 +00:00
getrs Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
laswp add missing brackets to silence indentation warnings gcc721 2018-01-19 23:11:12 +01:00
lauu2 Fix lapack complex implementation of lauu2 and potf2 for Android (use FLOAT instead of FLOAT[2] as imaginary part is not used). 2016-02-04 16:59:56 -05:00
lauum prepared lapack/lauum for UNROLL values, that are not a power of two 2017-01-11 07:29:17 +01:00
potf2 Fix lapack complex implementation of lauu2 and potf2 for Android (use FLOAT instead of FLOAT[2] as imaginary part is not used). 2016-02-04 16:59:56 -05:00
potrf prepared lapack/potrf functions for UNROLL values, that are not a power of two 2017-01-10 10:50:28 +01:00
trti2 LAPACK helpers in C that need care too 2018-01-02 14:38:50 +01:00
trtri address minor warnings from gcc7 2019-09-07 10:21:08 +03:00
trtrs fix Makefile 2019-09-10 17:11:01 -04:00
CMakeLists.txt Correct generation of GETRF files by the CMAKE build 2020-02-15 19:29:14 +01:00
Makefile add missing objects 2019-09-08 11:14:49 -04:00