When NUM_THREADS (MAX_CPU_NUMBER) is very large, e.g. 256, consider:
typedef struct {
volatile BLASLONG working[MAX_CPU_NUMBER][CACHE_LINE_SIZE * DIVIDE_RATE];
} job_t;
job_t job[MAX_CPU_NUMBER];
The job array then takes 8 MB, which is too large to place safely on the stack.
Thus, we use malloc instead of stack allocation.
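
The size estimate and the heap allocation can be sketched as follows. This is a minimal illustration, not the actual patch: the values of CACHE_LINE_SIZE (8) and DIVIDE_RATE (2), the 8-byte BLASLONG, and the helper names job_array_bytes and alloc_jobs are all assumptions made for the example.

```c
#include <stdlib.h>

/* Assumed values for illustration; a real build may define these differently. */
#define MAX_CPU_NUMBER  256
#define CACHE_LINE_SIZE 8
#define DIVIDE_RATE     2

/* Assumption: BLASLONG is an 8-byte integer, as on a typical 64-bit build. */
typedef long long BLASLONG;

typedef struct {
  volatile BLASLONG working[MAX_CPU_NUMBER][CACHE_LINE_SIZE * DIVIDE_RATE];
} job_t;

/* Hypothetical helper: bytes needed for one job_t per thread.
 * 256 * (256 * 16 * 8 bytes) = 8 MiB under the assumptions above. */
static size_t job_array_bytes(void) {
  return (size_t)MAX_CPU_NUMBER * sizeof(job_t);
}

/* Hypothetical helper: allocate the job array on the heap instead of
 * declaring `job_t job[MAX_CPU_NUMBER];` as a local variable.
 * Returns NULL on failure; the caller must free() the result. */
static job_t *alloc_jobs(void) {
  return (job_t *)malloc(job_array_bytes());
}
```

A caller would check the result of alloc_jobs() for NULL, use it as job[i].working[j][k], and call free() when the computation is done. Under these assumptions each job_t is 32 KiB, so the array scales quadratically with MAX_CPU_NUMBER.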
| Name |
|---|
| Makefile |
| getrf_parallel.c |
| getrf_parallel_omp.c |
| getrf_single.c |