23 Commits

Author SHA1 Message Date
Martin Kroeker 1dea57ab25
Revert PR #3250 (shortcut without buffer allocation) as it is unsafe on some x86_64 2021-07-14 20:32:57 +02:00
Martin Kroeker 7bb59fceb7
Clean up some warnings 2021-07-11 16:00:29 +02:00
Martin Kroeker f0e7345fb8
Add shortcut for small-size gemv_n with increments of one 2021-05-26 22:02:34 +02:00
Chen, Guobing a7b1f9b1bb Implementation of BF16 based gemv
1. Add a new API -- sbgemv to support bfloat16 based gemv
2. Implement a generic kernel for sbgemv
3. Implement an avx512-bf16 based kernel for sbgemv

Signed-off-by: Chen, Guobing <guobing.chen@intel.com>
2020-10-29 02:08:23 +08:00
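As a rough illustration of what a generic (non-vectorized) bfloat16 gemv has to do, here is a minimal sketch. It relies on the fact that a bfloat16 value is the upper 16 bits of an IEEE-754 single-precision float. The names `bf16_t`, `bf16_to_float` and `ref_sbgemv_n` are hypothetical and are not taken from the commit; the sketch assumes a column-major matrix, unit increments and no transposition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t bf16_t;  /* hypothetical name: raw bfloat16 storage */

/* Widen bfloat16 to float by placing its 16 bits in the upper half
 * of a 32-bit word and reinterpreting that word as a float. */
static float bf16_to_float(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reference y = alpha*A*x + beta*y with bfloat16 A and x, float y.
 * A is m x n, column-major, leading dimension lda; unit strides only. */
static void ref_sbgemv_n(size_t m, size_t n, float alpha,
                         const bf16_t *a, size_t lda,
                         const bf16_t *x, float beta, float *y)
{
    size_t i, j;

    assert(lda >= m);

    /* Scale y first; beta == 0 overwrites y instead of multiplying it. */
    for (i = 0; i < m; i++)
        y[i] = (beta == 0.0f) ? 0.0f : beta * y[i];

    /* Accumulate one column of A at a time, in single precision. */
    for (j = 0; j < n; j++) {
        float xj = alpha * bf16_to_float(x[j]);
        for (i = 0; i < m; i++)
            y[i] += bf16_to_float(a[i + j * lda]) * xj;
    }
}
```

An AVX512-BF16 kernel would replace the inner loops with vector instructions; the sketch only shows the arithmetic being computed.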
Martin Kroeker 933896a1d0
Use blasabs to switch between abs and labs as needed for INTERFACE64 2018-08-04 20:06:49 +02:00
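A minimal sketch of the idea behind this commit: when the integer type used by the interface is 64 bits wide, `abs` (which takes an `int`) can truncate its argument, so the absolute-value call has to follow the integer type. The macro name `blasabs` comes from the commit message. The `USE64BITINT` switch, the `blasint` typedef as written here and the helper `vector_span` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption: USE64BITINT stands in for the INTERFACE64 build setting. */
#ifdef USE64BITINT
typedef long blasint;
#define blasabs(x) labs(x)   /* long argument, long result */
#else
typedef int blasint;
#define blasabs(x) abs(x)    /* int argument, int result */
#endif

/* Hypothetical helper: number of elements spanned by a strided vector
 * of n entries with increment inc (BLAS increments may be negative). */
static blasint vector_span(blasint n, blasint inc)
{
    assert(inc != 0);
    if (n <= 0)
        return 0;
    return 1 + (n - 1) * blasabs(inc);
}
```

Whether `long` is actually 64 bits wide depends on the platform, so a real build may need a different type and function pair than the one assumed here.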
Jerome Robert 1fe3aab047 Use GEMM_MULTITHREAD_THRESHOLD as a number of ops
...not a matrix size. For GEMM_MULTITHREAD_THRESHOLD=4
(the default value) this does not change anything, but
for other values it makes the GEMM and GEMV thresholds
change in the same way.

Close #742
2016-01-24 11:31:40 +01:00
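The change described above can be sketched as a cutoff on the amount of work (roughly m*n for GEMV) that scales with `GEMM_MULTITHREAD_THRESHOLD`. The function name `gemv_thread_count` and the constant `OPS_PER_THRESHOLD_UNIT` are hypothetical; the value 2304 is an illustrative assumption chosen so that the default threshold of 4 gives a 96x96 cutoff, and it is not stated in the commit message.

```c
#include <assert.h>

#define GEMM_MULTITHREAD_THRESHOLD 4   /* default value per the commit message */
#define OPS_PER_THRESHOLD_UNIT 2304L   /* illustrative assumption */

/* Decide how many threads to use for an m x n gemv.
 * The cutoff is expressed in operations, so changing the threshold
 * scales the GEMV cutoff in proportion. */
static int gemv_thread_count(long m, long n, int available)
{
    assert(m >= 0 && n >= 0 && available >= 1);

    if (m * n < OPS_PER_THRESHOLD_UNIT * GEMM_MULTITHREAD_THRESHOLD)
        return 1;          /* too little work to amortize threading */
    return available;
}
```

With these assumed values, a 95x96 matrix stays single-threaded and a 96x96 matrix uses all available threads.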
Jerome Robert 87a2ccc37c Factorize MAX_STACK_ALLOC code to common_stackalloc.h
Ref #727
2016-01-08 16:03:52 +01:00
Jerome Robert f9890a6452 Fix compilation when MAX_STACK_ALLOC is not set
Close #722
2015-12-31 14:43:09 +01:00
Zhang Xianyi 640cccc2b1 Refs #697. Fixed gemv bug for Windows.
Thanks to matzeri for the patch.
2015-11-30 15:19:45 -06:00
Zhang Xianyi dcd5ba4443 Merge branch 'cmake' of https://github.com/hpanderson/OpenBLAS into hpanderson_cmake 2015-07-22 04:06:39 +08:00
Jerome Robert ab567d8443 gemv: Ensure stack buffer is large enough to handle memory alignment
Ref #478
2015-04-24 10:12:49 +02:00
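The issue named in this commit can be sketched as follows: if the kernel needs an aligned buffer and the raw stack array may start at any address, the array must be over-allocated by up to `align - 1` bytes so that the aligned payload still fits. The helper names `align_up` and `padded_size` are hypothetical, and the 32-byte alignment used in the usage note is an assumption rather than a value from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a pointer up to the next multiple of align (a power of two). */
static void *align_up(void *p, size_t align)
{
    uintptr_t addr = (uintptr_t)p;

    assert(align != 0 && (align & (align - 1)) == 0);
    return (void *)((addr + (align - 1)) & ~(uintptr_t)(align - 1));
}

/* Bytes to reserve so that `payload` bytes fit after aligning,
 * wherever the raw buffer happens to start. */
static size_t padded_size(size_t payload, size_t align)
{
    return payload + (align - 1);
}
```

For example, a 100-byte payload aligned to 32 bytes needs a raw buffer of `padded_size(100, 32)`, that is 131 bytes; passing the raw buffer to `align_up` then yields a pointer with at least 100 usable bytes behind it.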
Zhang Xianyi 847e19c04e Refs #478, #482. Enable stack alloc for s/dgemv_t (reverts 9798481). 2015-04-20 23:22:40 -05:00
Zhang Xianyi fd9fd42936 Refs #478, #482. Fixed bug on previous commit. 2015-04-13 23:22:27 -05:00
Zhang Xianyi 9798481979 Refs #478, #482. Fix segfault bug for gemv_t with the MAX_STACK_ALLOC flag.
For gemv_t, use malloc directly to create the buffer.
2015-04-13 19:45:27 -05:00
Hank Anderson e74462a3f5 Moved declarations to start of functions to satisfy MSVC C89 implementation. 2015-02-11 11:16:57 -06:00
Jerome Robert b17ccb4c5c Fix a segfault in gemv when MAX_STACK_ALLOC is set
* stack_alloc_size is needed after the implementation call,
but it may be overwritten if it is optimized into a register,
because some gemv implementations (e.g. dgemv_n.S) do not
restore all registers (e.g. r10).
* Do the same in ger.c for the same reason, even though the bug
has not been observed there.
2015-01-29 09:55:57 +01:00
Jerome Robert e9d9a8eae3 Allow gemv and ger buffer allocation on the stack
ger and gemv call blas_memory_alloc/free, which in turn
call blas_lock. blas_lock creates thread contention when matrices
are small and the number of threads is high enough. We avoid
calling blas_memory_alloc by replacing it with stack allocation.
This can be enabled with:
make -DMAX_STACK_ALLOC=2048
The given size (in bytes) must be high enough to avoid thread contention
and small enough to avoid stack overflow.

Fix #478
2014-12-27 14:33:12 +01:00
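The approach described in this commit message can be sketched as below: small requests use a buffer on the stack, so no lock is taken, and larger ones fall back to a heap allocator. This is a simplified illustration, not the code from the commit. `run_with_buffer` and `sum_with_scratch` are hypothetical names, `malloc` stands in for `blas_memory_alloc`, and the sketch always reserves the full `MAX_STACK_ALLOC` bytes on the stack, which the real code may not do.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STACK_ALLOC 2048   /* bytes; the value from the commit message */

/* Hypothetical stand-in for a kernel that needs scratch space. */
static double sum_with_scratch(const double *x, size_t n, double *scratch)
{
    double s = 0.0;
    size_t i;

    memcpy(scratch, x, n * sizeof *x);
    for (i = 0; i < n; i++)
        s += scratch[i];
    return s;
}

/* Pick a stack buffer for small requests and the heap otherwise.
 * *used_heap reports which path was taken. */
static double run_with_buffer(const double *x, size_t n, int *used_heap)
{
    double stack_buffer[MAX_STACK_ALLOC / sizeof(double)];
    size_t bytes = n * sizeof(double);
    double *buffer = stack_buffer;
    double result;

    assert(x != NULL && used_heap != NULL);

    *used_heap = bytes > MAX_STACK_ALLOC;
    if (*used_heap) {
        buffer = malloc(bytes);        /* stand-in for the locking allocator */
        if (buffer == NULL)
            abort();
    }

    result = sum_with_scratch(x, n, buffer);

    if (*used_heap)
        free(buffer);                  /* only the heap path needs cleanup */
    return result;
}
```

The trade-off matches the one stated in the message: a larger limit keeps more calls away from the lock, while a smaller one reduces the risk of stack overflow.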
wernsaar f511807fc0 modified multithreading threshold 2014-09-08 12:27:32 +02:00
wernsaar d1800397f5 optimized interface/gemv.c for multithreading 2014-09-02 17:36:07 +02:00
wernsaar f4ff889491 updated interface/gemv.c for multithreading 2014-09-02 16:30:04 +02:00
wernsaar b985cea65d adjust number of threads for sgemv and dgemv 2014-07-15 16:04:46 +02:00
Timothy Gu 6c2ead30f0 Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
Xianyi Zhang 342bbc3871 Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00