338 Commits

Author SHA1 Message Date
Martin Kroeker 8229c163b7
Use runtime check for AVX512 (sgemm_direct) capability when using DYNAMIC_ARCH 2020-03-26 21:12:56 +01:00
Martin Kroeker 6a14b34c20
Avoid calling DIRECT codepath in DYNAMIC_ARCH on non-SKX 2020-03-22 14:33:16 +01:00
Martin Kroeker d65e9a2bbd
Merge pull request #2253 from thrasibule/xerbla
fix error messages
2020-01-18 20:39:04 +01:00
Guillaume Horel 2463938879 fix error message 2019-09-11 10:35:25 -04:00
Guillaume Horel 5d6525c87c more bugfix 2019-09-10 17:30:57 -04:00
Guillaume Horel 459bb9291d fix error codes 2019-09-10 17:10:33 -04:00
Guillaume Horel 5997b6b491 bugfix 2019-09-08 11:14:49 -04:00
Guillaume Horel 7ec7b999a5 add missing file 2019-09-08 11:14:49 -04:00
Guillaume Horel af9ac0898a fix Makefile 2019-09-08 11:14:49 -04:00
Guillaume Horel 9b2f0323d6 update Makefile 2019-09-08 11:14:49 -04:00
Guillaume Horel ea747cf933 start working on ?trtrs 2019-09-08 11:14:49 -04:00
luz.paz daf2fec12d Misc. typo fixes
Found via `codespell -q 3 -w -L ith,als,dum,nd,amin,nto,wis,ba -S ./relapack,./kernel,./lapack-netlib`
2019-04-29 17:03:56 -04:00
Martin Kroeker 268c28db7d
Merge pull request #2095 from martin-frbg/trsm
Correct length of name string in xerbla call
2019-04-28 09:55:25 +02:00
Martin Kroeker 0bd956fd21 Correct length of name string in xerbla call 2019-04-27 22:49:04 +02:00
Martin Kroeker 79cfc24a62
Add interface for ?sum (derived from ?asum) 2019-03-30 21:59:18 +01:00
Martin Kroeker c19a449096
Merge pull request #2071 from martin-frbg/issue2068
Provide CBLAS interfaces to I?MIN and I?MAX
2019-03-30 14:54:28 +01:00
Martin Kroeker 3d1e36d4cb
Build CBLAS interfaces for I?MIN and I?MAX 2019-03-30 12:38:41 +01:00
Martin Kroeker e29b0cfcc4
Allow multithreading TRMV again
revert workaround introduced for issue #1332 as the actual cause appears to be my incorrect fix from #1262 (see #1388)
2019-02-19 21:03:30 +01:00
Martin Kroeker 8533aca964
Avoid penalizing tall skinny matrices 2019-01-23 10:03:00 +01:00
Martin Kroeker cda81cfae0
Shift transition to multithreading towards larger matrix sizes
See #1886 and JuliaRobotics issue 500. trsm benchmarks on Haswell and Zen showed that with these values performance is roughly doubled for matrix sizes between 8x8 and 14x14, and still 10 to 20 percent better near the new cutoff at 32x32.
2019-01-19 00:10:01 +01:00
Arjan van de Ven cdc668d82b Add a "sgemm direct" mode for small matrixes
OpenBLAS has a fancy algorithm for copying the input data while laying
it out in a more CPU friendly memory layout.

This is great for large matrixes; the cost of the copy is easily
amortized by the gains from the better memory layout.

But for small matrixes (on CPUs that can do efficient unaligned loads) this
copy can be a net loss.

This patch adds (for SKYLAKEX initially) a "sgemm direct" mode, that bypasses
the whole copy machinery for ALPHA=1/BETA=0/... standard arguments,
for small matrixes only.

What is small? For the non-threaded case this has been measured to be
in the M*N*K = 28 * 512 * 512 range, while in the threaded case it's
less, around M*N*K = 1 * 512 * 512
2018-12-13 13:47:31 +00:00
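The size heuristic described in the commit message above can be sketched as a small predicate. This is only an illustration: the function name, the macro names, and the use of a simple M*N*K product are assumptions, the cutoffs are the approximate figures quoted in the message, and the further "standard argument" conditions elided in the message are not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cutoffs, taken from the approximate figures in the commit message. */
#define DIRECT_LIMIT_SINGLE   (28.0 * 512.0 * 512.0)
#define DIRECT_LIMIT_THREADED (1.0 * 512.0 * 512.0)

/* Illustrative predicate, not the OpenBLAS implementation: returns true when
   a small SGEMM call with ALPHA=1 and BETA=0 could skip the copy step. */
static bool use_direct_path(size_t m, size_t n, size_t k,
                            float alpha, float beta, bool threaded)
{
    if (alpha != 1.0f || beta != 0.0f)
        return false;                      /* only the plain C = A*B case */

    double work  = (double)m * (double)n * (double)k;
    double limit = threaded ? DIRECT_LIMIT_THREADED : DIRECT_LIMIT_SINGLE;
    return work <= limit;                  /* small enough that copying is a net loss */
}
```

With these assumed limits, an 8x8x8 product qualifies in both modes, while a 28x512x512 product qualifies only in the non-threaded case.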
Martin Kroeker 5393759a98
Merge pull request #1869 from martin-frbg/axpy0
Handle special case INCX=0,INCY=0 in the axpy interface
2018-11-25 20:52:49 +01:00
Martin Kroeker c171b8ad13
Handle special case INCX=0,INCY=0 in the axpy interface 2018-11-13 13:57:18 +01:00
Martin Kroeker 96d2f2c9b2
Merge pull request #1831 from brada4/hemv
disable threading in C/ZSWAP copying from S/DSWAP
2018-11-07 08:49:21 +01:00
Andrew 2992e3886a disable threading in C/ZSWAP copying from S/DSWAP 2018-10-22 23:21:49 +03:00
Martin Kroeker e3c262e5cf
Merge pull request #1825 from brada4/hemv
Delay _hemv threading in attempt to address #1820
2018-10-21 20:34:05 +02:00
Andrew a293bdcd5e re-arrange new code for readability 2018-10-20 21:37:53 +03:00
Andrew c7bbf9c987 Attempt to tame _hemv threading #1820 2018-10-20 11:13:29 +03:00
Ashwin Sekhar T K 21f46a1cf2 ARM64: Use THUNDERX2T99 Neon Kernels for ARMV8
Currently the generic ARMV8 target uses C implementations
for many routines. Replace these with the neon implementations
written for THUNDERX2T99 target which are up to 6x faster for
certain routines.
2018-10-17 10:44:37 -07:00
Martin Kroeker b991570210
Merge pull request #1762 from martin-frbg/issue1710-2
Add explicit casts to silence compiler warnings
2018-09-19 18:16:21 +02:00
Martin Kroeker f3c262156e
Add an explicit cast to silence a warning
for #1710
2018-09-13 14:24:29 +02:00
Martin Kroeker 30f5a69ab8
Add explicit cast to silence a warning
for #1710
2018-09-13 14:23:31 +02:00
Martin Kroeker 4a553e8678
Merge pull request #1713 from martin-frbg/issue1710
Introduce blasabs macro and use it to switch between abs and labs for INTERFACE64
2018-08-04 23:51:31 +02:00
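A minimal sketch of the idea behind a `blasabs`-style macro, assuming a 64-bit interface build is signalled by a preprocessor symbol. The symbol `SKETCH_INTERFACE64` and the helper `stride_magnitude` are hypothetical, and the actual definition in OpenBLAS may differ (for example on platforms where `long` is 32 bits wide).

```c
#include <stdlib.h>

/* SKETCH_INTERFACE64 is a hypothetical switch standing in for the real build flag. */
#ifdef SKETCH_INTERFACE64
typedef long blasint;
#define blasabs(x) labs(x)   /* integer arguments are long: abs() would truncate */
#else
typedef int blasint;
#define blasabs(x) abs(x)    /* integer arguments are int */
#endif

/* Hypothetical helper: magnitude of a possibly negative stride. */
static blasint stride_magnitude(blasint incx)
{
    return blasabs(incx);
}
```

Routing every absolute-value call through one macro keeps the choice between `abs` and `labs` in a single place instead of at each call site.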
Martin Kroeker 165f00c159
fabs -> fabsl 2018-08-04 20:14:51 +02:00
Martin Kroeker 933896a1d0
Use blasabs to switch between abs and labs as needed for INTERFACE64 2018-08-04 20:06:49 +02:00
Steven G. Johnson a4e321400b
fabs -> fabsl
Fixes two calls that were using `fabs` on a `long double` argument rather than `fabsl`, which looks like it is doing an unintentional truncation to `double` precision.
2018-08-03 13:00:10 -04:00
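The truncation mentioned in the commit message above can be reproduced in isolation. The sketch below uses two hypothetical helper names; the explicit cast spells out the conversion that happens implicitly when a `long double` is passed to `fabs`.

```c
#include <float.h>
#include <math.h>

/* Mimics the pre-fix call: fabs() takes a double, so a long double argument
   is converted to double first (written here as an explicit cast). */
static long double abs_truncated(long double x)
{
    return fabs((double)x);
}

/* Mimics the fixed call: fabsl() keeps the full long double precision. */
static long double abs_full(long double x)
{
    return fabsl(x);
}
```

On platforms where `long double` has more mantissa bits than `double`, the two helpers disagree for a value such as `-(1.0L + LDBL_EPSILON)`. Where the two types have the same precision, they return the same result.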
Martin Kroeker 9cf22b7d91
Build cblas_iXamin interfaces 2018-06-23 13:27:30 +02:00
Craig Donner c2545b0fd6 Fixed a few more unnecessary calls to num_cpu_avail.
I don't have as many benchmarks for these as for gemm, but it should still
make a difference for small matrices.
2018-06-11 10:17:16 +01:00
Craig Donner 66316b9f4c Improve performance of GEMM for small matrices when SMP is defined.
Always checking num_cpu_avail() regardless of whether threading will actually
be used adds noticeable overhead for small matrices.  Most other uses of
num_cpu_avail() do so only if threading will be used, so do the same here.
2018-06-07 15:29:13 +01:00
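The ordering change described in the commit message above can be sketched as follows: decide first whether the problem is large enough to thread, and only then ask how many threads are available. Every name here is hypothetical; `query_available_threads` is a stand-in for a call such as `num_cpu_avail()`, and the work limit is an arbitrary illustrative value.

```c
#include <stddef.h>

/* Counts how often the (assumed costly) query is made, for illustration. */
static int query_calls = 0;

/* Stand-in for the real thread-count query; always reports 4 threads here. */
static int query_available_threads(void)
{
    query_calls++;
    return 4;
}

/* Hypothetical cutoff below which threading is not worthwhile. */
#define SMALL_WORK_LIMIT 10000.0

static int choose_threads(size_t m, size_t n, size_t k)
{
    double work = (double)m * (double)n * (double)k;
    if (work <= SMALL_WORK_LIMIT)
        return 1;                       /* small problem: skip the query entirely */
    return query_available_threads();   /* large problem: the query cost is negligible */
}
```

For a 4x4x4 problem this sketch returns 1 without touching the query at all, which is the overhead the commit message describes removing.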
Martin Kroeker e8880c1699
Use a single thread for small input size
copies daxpy improvement from #27, see #1560
2018-06-07 10:26:55 +02:00
Martin Kroeker 1d27fa8507
Merge pull request #1539 from martin-frbg/ztrmv-1332
Disable multithreading in ztrmv
2018-04-27 23:10:21 +02:00
Martin Kroeker a8ed428bab
Disable multithreading in ztrmv
BLAS-Tester shows that the same problem exists as with DTRMV (issue #1332)
2018-04-25 22:35:46 +02:00
Martin Kroeker 809fd0d451
Rewrite ROTMG to address cases not covered by the netlib algorithm (#1480)
* Rewrite ROTMG based on the new implementation in GONUM based on the algorithm proposed by Tim Hopkins, see issue 1452 for the reference
* Correct ROTMG utest for issue1452 and add another from gonum, also correct transposition of expected and observed values in error messages
2018-03-04 17:39:56 +01:00
Martin Kroeker 72f14a0363
Fix conditionals in the rescaling against GAMSQ 2018-02-18 12:54:52 +01:00
Martin Kroeker 798f1595d5
Fix condition in both second scaling loops 2018-02-18 12:37:09 +01:00
Martin Kroeker 0464aa6784
Remove debug printfs 2018-02-09 23:06:50 +01:00
Martin Kroeker 55840f0bc9
Keep the flag handling separate from the scaling loops
Fixes #1452 and is more in line with how ATLAS does it. The earlier fix from #356 only moved the bug elsewhere, but we will never want the iterative rescaling to change the dflag setting and variable associations with each cycle.
2018-02-09 23:00:03 +01:00
Andrew 47deec2c1a fix couple of dead assignment warnings 2017-12-22 00:56:35 +01:00
Martin Kroeker 38763ec4f3
Disable multithreading for trmv
as a (hopefully temporary) workaround for #1332
2017-12-03 22:40:54 +01:00
Martin Kroeker 9251a2efde
Merge pull request #1359 from brada4/develop
Eliminate mode variable where not needed in syrk interface
2017-11-18 23:47:17 +01:00
Martin Kroeker b46e2b57cc
Make return parameter of cblas_Xdotc_sub, cblas_Xdotu_sub a void pointer as well 2017-11-18 20:28:02 +01:00
Martin Kroeker 3ce401f51b
Make last parameter of cblas_Xdotc_sub/cblas_Xdotu_sub a void pointer as well 2017-11-18 18:58:40 +01:00
Andrew 27575d200a Eliminate mode variable where not needed 2017-11-15 15:32:38 +01:00
Martin Kroeker 2c222f1faa
Modify complex CBLAS functions to take void pointers
Modify complex CBLAS functions to take void pointers instead of float or double arguments (to bring the prototypes in line with netlib and other implementations' cblas.h)
2017-11-05 15:53:14 +01:00
Martin Kroeker 742f54c235 Merge pull request #1303 from martin-frbg/imatcopy-rowscols
Fix cols/rows mixup in omatcopy 2nd step for BlasTrans cases
2017-09-14 21:46:26 +02:00
Martin Kroeker d674fbb4c7 Fix cols/rows mixup in omatcopy 2nd step for BlasTrans cases
Equivalent of #1244 (issue #899) for the non-complex cases. Fixes #1289
2017-09-14 19:59:05 +02:00
Martin Kroeker 46c9357c72 Merge pull request #1288 from quickwritereader/develop
Optimized standard Blas Level-1,2 (excluding nrm2 functions) for z13 (double precision). Issue 884
2017-09-09 23:47:17 +02:00
Abdurrauf 1cfdb2295d Optimized standard Blas Level-1,2 (excluding nrm2 functions) for z13 (double precision) 2017-09-06 16:41:08 +04:00
Martin Kroeker 00740c0e34 Merge pull request #1290 from martin-frbg/imatcopy
Use in-place transform shortcut only if matrix is square
2017-09-03 13:02:10 +02:00
Martin Kroeker 254db9bd7c Use in-place transform shortcut only if matrix is square 2017-09-03 09:52:55 +02:00
Isuru Fernando d245caa49a Support out-of-source build 2017-08-01 15:16:14 +05:30
Martin Kroeker 376048156b Use in-place transform shortcut only if matrix is square 2017-07-21 11:20:15 +02:00
Martin Kroeker d1c5b8f913 Add files via upload 2017-07-20 20:51:06 +02:00
Martin Kroeker 91bde7d315 Exchange rows and cols in final omatcopy with BlasTrans
This is MicMuc's patch from #899
2017-07-15 22:02:53 +02:00
Martin Kroeker 1e06b49854 Update xerbla.c 2017-04-26 20:29:30 +02:00
Martin Kroeker 7f546f54fa Add cblas_xerbla 2017-04-26 20:01:34 +02:00
Martin Kroeker a809431e34 Add cblas_xerbla() 2017-04-26 19:58:59 +02:00
Andrew 99880f7906 Address unlikely memleak in zimatcopy interface (#1129)
* fix unlikely memleak in zimatcopy interface

* fix only unlikely memleak in zimatcopy interface

* fix only unlikely memleak in zimatcopy interface
2017-03-16 13:13:31 +01:00
Martin Kroeker 211d2eceb5 Update zdot.c 2017-03-13 18:08:00 +01:00
Martin Kroeker 5813ed095b Update zdot.c 2017-03-13 17:49:07 +01:00
Martin Kroeker e44b028fe5 Replace gnu _real_, _imag_ extensions in initializers 2017-03-13 00:40:11 +01:00
Ashwin Sekhar T K 071a830e8b THUNDERX2T99: Add optimized S/D/C/Z SWAP Implementations 2017-02-03 03:55:06 -08:00
Werner Saar dd6212e684 updated some level1 functions that are not thread safe 2017-01-10 14:05:07 +01:00
jiahaipeng 84b8170bfb Adding multi-threading for copy, dot, rot, and asum functions 2017-01-10 11:48:58 +08:00
Werner Saar ae4ac6f984 removed obj-files, that are moved to lapack 3.7.0 2017-01-06 16:14:53 +01:00
Jerome Robert d346c533b1 Fix z/ctrmv stack allocation on AMD bulldozer and barcelona target
* Hopefully, because this was found by trial and error (dark magic)
* Ref #786
2016-06-07 16:11:09 +02:00
Werner Saar f04af36ad0 Merge pull request #898 from wernsaar/develop
added experimental support for optimized lapack fortran functions
2016-05-31 14:13:52 +02:00
Werner Saar 41000c8443 added directory for optimized lapack Fortran codes and added dlaqr5.f 2016-05-31 12:53:07 +02:00
John Biddiscombe 053044ae4d Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
2016-05-25 09:13:28 +02:00
Jerome Robert 40af513669 Disable multi-threading in swap
* Close #873
2016-05-16 13:07:55 +00:00
Jerome Robert 16ec5323c9 Fix zgemv.c compilation when stack allocation is disabled 2016-02-08 12:05:02 +01:00
Jerome Robert 5fc2203d8a zgemv: Add a workaround for #746 2016-02-08 11:25:15 +01:00
Jerome Robert 78dcf5c3d5 Improve performances of ztrmv on small matrices
* Use stack allocation
* Disable multi-threading
* Ref #727
2016-02-08 11:25:02 +01:00
Jerome Robert 32f793195f Use stack allocation in zgemv and zger
For better performance with small matrices
Ref #727
2016-02-08 11:24:21 +01:00
Jerome Robert 1fe3aab047 Use GEMM_MULTITHREAD_THRESHOLD as a number of ops
...not a matrix size. For GEMM_MULTITHREAD_THRESHOLD=4
(the default value) this does not change anything, but
for other values it makes the GEMM and GEMV thresholds
change in the same way.

Close #742
2016-01-24 11:31:40 +01:00
Jerome Robert 1a1935507b [z]ger: increase multithread threshold
The ones given in 3ae30cd were by far too low because I
mixed up m and m*n in my measurements. Note that the new ones
are close to the [z]gemv ones, which suggests
that both are right.
2016-01-24 10:46:35 +01:00
Jerome Robert 66eafb16cf swap: disable multi-threading for small matrices
Close #731
2016-01-19 17:14:46 +01:00
Jerome Robert 3ae30cd6b9 Disable multi-threading for small matrices in [z]ger
Ref #731
2016-01-19 17:14:31 +01:00
Jerome Robert 87a2ccc37c Factorize MAX_STACK_ALLOC code to common_stackalloc.h
Ref #727
2016-01-08 16:03:52 +01:00
Jerome Robert f9890a6452 Fix compilation when MAX_STACK_ALLOC is not set
Close #722
2015-12-31 14:43:09 +01:00
Zhang Xianyi 285d042b10 Fixed rotg bug on ARM. 2015-12-14 10:07:01 -06:00
Zhang Xianyi 640cccc2b1 Refs #697. Fixed gemv bug for Windows.
Thanks to matzeri's patch.
2015-11-30 15:19:45 -06:00
Ralph Campbell 55a0b27c01 Minor C code fixes in interface/ 2015-11-09 14:15:49 +05:30
Zhang Xianyi 2feef49fa8 Merge branch 'develop' into cmake
Conflicts:
	driver/others/memory.c
2015-10-26 14:54:34 -05:00
Zhang Xianyi 5a291606ad Refs #671. The return value of i?max cannot be larger than N. 2015-10-24 01:16:34 +08:00
Zhang Xianyi 8fade093aa Fixed cmake bug on Visual Studio. 2015-10-20 14:37:22 -05:00
Zhang Xianyi 94b125255f Merge branch 'develop' into cmake
Conflicts:
	driver/others/memory.c
2015-10-13 04:46:08 +08:00
Zhang Xianyi baec8f5cac Refs #638. Fixed compiling bug with clang on Mac OS X. 2015-09-10 10:32:07 -05:00
Martin Koehler 711ca33bc6 Improved Ximatcopy when lda==ldb.
The Ximatcopy functions create a copy of the input matrix
although they seem to work inplace. The new routines
XIMATCOPY_K_YY perform the operations inplace if the leading
dimension does not change.
2015-09-07 14:36:16 +02:00
Zhang Xianyi f874465bb8 Use cmake to build OpenBLAS GENERIC Target on MSVC x86 64-bit.
Disable CBLAS and LAPACK.
2015-08-10 14:10:44 -05:00
Zhang Xianyi dcd5ba4443 Merge branch 'cmake' of https://github.com/hpanderson/OpenBLAS into hpanderson_cmake 2015-07-22 04:06:39 +08:00
Werner Saar f8f2e261fe use only 1 thread if m or n < 2*GEMM_MULTITHREAD_THRESHOLD 2015-05-06 10:41:53 +02:00
Jerome Robert ab567d8443 gemv: Ensure stack buffer is large enough to handle memory alignment
Ref #478
2015-04-24 10:12:49 +02:00
Zhang Xianyi 847e19c04e Refs #478,#482, Enable stack alloc for s/dgemv_t.(revert 9798491) 2015-04-20 23:22:40 -05:00
Zhang Xianyi fd9fd42936 Refs #478, #482. Fixed bug on previous commit. 2015-04-13 23:22:27 -05:00
Zhang Xianyi 9798481979 Refs #478, #482. Fix segfault bug for gemv_t with MAX_ALLOC_STACK flag.
For gemv_t, directly use malloc to create the buffer.
2015-04-13 19:45:27 -05:00
Zhang Xianyi cdefdb21cd Refs #492. Fixed c/zsyr bug with negative incx. 2015-02-26 06:37:03 +08:00
Hank Anderson 0d8e227ea7 Changed strategy for setting preprocessor definitions.
Instead of generating separate object files for each permutation of
defines for a source file, GenerateNamedObjects now writes an entirely
new source file and inserts the defines as #define c statements.

This solves a problem I ran into with ar.exe where it was refusing to
link objects that had the same filename despite having different paths.
2015-02-24 12:26:33 -06:00
Hank Anderson b2284647a3 More complex objects. 2015-02-23 07:51:05 -06:00
Hank Anderson a6116e5859 Added some more complex-only objects. 2015-02-22 17:49:28 -06:00
Hank Anderson 67e39bd8fb Added mangled complex filenames to interface and lapack CMakeLists.txt. 2015-02-17 13:12:30 -06:00
Hank Anderson 9eb1499095 Added another param to GenerateNamedObjects to mangle complex source names.
There are a lot of sources for complex float types that are the same
names as the real sources, except with z prepended.
2015-02-17 10:30:28 -06:00
Martin Koehler 39cc6b21d3 Add ATLAS-style ?geadd function 2015-02-16 13:46:20 +01:00
Hank Anderson 4662a0b13a Changed generate functions to iterate through a list of float types.
This will generate obj files for SINGLE/DOUBLE/COMPLEX/DOUBLE COMPLEX.
2015-02-15 17:44:37 -06:00
Hank Anderson e74462a3f5 Moved declarations to start of functions to satisfy MSVC C89 implementation. 2015-02-11 11:16:57 -06:00
Hank Anderson e8c39138c6 Removed return value from GenerateNamedObjects.
It sets DBLAS_OBJS directly to save a bunch of list appending in the
CMakeLists.txt files.
2015-02-09 12:28:09 -06:00
Hank Anderson 58cff2fed8 Added CBLAS define/naming convention to GenerateNamedObjects. 2015-02-04 11:30:15 -06:00
Hank Anderson 5690cf3f0e Added override for function names in GenerateNamedObjects.
The BLAS interface folder should now generate the correct objects
for the DOUBLE case.
2015-02-04 10:52:19 -06:00
Hank Anderson a0aeda6187 Added function to set defines for the object names (e.g. -DNAME=dgemm). 2015-02-04 10:37:34 -06:00
Hank Anderson 20e593a44a Added cblas_ objects to interface CMakeLists.
Naming isn't right, though, not seeing cblas_xxxx exports in the
resulting library.
2015-02-02 16:25:30 -06:00
Hank Anderson 9e154aba58 Added LAPACK object files to interface CMakeLists. 2015-02-02 12:31:15 -06:00
Hank Anderson 5057a4b4df Added openblas add_library call that uses DBLAS_OBJS objects. 2015-01-30 15:21:21 -06:00
Hank Anderson a6cf8aafc0 Updated level3/CMakeLists with correct defines using all combos. 2015-01-30 11:21:50 -06:00
Jerome Robert b17ccb4c5c Fix a segfault in gemv when MAX_STACK_ALLOC is set
* stack_alloc_size is needed after the implementation call
but it may be overwritten if it's optimized to a register,
because some gemv implementations (e.g. dgemv_n.S) do not
restore all registers (e.g. r10).
* do the same in ger.c for the same reasons even if the bug
has not been observed.
2015-01-29 09:55:57 +01:00
Hank Anderson 5eefe18ae4 Added CMakeLists.txt for the first of the BLAS folders.
It only does the double precision compile currently.

I realized I didn't finish converting Makefile.system yet, so I made
a note of that.
2015-01-27 16:17:17 -06:00
Jerome Robert e9d9a8eae3 Allow to do gemv and ger buffer allocation on the stack
ger and gemv call blas_memory_alloc/free, which in turn
call blas_lock. blas_lock creates thread contention when matrices
are small and the number of threads is high enough. We avoid
calling blas_memory_alloc by replacing it with stack allocation.
This can be enabled with:
make -DMAX_STACK_ALLOC=2048
The given size (in bytes) must be high enough to avoid thread contention
and small enough to avoid stack overflow.

Fix #478
2014-12-27 14:33:12 +01:00
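The trade-off described in the commit message above (a scratch buffer on the stack for small inputs, a heap allocation otherwise) can be sketched as below. This is not the OpenBLAS code: `scaled_sum` is a hypothetical routine, a fixed-size local array stands in for whatever stack allocation mechanism the library uses, and `malloc` stands in for `blas_memory_alloc`. The 2048-byte limit is the value quoted in the message.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_STACK_ALLOC 2048   /* bytes, value quoted in the commit message */

/* Hypothetical routine: scales x by alpha into a scratch buffer and sums it.
   Returns 1 if the scratch buffer lived on the stack, 0 if it came from the
   heap, and -1 if the heap allocation failed. */
static int scaled_sum(const double *x, size_t n, double alpha, double *result)
{
    double stack_buf[MAX_STACK_ALLOC / sizeof(double)];
    size_t bytes = n * sizeof(double);
    int on_stack = bytes <= sizeof(stack_buf);
    double *buf = stack_buf;

    if (!on_stack) {                 /* too large: fall back to the heap */
        buf = malloc(bytes);
        if (buf == NULL)
            return -1;
    }

    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        buf[i] = alpha * x[i];       /* scratch use of the buffer */
        sum += buf[i];
    }
    *result = sum;

    if (!on_stack)
        free(buf);
    return on_stack;
}
```

Small inputs never reach the allocator, so they cannot contend on any lock it takes, while the size cap keeps the stack frame bounded.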
wernsaar 9e829ce98f enabled cblas gemm3m functions 2014-09-20 17:20:02 +02:00
wernsaar d49fd33885 disabled SYMM3M and HEMM3M functions because of segment violations 2014-09-20 15:27:40 +02:00
wernsaar 7aae4a62e7 enabled use of GEMM3M functions 2014-09-20 14:27:10 +02:00
wernsaar 3300f5ebff optimized multithreading lower limits 2014-09-15 11:38:25 +02:00
wernsaar fd2478c9e2 optimized interface/zgemv.c for multithreading 2014-09-12 19:18:23 +02:00
Zhang Xianyi 1cba8e7b11 Merge pull request #446 from grisuthedragon/cblas_matcopy
Add a CBLAS interface for the BLAS extension s/d/c/z*matcopy routines.
2014-09-10 16:31:31 +08:00
Martin Koehler a057e5434d add CBLAS interface for s/d/c/zimatcopy 2014-09-09 09:52:13 +02:00
Martin Köhler 7794766d3c Add cblas_(s/d/c/z)omatcopy in order to have cblas interface for them. 2014-09-08 17:57:44 +02:00
wernsaar f511807fc0 modified multithreading threshold 2014-09-08 12:27:32 +02:00
wernsaar d1800397f5 optimized interface/gemv.c for multithreading 2014-09-02 17:36:07 +02:00
wernsaar f4ff889491 updated interface/gemv.c for multithreading 2014-09-02 16:30:04 +02:00
wernsaar 51413925bd adjust number of threads for small size in cgemv and zgemv 2014-07-15 16:27:02 +02:00
wernsaar b985cea65d adjust number of threads for sgemv and dgemv 2014-07-15 16:04:46 +02:00
wernsaar d286daa2ba adjusted number of threads for small size 2014-07-15 14:41:35 +02:00
wernsaar cedc1f4b14 Ref #410: disabled optimized potri functions ( single threading bug) 2014-07-10 13:42:32 +02:00
wernsaar 02a504c0b8 fixed my bug in ger.c 2014-07-02 10:39:33 +02:00
wernsaar be94db096c disabled *3M functions for x86_64 platforms 2014-07-01 16:18:05 +02:00
wernsaar aee61456a4 disabled SMP for sbmv and zsbmv again 2014-06-29 21:18:38 +02:00
wernsaar 01a119abfc enabled SMP for sbmv and zsbmv, but only for 64bit binaries 2014-06-29 20:35:56 +02:00
wernsaar 1fad2b759f enabled smp for ger.c and zger.c, but only for 64bit binaries 2014-06-29 16:43:04 +02:00
Timothy Gu 6c2ead30f0 Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
wernsaar 15d5dfa92c fixed compiler warnings 2014-06-25 11:32:44 +02:00
wernsaar 86d8c8978b Ref #391: disabled SMP in ger.c and zger.c 2014-06-22 12:01:24 +02:00
wernsaar a19d209005 Ref #103: enhancement for small matrix dimensions 2014-06-18 15:04:11 +02:00