Martin Kroeker
b46e2b57cc
Make return parameter of cblas_Xdotc_sub, cblas_Xdotu_sub a void pointer as well
2017-11-18 20:28:02 +01:00
Martin Kroeker
3ce401f51b
Make last parameter of cblas_Xdotc_sub/cblas_Xdotu_sub a void pointer as well
2017-11-18 18:58:40 +01:00
Andrew
27575d200a
Eliminate mode variable where not needed
2017-11-15 15:32:38 +01:00
Martin Kroeker
2c222f1faa
Modify complex CBLAS functions to take void pointers
...
Modify complex CBLAS functions to take void pointers instead of float or double arguments (to bring the prototypes in line with netlib and other implementations' cblas.h)
2017-11-05 15:53:14 +01:00
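The prototype change described in 2c222f1faa can be illustrated with a small sketch. It assumes the netlib convention that a complex vector is passed as an untyped pointer to interleaved (real, imaginary) pairs. `demo_cdotu_sub` is a hypothetical stand-in written for this note, not the OpenBLAS implementation, and it only handles positive increments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in that mirrors the void-pointer style of the
 * cblas_Xdotu_sub prototypes. It is NOT the OpenBLAS implementation. */
void demo_cdotu_sub(int n, const void *x, int incx,
                    const void *y, int incy, void *dotu)
{
    /* Assumption: complex data is stored as interleaved float pairs. */
    const float *xf = (const float *)x;
    const float *yf = (const float *)y;
    float *out = (float *)dotu;
    float re = 0.0f, im = 0.0f;

    assert(incx > 0 && incy > 0); /* sketch only: no negative strides */

    for (int i = 0; i < n; i++) {
        const float *a = xf + 2 * (size_t)i * (size_t)incx;
        const float *b = yf + 2 * (size_t)i * (size_t)incy;
        /* Unconjugated complex product a * b, accumulated. */
        re += a[0] * b[0] - a[1] * b[1];
        im += a[0] * b[1] + a[1] * b[0];
    }
    out[0] = re; /* real part of the result */
    out[1] = im; /* imaginary part of the result */
}
```

With void pointers, a caller can pass a plain `float` array, a `float _Complex` array, or an array of two-float structs without first casting to one specific element type.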
Martin Kroeker
742f54c235
Merge pull request #1303 from martin-frbg/imatcopy-rowscols
...
Fix cols/rows mixup in omatcopy 2nd step for BlasTrans cases
2017-09-14 21:46:26 +02:00
Martin Kroeker
d674fbb4c7
Fix cols/rows mixup in omatcopy 2nd step for BlasTrans cases
...
Equivalent of #1244 (issue #899 ) for the non-complex cases. Fixes #1289
2017-09-14 19:59:05 +02:00
Martin Kroeker
46c9357c72
Merge pull request #1288 from quickwritereader/develop
...
Optimized standard Blas Level-1,2 (excluding nrm2 functions) for z13 (double precision). Issue 884
2017-09-09 23:47:17 +02:00
Abdurrauf
1cfdb2295d
Optimized standard Blas Level-1,2 (excluding nrm2 functions) for z13 (double precision)
2017-09-06 16:41:08 +04:00
Martin Kroeker
00740c0e34
Merge pull request #1290 from martin-frbg/imatcopy
...
Use in-place transform shortcut only if matrix is square
2017-09-03 13:02:10 +02:00
Martin Kroeker
254db9bd7c
Use in-place transform shortcut only if matrix is square
2017-09-03 09:52:55 +02:00
Isuru Fernando
d245caa49a
Support out-of-source build
2017-08-01 15:16:14 +05:30
Martin Kroeker
376048156b
Use in-place transform shortcut only if matrix is square
2017-07-21 11:20:15 +02:00
Martin Kroeker
d1c5b8f913
Add files via upload
2017-07-20 20:51:06 +02:00
Martin Kroeker
91bde7d315
Exchange rows and cols in final omatcopy with BlasTrans
...
This is MicMuc's patch from #899
2017-07-15 22:02:53 +02:00
Martin Kroeker
1e06b49854
Update xerbla.c
2017-04-26 20:29:30 +02:00
Martin Kroeker
7f546f54fa
Add cblas_xerbla
2017-04-26 20:01:34 +02:00
Martin Kroeker
a809431e34
Add cblas_xerbla()
2017-04-26 19:58:59 +02:00
Andrew
99880f7906
Address unlikely memleak in zimatcopy interface ( #1129 )
...
* fix unlikely memleak in zimatcopy interface
* fix only unlikely memleak in zimatcopy interface
2017-03-16 13:13:31 +01:00
Martin Kroeker
211d2eceb5
Update zdot.c
2017-03-13 18:08:00 +01:00
Martin Kroeker
5813ed095b
Update zdot.c
2017-03-13 17:49:07 +01:00
Martin Kroeker
e44b028fe5
Replace gnu _real_, _imag_ extensions in initializers
2017-03-13 00:40:11 +01:00
Ashwin Sekhar T K
071a830e8b
THUNDERX2T99: Add optimized S/D/C/Z SWAP Implementations
2017-02-03 03:55:06 -08:00
Werner Saar
dd6212e684
updated some level1 functions that are not thread safe
2017-01-10 14:05:07 +01:00
jiahaipeng
84b8170bfb
Adding multi-threading for copy, dot, rot, and asum functions
2017-01-10 11:48:58 +08:00
Werner Saar
ae4ac6f984
removed obj-files, that are moved to lapack 3.7.0
2017-01-06 16:14:53 +01:00
Jerome Robert
d346c533b1
Fix z/ctrmv stack allocation on AMD bulldozer and barcelona target
...
* Hopefully, because this was found by trial and error (dark magic)
* Ref #786
2016-06-07 16:11:09 +02:00
Werner Saar
f04af36ad0
Merge pull request #898 from wernsaar/develop
...
added experimental support for optimized lapack fortran functions
2016-05-31 14:13:52 +02:00
Werner Saar
41000c8443
added directory for optimized lapack fortran codes and added dlaqr5.f
2016-05-31 12:53:07 +02:00
John Biddiscombe
053044ae4d
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
...
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
2016-05-25 09:13:28 +02:00
Jerome Robert
40af513669
Disable multi-threading in swap
...
* Close #873
2016-05-16 13:07:55 +00:00
Jerome Robert
16ec5323c9
Fix zgemv.c compilation when stack allocation is disabled
2016-02-08 12:05:02 +01:00
Jerome Robert
5fc2203d8a
zgemv: Add a workaround for #746
2016-02-08 11:25:15 +01:00
Jerome Robert
78dcf5c3d5
Improve performance of ztrmv on small matrices
...
* Use stack allocation
* Disable multi-threading
* Ref #727
2016-02-08 11:25:02 +01:00
Jerome Robert
32f793195f
Use stack allocation in zgemv and zger
...
For better performance with small matrices
Ref #727
2016-02-08 11:24:21 +01:00
Jerome Robert
1fe3aab047
Use GEMM_MULTITHREAD_THRESHOLD as a number of ops
...
...not a matrix size. For GEMM_MULTITHREAD_THRESHOLD=4
(the default value) this does not change anything, but
for other values it makes the GEMM and GEMV thresholds
change in the same way.
Close #742
2016-01-24 11:31:40 +01:00
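The idea in 1fe3aab047 of treating the threshold as an operation count rather than a matrix dimension can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the scale constant `DEMO_OPS_PER_THRESHOLD_UNIT` are hypothetical and chosen for the example, and the exact formula used in the OpenBLAS sources may differ.

```c
#include <assert.h>

/* Hypothetical scale factor: operations represented by one unit of the
 * threshold setting. The value is an assumption made for this sketch. */
#define DEMO_OPS_PER_THRESHOLD_UNIT 65536.0

/* Returns 1 if a GEMM of size m x n x k should run on a single thread,
 * 0 otherwise. The decision depends on the product m*n*k (a number of
 * operations), not on any single dimension. */
int demo_gemm_use_single_thread(int m, int n, int k, int threshold)
{
    assert(m >= 0 && n >= 0 && k >= 0 && threshold >= 0);

    /* Use double to avoid integer overflow for large problem sizes. */
    double ops = (double)m * (double)n * (double)k;
    return ops <= DEMO_OPS_PER_THRESHOLD_UNIT * (double)threshold;
}
```

Because the cutoff scales linearly with the setting, a GEMV-style check built on the same operation count would move in step with it when the setting changes, which is the consistency the commit message describes.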
Jerome Robert
1a1935507b
[z]ger: increase multithread threshold
...
The ones given in 3ae30cd
were by far too low because I
mixed up m and m*n in my measurements. Note that the new ones
are close to the [z]gemv ones, which is a comforting sign
that both are right.
2016-01-24 10:46:35 +01:00
Jerome Robert
66eafb16cf
swap: disable multi-threading for small matrices
...
Close #731
2016-01-19 17:14:46 +01:00
Jerome Robert
3ae30cd6b9
Disable multi-threading for small matrices in [z]ger
...
Ref #731
2016-01-19 17:14:31 +01:00
Jerome Robert
87a2ccc37c
Factorize MAX_STACK_ALLOC code to common_stackalloc.h
...
Ref #727
2016-01-08 16:03:52 +01:00
Jerome Robert
f9890a6452
Fix compilation when MAX_STACK_ALLOC is not set
...
Close #722
2015-12-31 14:43:09 +01:00
Zhang Xianyi
285d042b10
Fixed rotg bug on ARM.
2015-12-14 10:07:01 -06:00
Zhang Xianyi
640cccc2b1
Refs #697 . Fixed gemv bug for Windows.
...
Thanks to matzeri for the patch.
2015-11-30 15:19:45 -06:00
Ralph Campbell
55a0b27c01
Minor C code fixes in interface/
2015-11-09 14:15:49 +05:30
Zhang Xianyi
2feef49fa8
Merge branch 'develop' into cmake
...
Conflicts:
driver/others/memory.c
2015-10-26 14:54:34 -05:00
Zhang Xianyi
5a291606ad
Refs #671 . The return value of i?max cannot be larger than N.
2015-10-24 01:16:34 +08:00
Zhang Xianyi
8fade093aa
Fixed cmake bug on Visual Studio.
2015-10-20 14:37:22 -05:00
Zhang Xianyi
94b125255f
Merge branch 'develop' into cmake
...
Conflicts:
driver/others/memory.c
2015-10-13 04:46:08 +08:00
Zhang Xianyi
baec8f5cac
Refs #638 . Fixed compiling bug with clang on Mac OS X.
2015-09-10 10:32:07 -05:00
Martin Koehler
711ca33bc6
Improved Ximatcopy when lda==ldb.
...
The Ximatcopy functions create a copy of the input matrix
although they seem to work inplace. The new routines
XIMATCOPY_K_YY perform the operations inplace if the leading
dimension does not change.
2015-09-07 14:36:16 +02:00
Zhang Xianyi
f874465bb8
Use cmake to build OpenBLAS GENERIC Target on MSVC x86 64-bit.
...
Disable CBLAS and LAPACK.
2015-08-10 14:10:44 -05:00
Zhang Xianyi
dcd5ba4443
Merge branch 'cmake' of https://github.com/hpanderson/OpenBLAS into hpanderson_cmake
2015-07-22 04:06:39 +08:00
Werner Saar
f8f2e261fe
use only 1 thread if m or n < 2*GEMM_MULTITHREAD_THRESHOLD
2015-05-06 10:41:53 +02:00
Jerome Robert
ab567d8443
gemv: Ensure stack buffer is large enough to handle memory alignment
...
Ref #478
2015-04-24 10:12:49 +02:00
Zhang Xianyi
847e19c04e
Refs #478,#482, Enable stack alloc for s/dgemv_t.(revert 9798491)
2015-04-20 23:22:40 -05:00
Zhang Xianyi
fd9fd42936
Refs #478 , #482 . Fixed bug on previous commit.
2015-04-13 23:22:27 -05:00
Zhang Xianyi
9798481979
Refs #478 , #482 . Fix segfault bug for gemv_t with MAX_ALLOC_STACK flag.
...
For gemv_t, directly use malloc to create the buffer.
2015-04-13 19:45:27 -05:00
Zhang Xianyi
cdefdb21cd
Refs #492 . Fixed c/zsyr bug with negative incx.
2015-02-26 06:37:03 +08:00
Hank Anderson
0d8e227ea7
Changed strategy for setting preprocessor definitions.
...
Instead of generating separate object files for each permutation of
defines for a source file, GenerateNamedObjects now writes an entirely
new source file and inserts the defines as #define c statements.
This solves a problem I ran into with ar.exe where it was refusing to
link objects that had the same filename despite having different paths.
2015-02-24 12:26:33 -06:00
Hank Anderson
b2284647a3
More complex objects.
2015-02-23 07:51:05 -06:00
Hank Anderson
a6116e5859
Added some more complex-only objects.
2015-02-22 17:49:28 -06:00
Hank Anderson
67e39bd8fb
Added mangled complex filenames to interface and lapack CMakeLists.txt.
2015-02-17 13:12:30 -06:00
Hank Anderson
9eb1499095
Added another param to GenerateNamedObjects to mangle complex source names.
...
There are a lot of sources for complex float types that are the same
names as the real sources, except with z prepended.
2015-02-17 10:30:28 -06:00
Martin Koehler
39cc6b21d3
Add ATLAS-style ?geadd function
2015-02-16 13:46:20 +01:00
Hank Anderson
4662a0b13a
Changed generate functions to iterate through a list of float types.
...
This will generate obj files for SINGLE/DOUBLE/COMPLEX/DOUBLE COMPLEX.
2015-02-15 17:44:37 -06:00
Hank Anderson
e74462a3f5
Moved declarations to start of functions to satisfy MSVC C89 implementation.
2015-02-11 11:16:57 -06:00
Hank Anderson
e8c39138c6
Removed return value from GenerateNamedObjects.
...
It sets DBLAS_OBJS directly to save a bunch of list appending in the
CMakeLists.txt files.
2015-02-09 12:28:09 -06:00
Hank Anderson
58cff2fed8
Added CBLAS define/naming convention to GenerateNamedObjects.
2015-02-04 11:30:15 -06:00
Hank Anderson
5690cf3f0e
Added override for function names in GenerateNamedObjects.
...
The BLAS interface folder should now generate the correct objects
for the DOUBLE case.
2015-02-04 10:52:19 -06:00
Hank Anderson
a0aeda6187
Added function to set defines for the object names (e.g. -DNAME=dgemm).
2015-02-04 10:37:34 -06:00
Hank Anderson
20e593a44a
Added cblas_ objects to interface CMakeLists.
...
Naming isn't right, though, not seeing cblas_xxxx exports in the
resulting library.
2015-02-02 16:25:30 -06:00
Hank Anderson
9e154aba58
Added LAPACK object files to interface CMakeLists.
2015-02-02 12:31:15 -06:00
Hank Anderson
5057a4b4df
Added openblas add_library call that uses DBLAS_OBJS objects.
2015-01-30 15:21:21 -06:00
Hank Anderson
a6cf8aafc0
Updated level3/CMakeLists with correct defines using all combos.
2015-01-30 11:21:50 -06:00
Jerome Robert
b17ccb4c5c
Fix a segfault in gemv when MAX_STACK_ALLOC is set
...
* stack_alloc_size is needed after the implementation call
but it may be overwritten if it's optimized to a register,
because some gemv implementations (e.g. dgemv_n.S) do not
restore all registers (e.g. r10).
* do the same in ger.c for the same reasons even if the bug
has not been observed.
2015-01-29 09:55:57 +01:00
Hank Anderson
5eefe18ae4
Added CMakeLists.txt for the first of the BLAS folders.
...
It only does the double precision compile currently.
I realized I didn't finish converting Makefile.system yet, so I made
a note of that.
2015-01-27 16:17:17 -06:00
Jerome Robert
e9d9a8eae3
Allow to do gemv and ger buffer allocation on the stack
...
ger and gemv call blas_memory_alloc/free, which in turn
call blas_lock. blas_lock creates thread contention when matrices
are small and the number of threads is high enough. We avoid
calling blas_memory_alloc by replacing it with stack allocation.
This can be enabled with:
make -DMAX_STACK_ALLOC=2048
The given size (in bytes) must be high enough to avoid thread contention
and small enough to avoid stack overflow.
Fix #478
2014-12-27 14:33:12 +01:00
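The pattern described in e9d9a8eae3, a small scratch buffer on the stack with a fallback to the general allocator, can be sketched roughly as below. All names here (`DEMO_MAX_STACK_ALLOC`, `demo_reverse_copy`) are hypothetical, and plain `malloc`/`free` stand in for the library's own locked allocator; this is not the code from ger.c or gemv.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical byte limit for the stack buffer, mirroring the idea of
 * the MAX_STACK_ALLOC setting. */
#define DEMO_MAX_STACK_ALLOC 2048

/* Copies x into y in reverse order through a scratch buffer.
 * Returns 1 if the stack buffer was used, 0 if the heap was used,
 * and -1 if the heap allocation failed. */
int demo_reverse_copy(const double *x, size_t n, double *y)
{
    double stack_buf[DEMO_MAX_STACK_ALLOC / sizeof(double)];
    double *buf = stack_buf;
    int used_stack = 1;

    assert(x != NULL && y != NULL);

    /* Fall back to the heap only when the request exceeds the limit. */
    if (n > sizeof stack_buf / sizeof stack_buf[0]) {
        buf = malloc(n * sizeof *buf);
        if (buf == NULL)
            return -1;
        used_stack = 0;
    }

    for (size_t i = 0; i < n; i++)
        buf[i] = x[n - 1 - i];
    for (size_t i = 0; i < n; i++)
        y[i] = buf[i];

    if (!used_stack)
        free(buf);
    return used_stack;
}
```

Small requests never touch the shared allocator, so they cannot contend on its lock. The limit has to stay modest because the buffer is reserved in every call frame, which is the stack overflow trade-off the commit message mentions.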
wernsaar
9e829ce98f
enabled cblas gemm3m functions
2014-09-20 17:20:02 +02:00
wernsaar
d49fd33885
disabled SYMM3M and HEMM3M functions because of segmentation violations
2014-09-20 15:27:40 +02:00
wernsaar
7aae4a62e7
enabled use of GEMM3M functions
2014-09-20 14:27:10 +02:00
wernsaar
3300f5ebff
optimized multithreading lower limits
2014-09-15 11:38:25 +02:00
wernsaar
fd2478c9e2
optimized interface/zgemv.c for multithreading
2014-09-12 19:18:23 +02:00
Zhang Xianyi
1cba8e7b11
Merge pull request #446 from grisuthedragon/cblas_matcopy
...
Add a CBLAS interface for the BLAS extension s/d/c/z*matcopy routines.
2014-09-10 16:31:31 +08:00
Martin Koehler
a057e5434d
add CBLAS interface for s/d/c/zimatcopy
2014-09-09 09:52:13 +02:00
Martin Köhler
7794766d3c
Add cblas_(s/d/c/z)omatcopy in order to have cblas interface for them.
2014-09-08 17:57:44 +02:00
wernsaar
f511807fc0
modified multithreading threshold
2014-09-08 12:27:32 +02:00
wernsaar
d1800397f5
optimized interface/gemv.c for multithreading
2014-09-02 17:36:07 +02:00
wernsaar
f4ff889491
updated interface/gemv.c for multithreading
2014-09-02 16:30:04 +02:00
wernsaar
51413925bd
adjust number of threads for small size in cgemv and zgemv
2014-07-15 16:27:02 +02:00
wernsaar
b985cea65d
adjust number of threads for sgemv and dgemv
2014-07-15 16:04:46 +02:00
wernsaar
d286daa2ba
adjusted number of threads for small size
2014-07-15 14:41:35 +02:00
wernsaar
cedc1f4b14
Ref #410 : disabled optimized potri functions ( single threading bug)
2014-07-10 13:42:32 +02:00
wernsaar
02a504c0b8
fixed my bug in ger.c
2014-07-02 10:39:33 +02:00
wernsaar
be94db096c
disabled *3M functions for x86_64 platforms
2014-07-01 16:18:05 +02:00
wernsaar
aee61456a4
disabled SMP for sbmv and zsbmv again
2014-06-29 21:18:38 +02:00
wernsaar
01a119abfc
enabled SMP for sbmv and zsbmv, but only for 64bit binaries
2014-06-29 20:35:56 +02:00
wernsaar
1fad2b759f
enabled smp for ger.c and zger.c, but only for 64bit binaries
2014-06-29 16:43:04 +02:00
Timothy Gu
6c2ead30f0
Remove all trailing whitespace except lapack-netlib
...
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
wernsaar
15d5dfa92c
fixed compiler warnings
2014-06-25 11:32:44 +02:00
wernsaar
86d8c8978b
Ref #391 : disabled SMP in ger.c and zger.c
2014-06-22 12:01:24 +02:00
wernsaar
a19d209005
Ref #103 : enhancement for small matrix dimensions
2014-06-18 15:04:11 +02:00
wernsaar
faeab93df0
Ref #51 : added blas extensions simatcopy, dimatcopy, cimatcopy, zimatcopy
2014-06-10 16:14:34 +02:00
wernsaar
cee257f384
Ref #51 : added blas extensions zomatcopy and comatcopy
2014-06-10 10:34:54 +02:00
wernsaar
7bfb3011e8
Ref #51 : added blas extension somatcopy
2014-06-09 20:21:13 +02:00
wernsaar
8c8f596238
Ref #51 : added blas extension domatcopy as not optimized reference
2014-06-09 17:11:07 +02:00
wernsaar
bff575d0b1
Ref #375 : added workaround for small sizes to scal.c and zscal.c
2014-06-08 13:49:19 +02:00
wernsaar
faf3ac0aad
Ref #285 : added axpby kernels
2014-06-08 11:54:24 +02:00
Zhang Xianyi
b31ec99372
Fixed #374 .
...
Merge branch 'TimothyGu-develop' into develop
2014-06-05 17:01:44 +08:00
wernsaar
25e899b60b
fixed function profile in zpotri.c
2014-05-25 09:15:22 +02:00
wernsaar
89da450800
enabled and tested optimized potri lapack functions
2014-05-23 12:14:30 +02:00
wernsaar
c26bbee489
enabled and tested optimized trtri lapack functions
2014-05-23 10:55:39 +02:00
Timothy Gu
ced13574a0
Random "walk (a)round" --> "work-around" typo fixes
...
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-05-22 18:11:52 -07:00
wernsaar
a748d3a75d
enabled optimized trti2 lapack functions again
2014-05-21 11:02:07 +02:00
wernsaar
a5ab231ad4
enabled optimized complex lauum lapack functions again
2014-05-21 10:35:28 +02:00
wernsaar
dbaeea7b59
enabled lauu2 and lauum lapack functions again
2014-05-21 09:49:18 +02:00
wernsaar
0d75f3b6a2
enabled and tested optimized gesv lapack functions
2014-05-19 14:44:53 +02:00
wernsaar
abad6f66d6
marked trti2.c and ztrti2.c as bad
2014-05-19 13:50:02 +02:00
wernsaar
2ff66e661d
enabled and tested optimized laswp lapack function
2014-05-19 13:35:32 +02:00
wernsaar
5e55034922
marked zlauu2.c and zlauum.c as bad
2014-05-19 12:53:22 +02:00
wernsaar
9a9e810239
marked trtri.c and ztrtri as bad
2014-05-19 12:42:52 +02:00
wernsaar
45be9ac111
moved trtri.c and ztrtri.c to the directory lapack
2014-05-19 12:29:29 +02:00
wernsaar
9f201558c9
marked lauu2.c and lauum.c as bad
2014-05-19 12:00:16 +02:00
wernsaar
d4237cb7f3
marked larf.c as obsolete
2014-05-19 11:23:17 +02:00
wernsaar
aaa9d7fbf8
marked potri functions as bad because of a lot of errors
2014-05-18 23:41:13 +02:00
wernsaar
ebc95e6f11
enabled and tested optimized potf2 lapack functions
2014-05-18 22:41:43 +02:00
wernsaar
61a2c50e8e
enabled and tested optimized getf2 lapack functions
2014-05-18 22:21:16 +02:00
wernsaar
4f98f8c9b3
enabled and tested optimized potrf lapack functions
2014-05-18 21:42:37 +02:00
wernsaar
536875d463
enabled and tested optimized getrs lapack functions
2014-05-18 21:13:56 +02:00
wernsaar
65f2fba4c3
enabled and tested optimized cgetrf lapack function
2014-05-18 20:32:27 +02:00
wernsaar
eea6f51df9
enabled and tested optimized sgetrf lapack function
2014-05-18 20:01:23 +02:00
wernsaar
6fc4646709
enabled and tested optimized zgetrf lapack function
2014-05-18 19:36:32 +02:00
wernsaar
ac029f81b3
enabled and tested optimized dgetrf function
2014-05-18 19:07:51 +02:00
wernsaar
c0cf875a82
added optimized lapack files from OpenBLAS
2014-05-18 14:09:22 +02:00
wernsaar
189ca1bcee
removed lapack objects from interface/Makefile
2014-05-11 12:09:34 +02:00
wernsaar
4c1caa7454
checked, that zhpr is OK
2014-05-11 11:21:23 +02:00
wernsaar
7bb19cf90e
checked, that zhpr2 is OK
2014-05-11 11:11:05 +02:00
wernsaar
2a94aaaf2e
checked, that zhpmv is OK
2014-05-11 10:46:48 +02:00
wernsaar
5e4b4f6712
checked, that zher is OK
2014-05-11 10:36:34 +02:00
wernsaar
47e8950e77
checked, that zher2 is OK
2014-05-11 10:26:05 +02:00
wernsaar
f45f2c8465
checked, that zhemv is OK
2014-05-11 10:15:06 +02:00
wernsaar
10780ae650
marked zhbmv as smp bug
2014-05-11 09:58:16 +02:00
wernsaar
9bae50f700
checked, that zscal and zswap are OK
2014-05-11 09:30:18 +02:00
wernsaar
0758c1a374
checked, that trtri is OK
2014-05-11 09:11:20 +02:00
wernsaar
564ff395f6
checked, that trsm is OK
2014-05-11 08:59:33 +02:00
wernsaar
7fb78a5f01
checked, that trmv is OK
2014-05-11 08:47:44 +02:00
wernsaar
8204ab4aa8
checked, that tpmv is OK
2014-05-11 08:35:34 +02:00
wernsaar
48d1325784
checked, that tbmv is OK
2014-05-11 08:22:00 +02:00
wernsaar
57bbc586ef
checked, that syrk is OK
2014-05-11 08:10:25 +02:00
wernsaar
bfef3c5dd1
checked, that syr is OK
2014-05-11 07:46:22 +02:00
wernsaar
d972f4a60a
checked, that syr2k is OK
2014-05-11 01:04:46 +02:00
wernsaar
eebce01cf2
checked, that syr2 is OK
2014-05-11 00:48:49 +02:00
wernsaar
e2c39a4a8e
checked, that symv is OK
2014-05-11 00:36:56 +02:00
wernsaar
1e8e6faa7e
checked, that symm is OK
2014-05-11 00:22:40 +02:00
wernsaar
c7eb901496
checked, that spr is OK
2014-05-11 00:07:07 +02:00
wernsaar
2ed03ea0a2
checked, that spr2 is OK
2014-05-10 23:55:43 +02:00
wernsaar
de00e2937a
marked as smp bug
2014-05-10 23:18:35 +02:00
wernsaar
e187b5e9d0
removed gesv.c from interface
2014-05-10 22:55:44 +02:00
wernsaar
0947fc1c89
checked, that ger is OK
2014-05-10 22:49:53 +02:00
wernsaar
4d61607c9e
checked, that gbmv is OK
2014-05-10 22:38:09 +02:00
wernsaar
781bfb6e66
checked, that gemv is OK
2014-05-10 22:24:05 +02:00
wernsaar
79a82ba7f1
checked that axpy is OK
2014-05-10 22:09:49 +02:00
wernsaar
d63bd7fa5e
checked that gemm.c is OK
2014-05-10 21:51:44 +02:00
wernsaar
e265c4ec86
added C files in interface
2014-05-10 21:27:47 +02:00
wernsaar
0732238213
removed all C files in interface
2014-05-10 21:25:17 +02:00
wernsaar
320c805905
fixed incorrect parameter 2 errors
2014-05-08 11:06:32 +02:00
wernsaar
025fc914cc
fixed 2 bugs as reported by Brendan Tracey
2014-05-02 11:34:26 +02:00
wernsaar
9db0fb8b02
bugfix for sdsdot
2014-02-28 14:59:36 +01:00
wernsaar
692b14cecd
rewrote rotmg.c instead of modifying very old code
2014-02-28 14:43:28 +01:00
Zhang Xianyi
3e0a7b931c
Refs #333 . Detect the wrong parameter for zherk/zher2k.
2014-01-21 01:27:51 +08:00
Zhang Xianyi
73770e60b8
Refs #309 . Fixed trtri_U single thread computational bug.
2013-11-07 01:08:39 +08:00
Lars Buitinck
3f7b0cd994
Merge pull request #290 from larsmans/missing-threshold
...
check if GEMM_MULTITHREAD_THRESHOLD defined in gemm.c
Set a fallback value.
2013-08-29 00:33:55 +08:00
Zhang Xianyi
c92ae012a6
Refs #279 . Provide ONLY_CBLAS flag. If you only need CBLAS without
...
a fortran compiler, please try make ONLY_CBLAS=1.
This mode only compiles CBLAS, without the BLAS fortran interface and LAPACK.
2013-08-21 00:03:25 +08:00
Zhang Xianyi
a07cc39571
Refs #266 . Fixed the compiling bug with Open64 5.0.
2013-07-31 14:41:39 +08:00
Zhang Xianyi
b5c2ac4fd6
Fixed #264 the memory leak bug in dtrtri_U.
2013-07-29 23:21:10 +08:00
Elliot Saba
6f5b395009
Fix xianyi/OpenBLAS#256
2013-07-22 17:02:06 -07:00
Zhang Xianyi
fd0c388681
Refs #191 . A workaround for dtrtri_U single thread bug.
...
This function caused the failure of ERKALE serial test.
I replaced it with LAPACK source code.
2013-07-14 22:16:30 +08:00
Jameson Nash
d0e731e8b8
provide support for passing CFLAGS, FFLAGS, PFLAGS, FPFLAGS to make on the command line
2012-08-21 00:31:12 -04:00
Xianyi Zhang
83ecfbb9b3
Merge branch 'loongson3a' into release-0.1.0
2012-03-23 01:26:27 +08:00
Xianyi Zhang
31c836ac25
Ref #79 Added GEMM_MULTITHREAD_THRESHOLD flag to use single thread in gemm function with small matrices.
2012-03-23 01:17:41 +08:00
Xianyi Zhang
722dd08703
ref #80 . On P4 CPU with 32-bit Windows XP, Octave crashed with OpenBLAS. Workaround: Use netlib reference gemv instead of own functions.
...
For example, make USE_NETLIB_GEMV=1
2012-03-16 20:29:39 +08:00
traz
a4292976e9
Adding detection of complex situations in symm.c, otherwise the buffer address of sb will overlap the end of sa.
2011-12-05 14:54:25 +00:00
Xianyi Zhang
aeed8d6225
Fixed #27 . Temporarily work around axpy's low performance issue with small input size & multithreads.
2011-06-19 11:55:29 +08:00
Xianyi Zhang
1496383224
Print the wall time (cycles) with enabling FUNCTION_PROFILE.
2011-06-09 10:40:15 +08:00
Xianyi Zhang
fcb5ce011b
Fixed #28 . Convert the result to double precision in MIPS64 dsdot_k kernel.
2011-05-17 21:24:00 +00:00
Xianyi Zhang
fa8e4fd879
Fixed #26 the wrong result of rotmg. Used fabs() instead of abs().
2011-05-11 01:12:32 +08:00
Xianyi Zhang
8f1090d32a
Support NO_LAPACK=1 to build the lib without LAPACK functions.
2011-03-04 11:51:32 +08:00
Xianyi Zhang
0cfd29a819
Fixed #7 . 1)Disable the multi-thread and 2) Modified kernel codes to avoid unloop in axpy function when incx==0 or incy==0.
2011-02-21 00:24:21 +08:00
Xianyi Zhang
78da0e0a0c
Fixed #6 . Disable multi-thread swap when incx==0 or incy==0.
2011-02-20 17:14:38 +08:00
Xianyi Zhang
342bbc3871
Import GotoBLAS2 1.13 BSD version codes.
2011-01-24 14:54:24 +00:00