traits
|
b1fe26c45a
|
refs #55. Changed DTB_ENTRIES to DTB_DEFAULT_ENTRIES in x86 gemv_n kernel codes.
|
2011-09-06 14:14:07 +08:00 |
traz
|
0389b631fa
|
Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a
|
2011-09-05 16:31:40 +00:00 |
traz
|
64fa709d1f
|
Fixed #46. Initialize variables in cblat3.f and zblat3.f.
|
2011-09-05 16:30:55 +00:00 |
Xianyi Zhang
|
4727fe8abf
|
Refs #47. On Loongson 3A, set DGEMM_R parameter depending on different number of threads. It would improve double precision BLAS3 on multi-threads.
|
2011-09-05 15:13:52 +00:00 |
traits
|
90481ce742
|
Updated the doc about 0.1alpha2.3.
|
2011-09-05 17:40:55 +08:00 |
traits
|
9fc6764fa7
|
refs #55. Added DTB_ENTRIES to the dynamic arch setting parameters. It can now read DTB_ENTRIES at runtime.
|
2011-09-05 17:37:07 +08:00 |
traz
|
74d4cdb81a
|
Fix an illegal instruction for strmm_RTLU.
|
2011-09-02 19:41:06 +00:00 |
traz
|
7906146836
|
Fix an error for strmm_LLTN.
|
2011-09-02 16:57:33 +00:00 |
traz
|
3274ff47b8
|
Fix an error for strmm_LLTN.
|
2011-09-02 16:50:50 +00:00 |
traz
|
a059c553a1
|
Fix a compute error for strmm.
|
2011-09-02 16:00:04 +00:00 |
traz
|
23e182ca7c
|
Fix stack-pointer bug for strmm.
|
2011-09-02 15:28:01 +00:00 |
traz
|
a15bc95824
|
Add strmm part.
|
2011-09-02 09:15:09 +00:00 |
traz
|
74a3f63489
|
Tuning mb, kb, nb size to get the best performance.
|
2011-09-01 17:15:28 +00:00 |
traz
|
09f49fa891
|
Using PS instructions to improve the performance of sgemm and it is 4.2Gflops now.
|
2011-08-31 21:24:03 +00:00 |
Xianyi Zhang
|
b9d89f8aaa
|
Fixed the bug about installation. f77blas.h works OK now.
|
2011-08-31 18:21:37 +08:00 |
traz
|
cb0214787b
|
Modify compile options.
|
2011-08-30 20:57:00 +00:00 |
traz
|
2e8cdd1542
|
Using ps instruction.
|
2011-08-30 20:54:19 +00:00 |
traz
|
b29d327d14
|
Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a
|
2011-07-18 17:06:53 +00:00 |
traz
|
c8360e3ae5
|
Complete all the plura single precision functions of level3 on Loongson3a, the performance is 2.3GFlops.
|
2011-07-18 17:03:38 +00:00 |
traits
|
19d2ab4853
|
Merge branch 'hotfix-0.1alpha2.2' into develop
|
2011-07-14 01:09:21 +08:00 |
traits
|
12d77deeee
|
Merge branch 'hotfix-0.1alpha2.2'
|
2011-07-14 01:03:09 +08:00 |
traits
|
043927c7db
|
Update the documents for 0.1alpha2.2 version.
|
2011-07-14 01:02:19 +08:00 |
traits
|
30947ea2d5
|
Fixed #44 a makefile bug when DYNAMIC_ARCH=1 and INTERFACE64=1.
|
2011-07-14 00:54:23 +08:00 |
Xianyi Zhang
|
33313b0221
|
Merge branch 'develop' into loongson3a
|
2011-07-07 14:25:51 +08:00 |
traits
|
a5300420e2
|
Merge branch 'hotfix-0.1alpha2.1' into develop
|
2011-06-28 15:46:55 +08:00 |
traits
|
9b46bf1eb4
|
Merge branch 'hotfix-0.1alpha2.1'
|
2011-06-28 15:43:08 +08:00 |
traits
|
c06b7be32f
|
Refs #42. Output the error message when detecting fortran compiler failed.
|
2011-06-28 15:42:09 +08:00 |
traz
|
68532fa9ec
|
Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a
|
2011-06-24 09:28:12 +00:00 |
traz
|
708d2b6255
|
Fix compute error in ztrmm.
|
2011-06-24 09:27:41 +00:00 |
traz
|
e72113f06a
|
Add ztrmm and ztrsm part on loongson3a. The average performance is 2.2G.
|
2011-06-23 21:11:00 +00:00 |
traz
|
14f81da375
|
Change prefetch length of A and B, the performance is 2.1G now.
|
2011-06-23 10:46:58 +00:00 |
Xianyi Zhang
|
fc21f7ad28
|
Merge branch 'release-v0.1alpha2' into loongson3a
|
2011-06-23 16:08:23 +08:00 |
Xianyi Zhang
|
ca8bf5abb0
|
Merge branch 'release-v0.1alpha2' into develop
|
2011-06-23 16:07:34 +08:00 |
traits
|
4a73f5c5ea
|
Merge branch 'release-v0.1alpha2'
|
2011-06-23 15:18:40 +08:00 |
traits
|
6a0762949d
|
Fixed #38. Released v0.1 alpha2.
|
2011-06-23 15:16:24 +08:00 |
traits
|
859b71645a
|
Refs #37. Updated README about the compatibility issue with EKOPath compiler.
|
2011-06-23 15:09:34 +08:00 |
Xianyi Zhang
|
078bfd0b4f
|
Refs #39. Moved the shared lib (dll) to top directory in MingW64 compiler environment.
|
2011-06-22 13:19:39 +08:00 |
traz
|
1c96d345e2
|
Improve zgemm performance from 1G to 1.8G, change block size in param.h.
|
2011-06-21 22:16:23 +00:00 |
Xianyi Zhang
|
82f5274828
|
Refs #39. It's unnecessary to include sys/mman.h file in blas_server_omp.c.
|
2011-06-22 01:52:20 +08:00 |
Xianyi Zhang
|
e568df0dae
|
Refs #38. Prepare the docs with v0.1alpha2.
|
2011-06-21 18:06:13 +08:00 |
Xianyi Zhang
|
c4efde7713
|
Merge branch 'loongson3a' into release-v0.1alpha2
|
2011-06-21 17:50:00 +08:00 |
Xianyi Zhang
|
7a1e6202e1
|
Merge branch 'add_install_target' into develop
|
2011-06-21 17:40:16 +08:00 |
Xianyi Zhang
|
32353a9d30
|
Refs #20. Fixed the installation bug with DYNAMIC_ARCH=1.
|
2011-06-21 17:39:08 +08:00 |
Xianyi Zhang
|
2e6e9272fe
|
Merge branch 'add_install_target' into develop
Conflicts:
Changelog.txt
|
2011-06-20 18:40:05 +08:00 |
Xianyi Zhang
|
d978436c4b
|
Refs #20. Updated the docs.
|
2011-06-20 18:36:29 +08:00 |
Xianyi Zhang
|
fab36f1adb
|
Fixed #20. Added install target in makefile. You can use "make install PREFIX=your_installation_directory".
|
2011-06-20 18:35:35 +08:00 |
Xianyi Zhang
|
7945919f22
|
Updated gitignore file.
|
2011-06-19 12:07:31 +08:00 |
Xianyi Zhang
|
c642b61d4d
|
Merge branch 'master' of github.com:xianyi/OpenBLAS into develop
|
2011-06-19 11:59:38 +08:00 |
Xianyi Zhang
|
aeed8d6225
|
Fixed #27. Temporarily work around axpy's low performance issue with small input size & multithreads.
|
2011-06-19 11:55:29 +08:00 |
Xianyi Zhang
|
1a4181afd0
|
Merge pull request #36 from pipping/master
Fixed the bug about USE_OPENMP=0 enabling OpenMP
|
2011-06-11 05:59:00 -07:00 |