Commit Graph

1318 Commits

Author SHA1 Message Date
traits b1fe26c45a refs #55. Changed DTB_ENTRIES to DTB_DEFAULT_ENTRIES in x86 gemv_n kernel codes. 2011-09-06 14:14:07 +08:00
traz 0389b631fa Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a 2011-09-05 16:31:40 +00:00
traz 64fa709d1f Fixed #46. Initialize variables in cblat3.f and zblat3.f. 2011-09-05 16:30:55 +00:00
Xianyi Zhang 4727fe8abf Refs #47. On Loongson 3A, set DGEMM_R parameter depending on different number of threads. It would improve double precision BLAS3 on multi-threads. 2011-09-05 15:13:52 +00:00
traits 90481ce742 Updated the doc about 0.1alpha2.3. 2011-09-05 17:40:55 +08:00
traits 9fc6764fa7 refs #55. Added DTB_ENTRIES into dynamic arch setting parameters. Now, it can read DTB_ENTRIES on runtime. 2011-09-05 17:37:07 +08:00
traz 74d4cdb81a Fix an illegal instruction for strmm_RTLU. 2011-09-02 19:41:06 +00:00
traz 7906146836 Fix an error for strmm_LLTN. 2011-09-02 16:57:33 +00:00
traz 3274ff47b8 Fix an error for strmm_LLTN. 2011-09-02 16:50:50 +00:00
traz a059c553a1 Fix a compute error for strmm. 2011-09-02 16:00:04 +00:00
traz 23e182ca7c Fix stack-pointer bug for strmm. 2011-09-02 15:28:01 +00:00
traz a15bc95824 Add strmm part. 2011-09-02 09:15:09 +00:00
traz 74a3f63489 Tuning mb, kb, nb size to get the best performance. 2011-09-01 17:15:28 +00:00
traz 09f49fa891 Using PS instructions to improve the performance of sgemm and it is 4.2Gflops now. 2011-08-31 21:24:03 +00:00
Xianyi Zhang b9d89f8aaa Fixed the bug about installation. f77blas.h works OK now. 2011-08-31 18:21:37 +08:00
traz cb0214787b Modify compile options. 2011-08-30 20:57:00 +00:00
traz 2e8cdd1542 Using ps instruction. 2011-08-30 20:54:19 +00:00
traz b29d327d14 Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a 2011-07-18 17:06:53 +00:00
traz c8360e3ae5 Complete all the plura single precision functions of level3 on Loongson3a, the performance is 2.3GFlops. 2011-07-18 17:03:38 +00:00
traits 19d2ab4853 Merge branch 'hotfix-0.1alpha2.2' into develop 2011-07-14 01:09:21 +08:00
traits 12d77deeee Merge branch 'hotfix-0.1alpha2.2' 2011-07-14 01:03:09 +08:00
traits 043927c7db Update the documents for 0.1alpha2.2 version. 2011-07-14 01:02:19 +08:00
traits 30947ea2d5 Fixed #44 a makefile bug when DYNAMIC_ARCH=1 and INTERFACE64=1. 2011-07-14 00:54:23 +08:00
Xianyi Zhang 33313b0221 Merge branch 'develop' into loongson3a 2011-07-07 14:25:51 +08:00
traits a5300420e2 Merge branch 'hotfix-0.1alpha2.1' into develop 2011-06-28 15:46:55 +08:00
traits 9b46bf1eb4 Merge branch 'hotfix-0.1alpha2.1' 2011-06-28 15:43:08 +08:00
traits c06b7be32f Refs #42. Output the error message when detecting fortran compiler failed. 2011-06-28 15:42:09 +08:00
traz 68532fa9ec Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a 2011-06-24 09:28:12 +00:00
traz 708d2b6255 Fix compute error in ztrmm. 2011-06-24 09:27:41 +00:00
traz e72113f06a Add ztrmm and ztrsm part on loongson3a. The average performance is 2.2G. 2011-06-23 21:11:00 +00:00
traz 14f81da375 Change prefetch length of A and B, the performance is 2.1G now. 2011-06-23 10:46:58 +00:00
Xianyi Zhang fc21f7ad28 Merge branch 'release-v0.1alpha2' into loongson3a 2011-06-23 16:08:23 +08:00
Xianyi Zhang ca8bf5abb0 Merge branch 'release-v0.1alpha2' into develop 2011-06-23 16:07:34 +08:00
traits 4a73f5c5ea Merge branch 'release-v0.1alpha2' 2011-06-23 15:18:40 +08:00
traits 6a0762949d Fixed #38. Released v0.1 alpha2. 2011-06-23 15:16:24 +08:00
traits 859b71645a Refs #37. Updated README about the compatibility issue with EKOPath compiler. 2011-06-23 15:09:34 +08:00
Xianyi Zhang 078bfd0b4f Refs #39. Moved the shared lib (dll) to top directory in MingW64 compiler environment. 2011-06-22 13:19:39 +08:00
traz 1c96d345e2 Improve zgemm performance from 1G to 1.8G, change block size in param.h. 2011-06-21 22:16:23 +00:00
Xianyi Zhang 82f5274828 Refs #39. It's unnecessary to include sys/mman.h file in blas_server_omp.c. 2011-06-22 01:52:20 +08:00
Xianyi Zhang e568df0dae Refs #38. Prepare the docs with v0.1alpha2. 2011-06-21 18:06:13 +08:00
Xianyi Zhang c4efde7713 Merge branch 'loongson3a' into release-v0.1alpha2 2011-06-21 17:50:00 +08:00
Xianyi Zhang 7a1e6202e1 Merge branch 'add_install_target' into develop 2011-06-21 17:40:16 +08:00
Xianyi Zhang 32353a9d30 Refs #20. Fixed the installation bug with DYNAMIC_ARCH=1. 2011-06-21 17:39:08 +08:00
Xianyi Zhang 2e6e9272fe Merge branch 'add_install_target' into develop (Conflicts: Changelog.txt) 2011-06-20 18:40:05 +08:00
Xianyi Zhang d978436c4b Refs #20. Updated the docs. 2011-06-20 18:36:29 +08:00
Xianyi Zhang fab36f1adb Fixed #20. Added install target in makefile. You can use "make install PREFIX=your_installation_directory". 2011-06-20 18:35:35 +08:00
Xianyi Zhang 7945919f22 Updated gitignore file. 2011-06-19 12:07:31 +08:00
Xianyi Zhang c642b61d4d Merge branch 'master' of github.com:xianyi/OpenBLAS into develop 2011-06-19 11:59:38 +08:00
Xianyi Zhang aeed8d6225 Fixed #27. Temporarily work around axpy's low performance issue with small input size & multithreads. 2011-06-19 11:55:29 +08:00
Xianyi Zhang 1a4181afd0 Merge pull request #36 from pipping/master: Fixed the bug about USE_OPENMP=0 enabling OpenMP 2011-06-11 05:59:00 -07:00