Commit Graph

  • b281f3dee4 Merge remote branch 'origin/loongson3a' into loongson3b traz 2011-12-06 13:49:39 +0000
  • a4292976e9 Adding detection of complex situations in symm.c, otherwise the buffer address of sb will overlap the end of sa. traz 2011-12-05 14:54:25 +0000
  • c2dad58ad1 Adding n32 multiple threads condition. Wang Qian 2011-12-01 16:33:11 +0000
  • d5a6d789e6 Fixed a typo in Makefile. Xianyi Zhang 2011-11-28 15:31:46 +0800
  • 875dde437d Merge branch 'lapack_3.4.0' into develop Xianyi Zhang 2011-11-28 15:28:54 +0800
  • 5be22ca80d Refs #72. Upgraded LAPACK to 3.4.0 version. Xianyi Zhang 2011-11-28 15:28:22 +0800
  • 66904fc4e8 BLAS3 used standard MIPS instructions without extensions on Loongson 3B. Wang Qian 2011-11-25 11:20:25 +0000
  • 8163ab7e55 Change the block size on Loongson 3B. Wang Qian 2011-11-23 18:40:35 +0000
  • ef6f7f32ae Fixed mbind bug on Loongson 3B. Check the return value of my_mbind function. Xianyi Zhang 2011-11-23 17:17:41 +0000
  • 285e69e2d1 Disable using simple thread level3 to fix a bug on Loongson 3B. Xianyi Zhang 2011-11-17 16:46:26 +0000
  • d1baf14a64 Enable thread affinity on Loongson 3B. Fixed the bug of reading cycle counter. Xianyi Zhang 2011-11-11 17:49:41 +0000
  • 0884f6b78d Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3b Xianyi Zhang 2011-11-11 14:26:49 +0000
  • 2d78fb05c8 Add conjugate condition to gemv. traz 2011-11-10 15:38:48 +0000
  • b95ad4cfaf Support detecting ICT Loongson-3B CPU. Xianyi Zhang 2011-11-09 19:28:22 +0000
  • 3bbe3ddb31 Merge branch 'develop' of github.com:xianyi/OpenBLAS into loongson3b Xianyi Zhang 2011-11-09 19:08:29 +0000
  • a32e56500a Fix the compute error of gemv when incx and incy are negative numbers. traz 2011-11-04 19:32:21 +0000
  • c1e618ea2d Add complete gemv function on Loongson3a platform. traz 2011-11-03 13:53:48 +0000
  • 19f5b5c132 Fixed #66 the bug in zgemv kernel with transpose matrix on 64-bit MingW (Windows). traits 2011-10-18 18:44:23 +0800
  • c852ce3981 Ref #65. Fixed 64-bit Windows calling convention bug in cdot and zdot. traits 2011-10-18 10:23:17 +0800
  • ba31b19c00 Ref #62. In OpenMP implementation, check the return value of omp_get_max_threads(). This ensures the return value is the same as blas_cpu_numbers, an internal global variable that stores the number of threads in OpenBLAS. Xianyi Zhang 2011-10-16 22:56:19 +0800
  • 66a3c6df4e Ref #63. Fixed generating DLL bug on ming-w64. traits 2011-10-09 17:25:44 +0800
  • 57658a8c14 ref #62. Added the user friendly message with USE_OPENMP=1. The users should use OMP_NUM_THREADS. Xianyi Zhang 2011-10-09 15:14:48 +0800
  • 9fe3049de6 Adding conditional compilation(#if defined(LOONGSON3A)) to avoid affecting the performance of other platforms. traz 2011-09-26 15:21:45 +0000
  • 831858b883 Modify aligned address of sa and sb to improve the performance of multi-threads. traz 2011-09-23 20:59:48 +0000
  • 8de2ba67dd Merge branch 'hotfix-0.1alpha2.4' into develop Xianyi 2011-09-18 17:00:29 +0800
  • fe7a932ab8 Merge branch 'hotfix-0.1alpha2.4' v0.1alpha2.4 Xianyi 2011-09-18 16:57:28 +0800
  • 1d31c79dc9 Prepared the document for 0.1 alpha 2.4 version. Xianyi 2011-09-18 05:46:08 +0800
  • d40e5621e9 Change the installation folder into /include and /lib. Xianyi 2011-09-18 05:07:00 +0800
  • bcc7956216 Refs #57. Continue to fix absolute path issue about shared library on Mac OSX. Xianyi 2011-09-18 01:35:12 +0800
  • 821cbb2995 Updated the document for 0.1 alpha 2.4. Xianyi 2011-09-17 07:55:59 +0800
  • 74fa790354 Merge branch 'develop' into hotfix-0.1alpha2.4 Xianyi 2011-09-17 07:32:10 +0800
  • 756477bfe3 Output the installation tip after building complete. Xianyi 2011-09-17 07:21:11 +0800
  • 864c68ffc5 Bump the version number. Xianyi 2011-09-17 03:05:26 +0800
  • 68cae521df Refs #57. The bug about absolute path of shared library on Mac OSX. Xianyi 2011-09-17 02:58:01 +0800
  • d0152ec8ca Fixed #61 a building bug about setting TARGET and DYNAMIC_ARCH at the same time. Xianyi 2011-09-17 02:27:56 +0800
  • e08cfaf9ca Complete all the complex single-precision functions of level3, but the performance needs further improvement. traz 2011-09-16 17:50:40 +0000
  • ee4bb8bd25 Add ctrmm part in cgemm_kernel_loongson3a_4x2_ps.S. traz 2011-09-16 16:08:39 +0000
  • 7fa3d23dd9 Complete cgemm function, but no optimization. traz 2011-09-15 16:08:23 +0000
  • 9679dd077e Fix some compute errors. traz 2011-09-14 20:00:35 +0000
  • 048742f38f Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a traz 2011-09-14 16:32:36 +0000
  • 7b410b7f0e Fixed #58 zdot SEGFAULT bug with GCC-4.6. Thanks to Mr. John for this patch. Zhang Xiianyi 2011-09-14 23:52:51 +0800
  • d238a768ab Use ps instructions in cgemm. traz 2011-09-14 15:32:25 +0000
  • 260db9fb9e Merge branch 'hotfix-0.1alpha2.3' into develop traits 2011-09-09 00:57:47 +0800
  • e27b761d7c Merge branch 'hotfix-0.1alpha2.3' v0.1alpha2.3 traits 2011-09-09 00:55:04 +0800
  • 16fc083322 Refs #47. Fixed the setting parameter bug on Loongson 3A single thread version. Xianyi Zhang 2011-09-08 16:39:34 +0000
  • 3c856c0c1a Check the return value of pthread_create. Update the docs with known issue on Loongson 3A. Xianyi Zhang 2011-09-06 18:27:33 +0000
  • dc9c69db93 Merge branch 'develop' into loongson3a Xianyi Zhang 2011-09-06 18:19:50 +0000
  • b1fe26c45a refs #55. Changed DTB_ENTRIES to DTB_DEFAULT_ENTRIES in x86 gemv_n kernel codes. traits 2011-09-06 14:14:07 +0800
  • 0389b631fa Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a traz 2011-09-05 16:31:40 +0000
  • 64fa709d1f Fixed #46. Initialize variables in cblat3.f and zblat3.f. traz 2011-09-05 16:30:55 +0000
  • 4727fe8abf Refs #47. On Loongson 3A, set DGEMM_R parameter depending on different number of threads. It would improve double precision BLAS3 on multi-threads. Xianyi Zhang 2011-09-05 15:13:05 +0000
  • 90481ce742 Updated the doc about 0.1alpha2.3. traits 2011-09-05 17:40:55 +0800
  • 9fc6764fa7 refs #55. Added DTB_ENTRIES into dynamic arch setting parameters. Now, it can read DTB_ENTRIES on runtime. traits 2011-09-05 17:37:07 +0800
  • 74d4cdb81a Fix an illegal instruction for strmm_RTLU. traz 2011-09-02 19:41:06 +0000
  • 7906146836 Fix an error for strmm_LLTN. traz 2011-09-02 16:57:33 +0000
  • 3274ff47b8 Fix an error for strmm_LLTN. traz 2011-09-02 16:50:50 +0000
  • a059c553a1 Fix a compute error for strmm. traz 2011-09-02 16:00:04 +0000
  • 23e182ca7c Fix stack-pointer bug for strmm. traz 2011-09-02 15:28:01 +0000
  • a15bc95824 Add strmm part. traz 2011-09-02 09:15:09 +0000
  • 74a3f63489 Tuning mb, kb, nb size to get the best performance. traz 2011-09-01 17:15:28 +0000
  • 09f49fa891 Using PS instructions to improve the performance of sgemm and it is 4.2Gflops now. traz 2011-08-31 21:24:03 +0000
  • b9d89f8aaa Fixed the bug about installation. f77blas.h works OK now. Xianyi Zhang 2011-08-31 18:21:37 +0800
  • cb0214787b Modify compile options. traz 2011-08-30 20:57:00 +0000
  • 2e8cdd1542 Using ps instruction. traz 2011-08-30 20:54:19 +0000
  • b29d327d14 Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a traz 2011-07-18 17:06:53 +0000
  • c8360e3ae5 Complete all the plura single precision functions of level3 on Loongson3a, the performance is 2.3GFlops. traz 2011-07-18 17:03:38 +0000
  • 19d2ab4853 Merge branch 'hotfix-0.1alpha2.2' into develop traits 2011-07-14 01:09:21 +0800
  • 12d77deeee Merge branch 'hotfix-0.1alpha2.2' v0.1alpha2.2 traits 2011-07-14 01:03:09 +0800
  • 043927c7db Update the documents for 0.1alpha2.2 version. traits 2011-07-14 01:02:19 +0800
  • 30947ea2d5 Fixed #44 a makefile bug when DYNAMIC_ARCH=1 and INTERFACE64=1. traits 2011-07-14 00:54:23 +0800
  • 33313b0221 Merge branch 'develop' into loongson3a Xianyi Zhang 2011-07-07 14:25:51 +0800
  • a5300420e2 Merge branch 'hotfix-0.1alpha2.1' into develop traits 2011-06-28 15:46:55 +0800
  • 9b46bf1eb4 Merge branch 'hotfix-0.1alpha2.1' v0.1alpha2.1 traits 2011-06-28 15:43:08 +0800
  • c06b7be32f Refs #42. Output the error message when detecting fortran compiler failed. traits 2011-06-28 15:42:09 +0800
  • 68532fa9ec Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a traz 2011-06-24 09:28:12 +0000
  • 708d2b6255 Fix compute error in ztrmm. traz 2011-06-24 09:27:41 +0000
  • e72113f06a Add ztrmm and ztrsm part on loongson3a. The average performance is 2.2G. traz 2011-06-23 21:11:00 +0000
  • fc21f7ad28 Merge branch 'release-v0.1alpha2' into loongson3a Xianyi Zhang 2011-06-23 16:08:23 +0800
  • 14f81da375 Change prefetch length of A and B, the performance is 2.1G now. traz 2011-06-23 10:46:58 +0000
  • ca8bf5abb0 Merge branch 'release-v0.1alpha2' into develop Xianyi Zhang 2011-06-23 16:07:34 +0800
  • 4a73f5c5ea Merge branch 'release-v0.1alpha2' v0.1alpha2 traits 2011-06-23 15:18:40 +0800
  • 6a0762949d Fixed #38. Released v0.1 alpha2. traits 2011-06-23 15:16:24 +0800
  • 859b71645a Refs #37. Updated README about the compatibility issue with EKOPath compiler. traits 2011-06-23 15:09:34 +0800
  • 078bfd0b4f Refs #39. Moved the shared lib (dll) to top directory in MingW64 compiler environment. Xianyi Zhang 2011-06-22 13:19:39 +0800
  • 1c96d345e2 Improve zgemm performance from 1G to 1.8G, change block size in param.h. traz 2011-06-21 22:16:23 +0000
  • 82f5274828 Refs #39. It's unnecessary to include sys/mman.h file in blas_server_omp.c. Xianyi Zhang 2011-06-22 01:52:20 +0800
  • e568df0dae Refs #38. Prepare the docs with v0.1alpha2. Xianyi Zhang 2011-06-21 18:06:13 +0800
  • c4efde7713 Merge branch 'loongson3a' into release-v0.1alpha2 Xianyi Zhang 2011-06-21 17:50:00 +0800
  • 7a1e6202e1 Merge branch 'add_install_target' into develop Xianyi Zhang 2011-06-21 17:40:16 +0800
  • 32353a9d30 Refs #20. Fixed the installation bug with DYNAMIC_ARCH=1. Xianyi Zhang 2011-06-21 17:39:08 +0800
  • 2e6e9272fe Merge branch 'add_install_target' into develop Xianyi Zhang 2011-06-20 18:40:05 +0800
  • d978436c4b Refs #20. Updated the docs. Xianyi Zhang 2011-06-20 18:36:29 +0800
  • fab36f1adb Fixed #20. Added install target in makefile. You can use "make install PREFIX=your_installation_directory". Xianyi Zhang 2011-06-20 18:35:35 +0800
  • 7945919f22 Updated gitignore file. Xianyi Zhang 2011-06-19 12:07:31 +0800
  • c642b61d4d Merge branch 'master' of github.com:xianyi/OpenBLAS into develop Xianyi Zhang 2011-06-19 11:59:38 +0800
  • aeed8d6225 Fixed #27. Temporarily work around axpy's low performance issue with small input size & multithreads. Xianyi Zhang 2011-06-19 11:55:29 +0800
  • 1a4181afd0 Merge pull request #36 from pipping/master Xianyi Zhang 2011-06-11 05:59:00 -0700
  • a36468f5cb Merge 49742cb2d3 into 8cc628a953 GitHub Merge Button 2011-06-11 05:56:41 -0700
  • 49742cb2d3 Make USE_OPENMP=0 disable openmp Elias Pipping 2011-06-11 14:36:16 +0200
  • b3d1887745 Fixed #35 a build bug with NO_LAPACK=1 DYNAMIC_ARCH=1 FC=gfortran. I forgot to test it with gfortran in the last bug-fix commit. Xianyi Zhang 2011-06-09 22:59:49 +0800