048742f38f 
								
							 
						 
						
							
							
								
								Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a  
							
							
							
						 
						
							2011-09-14 16:32:36 +00:00  
				
					
						
							
							
								 
						
							
								7b410b7f0e 
								
							 
						 
						
							
							
								
								Fixed #58: zdot SEGFAULT bug with GCC-4.6. Thanks to Mr. John for this patch.  
							
							
							
							
							In the i386 calling convention, the caller passes the address of zdot's return value as a hidden first parameter.
Thus, the callee should pop this address off the stack before returning.
I had already fixed the same bug in x86/zdot_sse2.S (issue #32). However, that fix was not a good implementation, since it used 3 instructions. Mr. John suggested using "ret $0x4" to skip the hidden first address (4 bytes). 
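
A minimal C sketch of the kind of function this convention affects. The names `zcomplex` and `zdot_sketch` are hypothetical and are not OpenBLAS symbols; the real kernels are hand-written assembly. The sketch only shows a function that returns a complex value as a struct, which is the case where the i386 caller supplies a hidden result address that the callee must remove from the stack on return.

```c
#include <stddef.h>

/* Hypothetical stand-in for the double-complex return type of zdot. */
typedef struct {
    double re;
    double im;
} zcomplex;

/*
 * Unconjugated dot product of two complex vectors.
 *
 * Because the result is returned as a struct, an i386 caller passes the
 * address of the result slot as a hidden first stack argument. A compiler
 * emits the matching "ret $4" for C code like this automatically; a
 * hand-written assembly kernel has to do it itself, which is what the
 * commit above fixes.
 */
zcomplex zdot_sketch(size_t n, const zcomplex *x, const zcomplex *y)
{
    zcomplex acc = {0.0, 0.0};
    for (size_t i = 0; i < n; i++) {
        /* (a + bi)(c + di) = (ac - bd) + (ad + bc)i */
        acc.re += x[i].re * y[i].re - x[i].im * y[i].im;
        acc.im += x[i].re * y[i].im + x[i].im * y[i].re;
    }
    return acc;
}
```

If an assembly callee returns with a plain `ret` instead, the caller's stack pointer ends up 4 bytes off after the call, which is consistent with the crash reported in #58.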
							
						 
						
							2011-09-14 23:52:51 +08:00  
				
					
						
							
							
								 
						
							
								d238a768ab 
								
							 
						 
						
							
							
								
								Use ps instructions in cgemm.  
							
							
							
						 
						
							2011-09-14 15:32:25 +00:00  
				
					
						
							
							
								 
						
							
								260db9fb9e 
								
							 
						 
						
							
							
								
								Merge branch 'hotfix-0.1alpha2.3' into develop  
							
							
							
						 
						
							2011-09-09 00:57:47 +08:00  
				
					
						
							
							
								 
						
							
								e27b761d7c 
								
							 
						 
						
							
							
								
								Merge branch 'hotfix-0.1alpha2.3'  
							
							
							
						 
						
							2011-09-09 00:55:04 +08:00  
				
					
						
							
							
								 
						
							
								16fc083322 
								
							 
						 
						
							
							
								
								Refs #47. Fixed the parameter-setting bug in the Loongson 3A single-thread version.  
							
							
							
						 
						
							2011-09-08 16:39:34 +00:00  
				
					
						
							
							
								 
						
							
								3c856c0c1a 
								
							 
						 
						
							
							
								
								Check the return value of pthread_create. Update the docs with known issue on Loongson 3A.  
							
							
							
						 
						
							2011-09-06 18:27:33 +00:00  
				
					
						
							
							
								 
						
							
								dc9c69db93 
								
							 
						 
						
							
							
								
								Merge branch 'develop' into loongson3a  
							
							
							
						 
						
							2011-09-06 18:19:50 +00:00  
				
					
						
							
							
								 
						
							
								b1fe26c45a 
								
							 
						 
						
							
							
								
								Refs #55. Changed DTB_ENTRIES to DTB_DEFAULT_ENTRIES in the x86 gemv_n kernel code.  
							
							
							
						 
						
							2011-09-06 14:14:07 +08:00  
				
					
						
							
							
								 
						
							
								0389b631fa 
								
							 
						 
						
							
							
								
								Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a  
							
							
							
						 
						
							2011-09-05 16:31:40 +00:00  
				
					
						
							
							
								 
						
							
								64fa709d1f 
								
							 
						 
						
							
							
								
								Fixed #46. Initialize variables in cblat3.f and zblat3.f.  
							
							
							
						 
						
							2011-09-05 16:30:55 +00:00  
				
					
						
							
							
								 
						
							
								4727fe8abf 
								
							 
						 
						
							
							
								
								Refs #47. On Loongson 3A, set the DGEMM_R parameter according to the number of threads. This should improve double-precision BLAS3 performance with multiple threads.  
							
							
							
						 
						
							2011-09-05 15:13:52 +00:00  
				
					
						
							
							
								 
						
							
								90481ce742 
								
							 
						 
						
							
							
								
								Updated the doc about 0.1alpha2.3.  
							
							
							
						 
						
							2011-09-05 17:40:55 +08:00  
				
					
						
							
							
								 
						
							
								9fc6764fa7 
								
							 
						 
						
							
							
								
								Refs #55. Added DTB_ENTRIES to the dynamic arch setting parameters. DTB_ENTRIES can now be read at runtime.  
							
							
							
						 
						
							2011-09-05 17:37:07 +08:00  
				
					
						
							
							
								 
						
							
								74d4cdb81a 
								
							 
						 
						
							
							
								
								Fix an illegal instruction for strmm_RTLU.  
							
							
							
						 
						
							2011-09-02 19:41:06 +00:00  
				
					
						
							
							
								 
						
							
								7906146836 
								
							 
						 
						
							
							
								
								Fix an error for strmm_LLTN.  
							
							
							
						 
						
							2011-09-02 16:57:33 +00:00  
				
					
						
							
							
								 
						
							
								3274ff47b8 
								
							 
						 
						
							
							
								
								Fix an error for strmm_LLTN.  
							
							
							
						 
						
							2011-09-02 16:50:50 +00:00  
				
					
						
							
							
								 
						
							
								a059c553a1 
								
							 
						 
						
							
							
								
								Fix a compute error for strmm.  
							
							
							
						 
						
							2011-09-02 16:00:04 +00:00  
				
					
						
							
							
								 
						
							
								23e182ca7c 
								
							 
						 
						
							
							
								
								Fix stack-pointer bug for strmm.  
							
							
							
						 
						
							2011-09-02 15:28:01 +00:00  
				
					
						
							
							
								 
						
							
								a15bc95824 
								
							 
						 
						
							
							
								
								Add strmm part.  
							
							
							
						 
						
							2011-09-02 09:15:09 +00:00  
				
					
						
							
							
								 
						
							
								74a3f63489 
								
							 
						 
						
							
							
								
								Tuning mb, kb, nb size to get the best performance.  
							
							
							
						 
						
							2011-09-01 17:15:28 +00:00  
				
					
						
							
							
								 
						
							
								09f49fa891 
								
							 
						 
						
							
							
								
								Use PS instructions to improve sgemm performance; it is now 4.2 GFlops.  
							
							
							
						 
						
							2011-08-31 21:24:03 +00:00  
				
					
						
							
							
								 
						
							
								b9d89f8aaa 
								
							 
						 
						
							
							
								
								Fixed the bug about installation. f77blas.h works OK now.  
							
							
							
						 
						
							2011-08-31 18:21:37 +08:00  
				
					
						
							
							
								 
						
							
								cb0214787b 
								
							 
						 
						
							
							
								
								Modify compile options.  
							
							
							
						 
						
							2011-08-30 20:57:00 +00:00  
				
					
						
							
							
								 
						
							
								2e8cdd1542 
								
							 
						 
						
							
							
								
								Using ps instruction.  
							
							
							
						 
						
							2011-08-30 20:54:19 +00:00  
				
					
						
							
							
								 
						
							
								b29d327d14 
								
							 
						 
						
							
							
								
								Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a  
							
							
							
						 
						
							2011-07-18 17:06:53 +00:00  
				
					
						
							
							
								 
						
							
								c8360e3ae5 
								
							 
						 
						
							
							
								
								Complete all the complex single-precision level 3 functions on Loongson3a; the performance is 2.3 GFlops.  
							
							
							
						 
						
							2011-07-18 17:03:38 +00:00  
				
					
						
							
							
								 
						
							
								19d2ab4853 
								
							 
						 
						
							
							
								
								Merge branch 'hotfix-0.1alpha2.2' into develop  
							
							
							
						 
						
							2011-07-14 01:09:21 +08:00  
				
					
						
							
							
								 
						
							
								12d77deeee 
								
							 
						 
						
							
							
								
								Merge branch 'hotfix-0.1alpha2.2'  
							
							
							
						 
						
							2011-07-14 01:03:09 +08:00  
				
					
						
							
							
								 
						
							
								043927c7db 
								
							 
						 
						
							
							
								
								Update the documents for 0.1alpha2.2 version.  
							
							
							
						 
						
							2011-07-14 01:02:19 +08:00  
				
					
						
							
							
								 
						
							
								30947ea2d5 
								
							 
						 
						
							
							
								
								Fixed #44: a makefile bug when DYNAMIC_ARCH=1 and INTERFACE64=1.  
							
							
							
						 
						
							2011-07-14 00:54:23 +08:00  
				
					
						
							
							
								 
						
							
								33313b0221 
								
							 
						 
						
							
							
								
								Merge branch 'develop' into loongson3a  
							
							
							
						 
						
							2011-07-07 14:25:51 +08:00  
				
					
						
							
							
								 
						
							
								a5300420e2 
								
							 
						 
						
							
							
								
								Merge branch 'hotfix-0.1alpha2.1' into develop  
							
							
							
						 
						
							2011-06-28 15:46:55 +08:00  
				
					
						
							
							
								 
						
							
								9b46bf1eb4 
								
							 
						 
						
							
							
								
								Merge branch 'hotfix-0.1alpha2.1'  
							
							
							
						 
						
							2011-06-28 15:43:08 +08:00  
				
					
						
							
							
								 
						
							
								c06b7be32f 
								
							 
						 
						
							
							
								
								Refs #42. Output an error message when Fortran compiler detection fails.  
							
							
							
						 
						
							2011-06-28 15:42:09 +08:00  
				
					
						
							
							
								 
						
							
								68532fa9ec 
								
							 
						 
						
							
							
								
								Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a  
							
							
							
						 
						
							2011-06-24 09:28:12 +00:00  
				
					
						
							
							
								 
						
							
								708d2b6255 
								
							 
						 
						
							
							
								
								Fix compute error in ztrmm.  
							
							
							
						 
						
							2011-06-24 09:27:41 +00:00  
				
					
						
							
							
								 
						
							
								e72113f06a 
								
							 
						 
						
							
							
								
								Add ztrmm and ztrsm part on loongson3a. The average performance is 2.2G.  
							
							
							
						 
						
							2011-06-23 21:11:00 +00:00  
				
					
						
							
							
								 
						
							
								14f81da375 
								
							 
						 
						
							
							
								
								Change prefetch length of A and B, the performance is 2.1G now.  
							
							
							
						 
						
							2011-06-23 10:46:58 +00:00  
				
					
						
							
							
								 
						
							
								fc21f7ad28 
								
							 
						 
						
							
							
								
								Merge branch 'release-v0.1alpha2' into loongson3a  
							
							
							
						 
						
							2011-06-23 16:08:23 +08:00  
				
					
						
							
							
								 
						
							
								ca8bf5abb0 
								
							 
						 
						
							
							
								
								Merge branch 'release-v0.1alpha2' into develop  
							
							
							
						 
						
							2011-06-23 16:07:34 +08:00  
				
					
						
							
							
								 
						
							
								4a73f5c5ea 
								
							 
						 
						
							
							
								
								Merge branch 'release-v0.1alpha2'  
							
							
							
						 
						
							2011-06-23 15:18:40 +08:00  
				
					
						
							
							
								 
						
							
								6a0762949d 
								
							 
						 
						
							
							
								
								Fixed #38. Released v0.1 alpha2.  
							
							
							
						 
						
							2011-06-23 15:16:24 +08:00  
				
					
						
							
							
								 
						
							
								859b71645a 
								
							 
						 
						
							
							
								
								Refs #37. Updated README about the compatibility issue with the EKOPath compiler.  
							
							
							
						 
						
							2011-06-23 15:09:34 +08:00  
				
					
						
							
							
								 
						
							
								078bfd0b4f 
								
							 
						 
						
							
							
								
								Refs  #39 . Moved the shared lib (dll) to top directory in MingW64 compiler environment.  
							
							
							
						 
						
							2011-06-22 13:19:39 +08:00  
				
					
						
							
							
								 
						
							
								1c96d345e2 
								
							 
						 
						
							
							
								
								Improve zgemm performance from 1G to 1.8G, change block size in param.h.  
							
							
							
						 
						
							2011-06-21 22:16:23 +00:00  
				
					
						
							
							
								 
						
							
								82f5274828 
								
							 
						 
						
							
							
								
								Refs  #39 . It's unnecessary to include sys/mman.h file in blas_server_omp.c.  
							
							
							
						 
						
							2011-06-22 01:52:20 +08:00  
				
					
						
							
							
								 
						
							
								e568df0dae 
								
							 
						 
						
							
							
								
								Refs  #38 . Prepare the docs with v0.1alpha2.  
							
							
							
						 
						
							2011-06-21 18:06:13 +08:00  
				
					
						
							
							
								 
						
							
								c4efde7713 
								
							 
						 
						
							
							
								
								Merge branch 'loongson3a' into release-v0.1alpha2  
							
							
							
						 
						
							2011-06-21 17:50:00 +08:00  
				
					
						
							
							
								 
						
							
								7a1e6202e1 
								
							 
						 
						
							
							
								
								Merge branch 'add_install_target' into develop  
							
							
							
						 
						
							2011-06-21 17:40:16 +08:00