|  Wang Qian | d5cffd506a | Modified the default kernel makefile in MIPS64 arch. | 2011-03-07 11:23:12 +00:00 |
|  Xianyi Zhang | 5838f12995 | Support unaligned addresses in daxpy on loongson3a simd. | 2011-03-05 10:17:10 +08:00 |
|  Xianyi Zhang | 5444a3f8f7 | Unroll to 16 in daxpy on loongson3a. | 2011-03-04 17:50:17 +08:00 |
|  Xianyi Zhang | 88cbfcc5b5 | Merge commit 'origin/x86' into loongson3a | 2011-03-04 14:11:52 +00:00 |
|  Xianyi Zhang | ce78abe37e | Merge branch 'x86' of github.com:xianyi/OpenBLAS into x86 | 2011-03-04 11:53:04 +08:00 |
|  Xianyi Zhang | 8f1090d32a | Support NO_LAPACK=1 to build the lib without LAPACK functions. | 2011-03-04 11:51:32 +08:00 |
|  Xianyi | 272f62a2b6 | Changed the movlps macro name to capitals in the x86/zdot_sse2.S file. | 2011-03-03 00:46:39 +08:00 |
|  Xianyi | 36016fe349 | On 32-bit x86, gcc 4.4.3 generated wrong code (movsd) from movlps in zdot_sse2.S line 191. This would cause zdotu & zdotc failures. Instead, use movlpd to work around it. Fixed #8. Fixed #9. | 2011-03-02 18:45:43 +08:00 |
|  Xianyi Zhang | 6eb02bbb9c | Merge remote branch 'origin/x86' into loongson3a | 2011-03-02 13:52:05 +08:00 |
|  Xianyi | 12214e1d0f | Fixed #7. Modified axpy kernel codes to avoid unloop with incx==0 or incy==0 in x86 32bits arch. | 2011-02-23 20:08:34 +08:00 |
|  Xianyi Zhang | 0cfd29a819 | Fixed #7. 1) Disable the multi-thread and 2) Modified kernel codes to avoid unloop in axpy function when incx==0 or incy==0. | 2011-02-21 00:24:21 +08:00 |
|  Xianyi | bfaa80c316 | fixed #4 csrot & drot returned the wrong result when incx==incy==0 on i686 arch. | 2011-02-18 03:00:58 +08:00 |
|  Xianyi Zhang | c5852d4e30 | fixed #4 csrot returned the wrong result when incx==incy==0. | 2011-02-16 23:39:43 +08:00 |
|  Xianyi Zhang | 84ba64e65b | fixed a bug in drot when incx or incy equals zero. | 2011-02-16 23:35:41 +08:00 |
|  Xianyi Zhang | 1e671b49f3 | Experimented with the Loongson 3A 128-bit load & store instructions. | 2011-01-29 03:05:27 +08:00 |
|  Xianyi Zhang | 77b7020d69 | changed prefetch order. | 2011-01-29 03:03:34 +08:00 |
|  Xianyi Zhang | e003b811ab | load x & y contiguously in axpy. | 2011-01-28 11:18:50 +08:00 |
|  Xianyi Zhang | ebe2da8474 | Modified aligned size. Added an additional prefetch instruction because the cache line is 32 bytes in Loongson 3A. | 2011-01-27 23:07:06 +08:00 |
|  Xianyi Zhang | c0b5992fab | added axpy kernel with prefetch for Loongson3A. To-Do: tuning prefetch distance & instruction order. | 2011-01-26 22:34:33 +08:00 |
|  Xianyi Zhang | 342bbc3871 | Import GotoBLAS2 1.13 BSD version codes. | 2011-01-24 14:54:24 +00:00 |