2343 Commits

Author SHA1 Message Date
wernsaar
9f0a3a35b3 removed obsolete file sdot_vfpv3.S 2013-11-21 23:42:54 +01:00
wernsaar
dbae93110b added sdot_vfp.S 2013-11-21 23:34:51 +01:00
wernsaar
19cd5c64a2 renamed swap_vfpv3.S to swap_vfp.S 2013-11-21 23:19:32 +01:00
wernsaar
9adf87495e renamed some dot kernels 2013-11-21 23:07:51 +01:00
wernsaar
440db4cdda delete rot_vfpv3.S 2013-11-21 22:52:24 +01:00
wernsaar
cd93cae5a7 renamed rot_vfpv3.S to rot_vfp.S 2013-11-21 22:49:28 +01:00
wernsaar
8565afb3c2 renamed asum_vfpv3.S to asum_vfp.S 2013-11-21 22:26:27 +01:00
wernsaar
5bf7cf8d67 renamed scal_vfpv3.S to scal_vfp.S 2013-11-21 22:03:36 +01:00
wernsaar
29a005c635 renamed iamax assembler kernel 2013-11-21 21:12:33 +01:00
wernsaar
f1be3a168a renamed some BLAS kernels, which are compatible to ARMV6 2013-11-21 20:48:57 +01:00
wernsaar
410afda9b4 added cpu detection and target ARMV6, used in raspberry pi 2013-11-21 20:18:51 +01:00
wernsaar
bf04544902 added gemv_n kernel for single and double precision 2013-11-19 15:07:20 +01:00
wernsaar
86283c0be1 added gemv_t kernel for single and double precision 2013-11-19 09:55:54 +01:00
wernsaar
f27cabfd08 added nrm2 kernel for all precisions 2013-11-16 16:17:17 +01:00
wernsaar
23dd474cd0 added rot kernel for all precisions 2013-11-15 14:08:57 +01:00
wernsaar
f1b452e160 added scal kernel for all precisions 2013-11-15 11:56:43 +01:00
wernsaar
3dabd7e6e6 added swap-kernel for all precisions 2013-11-14 19:06:19 +01:00
wernsaar
6f4a0ebe38 added max- und min-kernels for all precisions 2013-11-14 13:52:47 +01:00
wernsaar
f1db386211 changes for compatibility with Pathscale compiler 2013-11-13 17:59:11 +01:00
wernsaar
6da558d2ab changes for compatibility with Pathscale compiler 2013-11-13 17:39:13 +01:00
wernsaar
f750103336 small optimizations on dot-kernels 2013-11-11 15:47:56 +01:00
wernsaar
00f33c0134 added asum_kernel for all precisions and complex 2013-11-11 14:20:59 +01:00
wernsaar
5b36cc0f47 added blas level1 dot kernels for complex and double complex 2013-11-08 09:08:11 +01:00
wernsaar
c8f1aeb154 added optimized blas level1 dot kernels for single and double precision 2013-11-07 17:22:03 +01:00
wernsaar
8fa93be06e added optimized blas level1 copy kernels 2013-11-07 17:18:56 +01:00
wernsaar
1e8128f41c added cgemm_tcopy_2_vfpv3.S and zgemm_tcopy_2_vfpv3.S 2013-11-07 17:15:50 +01:00
Zhang Xianyi
2f5fdd2000 Refs #314. Fixed clang compiling bug on OSX. 2013-11-07 08:12:03 +08:00
wernsaar
80a2e901b1 added dgemm_tcopy_4_vfpv3.S and sgemm_tcopy_4_vfpv3.S 2013-11-06 20:01:18 +01:00
wernsaar
ac50bccbd2 added cgemm_ncopy_2_vfpv3.S and made assembler labels unique 2013-11-05 20:21:35 +01:00
wernsaar
82015beaef added zgemm_ncopy_2_vfpv3.S and made assembler labels unique 2013-11-05 19:31:22 +01:00
wernsaar
6216ab8a7e removed obsolete gemm_kernels from haswell branch 2013-11-04 08:33:04 +01:00
wernsaar
370e3834a9 added missing file kernel/arm/Makefile 2013-11-03 11:54:39 +01:00
wernsaar
e31186efd4 deleted obsolete dgemm_kernel and dtrmm_kernel 2013-11-02 13:12:21 +01:00
wernsaar
2b801a00a5 small optimizations on sgemm_kernel for ARMV7 2013-11-02 13:06:11 +01:00
wernsaar
b3eab8fcb7 minor optimizations on zgemm_kernel for ARMV7 2013-11-02 09:43:53 +01:00
wernsaar
02bc36ac79 added sgemm_ncopy routine and made some improvements on cgemm_kernel for ARMV7 2013-11-01 18:22:27 +01:00
wernsaar
5118a7f4d1 small optimizations on dgemm_kernel for Piledriver 2013-10-31 11:53:26 +01:00
wernsaar
e172b70ea2 added cgemm_kernel for Piledriver 2013-10-31 08:38:17 +01:00
wernsaar
1cf4b974b2 added zgemm_kernel for Piledriver 2013-10-30 09:12:17 +01:00
wernsaar
7bccff1512 added sgemm_kernel for PILEDRIVER 2013-10-29 22:53:04 +01:00
wernsaar
afe44b0241 tests and code cleanup of gemm_kernels for HASWELL 2013-10-28 14:23:48 +01:00
wernsaar
a77c71eaf5 added highly optimized dgemm_kernel for HASWELL 2013-10-28 10:23:47 +01:00
wernsaar
fe8c5666f9 optimized dgemm_kernel for HASWELL 2013-10-20 16:52:26 +02:00
wernsaar
f6b50057e2 corrected and testet FMA3 Code 2013-10-19 10:52:20 +02:00
wernsaar
2840d56aeb added dgemm_kernel for Piledriver 2013-10-19 09:47:15 +02:00
wernsaar
85484a42df added kernels for cgemm, ctrmm, zgemm and ztrmm 2013-10-16 18:00:41 +02:00
wernsaar
3983011f0b added sgemm- and strmm_kernel 2013-10-14 08:22:27 +02:00
wernsaar
2a1515c9dd added dgemm_ncopy_4_vfpv3.S 2013-10-12 16:48:29 +02:00
wernsaar
31f51e78bc minor optimizations on dgemm_kernel 2013-10-12 09:42:18 +02:00
wangqian
beffee7d91 Fixed buffer overflow bug in kernel/x86_64/dgemv_t.S file. 2013-10-11 03:20:20 +08:00