wernsaar
|
23dd474cd0
|
added rot kernel for all precisions
|
2013-11-15 14:08:57 +01:00 |
wernsaar
|
f1b452e160
|
added scal kernel for all precisions
|
2013-11-15 11:56:43 +01:00 |
wernsaar
|
3dabd7e6e6
|
added swap-kernel for all precisions
|
2013-11-14 19:06:19 +01:00 |
wernsaar
|
6f4a0ebe38
|
added max- and min-kernels for all precisions
|
2013-11-14 13:52:47 +01:00 |
wernsaar
|
f750103336
|
small optimizations on dot-kernels
|
2013-11-11 15:47:56 +01:00 |
wernsaar
|
00f33c0134
|
added asum_kernel for all precisions and complex
|
2013-11-11 14:20:59 +01:00 |
wernsaar
|
5b36cc0f47
|
added blas level1 dot kernels for complex and double complex
|
2013-11-08 09:08:11 +01:00 |
wernsaar
|
c8f1aeb154
|
added optimized blas level1 dot kernels for single and double precision
|
2013-11-07 17:22:03 +01:00 |
wernsaar
|
8fa93be06e
|
added optimized blas level1 copy kernels
|
2013-11-07 17:18:56 +01:00 |
wernsaar
|
1e8128f41c
|
added cgemm_tcopy_2_vfpv3.S and zgemm_tcopy_2_vfpv3.S
|
2013-11-07 17:15:50 +01:00 |
wernsaar
|
80a2e901b1
|
added dgemm_tcopy_4_vfpv3.S and sgemm_tcopy_4_vfpv3.S
|
2013-11-06 20:01:18 +01:00 |
wernsaar
|
ac50bccbd2
|
added cgemm_ncopy_2_vfpv3.S and made assembler labels unique
|
2013-11-05 20:21:35 +01:00 |
wernsaar
|
82015beaef
|
added zgemm_ncopy_2_vfpv3.S and made assembler labels unique
|
2013-11-05 19:31:22 +01:00 |
wernsaar
|
370e3834a9
|
added missing file kernel/arm/Makefile
|
2013-11-03 11:54:39 +01:00 |
wernsaar
|
e31186efd4
|
deleted obsolete dgemm_kernel and dtrmm_kernel
|
2013-11-02 13:12:21 +01:00 |
wernsaar
|
2b801a00a5
|
small optimizations on sgemm_kernel for ARMV7
|
2013-11-02 13:06:11 +01:00 |
wernsaar
|
b3eab8fcb7
|
minor optimizations on zgemm_kernel for ARMV7
|
2013-11-02 09:43:53 +01:00 |
wernsaar
|
02bc36ac79
|
added sgemm_ncopy routine and made some improvements on cgemm_kernel for ARMV7
|
2013-11-01 18:22:27 +01:00 |
wernsaar
|
85484a42df
|
added kernels for cgemm, ctrmm, zgemm and ztrmm
|
2013-10-16 18:00:41 +02:00 |
wernsaar
|
3983011f0b
|
added sgemm- and strmm_kernel
|
2013-10-14 08:22:27 +02:00 |
wernsaar
|
2a1515c9dd
|
added dgemm_ncopy_4_vfpv3.S
|
2013-10-12 16:48:29 +02:00 |
wernsaar
|
31f51e78bc
|
minor optimizations on dgemm_kernel
|
2013-10-12 09:42:18 +02:00 |
wernsaar
|
e0b968c3a7
|
Changed kernels for dgemm and dtrmm
|
2013-10-05 12:59:44 +02:00 |
wernsaar
|
1c63180bb6
|
updated dgemm_kernel_8x2_vfpv3.S
|
2013-09-30 17:31:23 +02:00 |
wernsaar
|
4a474ea7dc
|
changed dgemm_kernel to use fused multiply add
|
2013-09-29 17:46:23 +02:00 |
wernsaar
|
69ce737cc5
|
modified Makefile.L3 for ARM
|
2013-09-28 19:13:47 +02:00 |
wernsaar
|
70411af888
|
initial checkin of kernel/arm
|
2013-09-28 19:02:25 +02:00 |
wernsaar
|
067e8417fd
|
removed unnecessary instructions from zgemm_kernel_2x2_bulldozer.S
|
2013-08-23 22:22:43 +08:00 |
wernsaar
|
a82da3d069
|
removed unnecessary instructions
|
2013-08-23 22:22:27 +08:00 |
Zhang Xianyi
|
1569bf14f8
|
Refs #282. Fixed zgemv_n typo bug on Win64.
|
2013-08-23 16:27:17 +08:00 |
Zhang Xianyi
|
c0159d44a3
|
Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop
|
2013-08-09 10:48:46 +08:00 |
wernsaar
|
c17a850c1c
|
modified KERNEL.BULLDOZER
|
2013-08-08 17:49:30 +02:00 |
wernsaar
|
099853fff6
|
added dtrsm_kernel_RN_8x2_bulldozer.S
|
2013-08-08 07:14:08 +02:00 |
wernsaar
|
44d23881b5
|
dtrsm_kernel_LT_8x2_bulldozer.S performance optimization
|
2013-08-05 11:27:16 +02:00 |
Zhang Xianyi
|
32fb6b9bb2
|
Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop
|
2013-08-05 16:09:47 +08:00 |
wernsaar
|
aaeb8eaecd
|
modified dtrsm_kernel_LT_8x2_bulldozer.S
|
2013-08-04 12:16:12 +02:00 |
wernsaar
|
8aeec32ea0
|
modified dtrsm_kernel_LT_8x2_bulldozer.S
|
2013-08-04 10:15:33 +02:00 |
wernsaar
|
87fc9de572
|
added dtrsm_kernel_LT_8x2_bulldozer.S
|
2013-08-04 09:54:40 +02:00 |
wernsaar
|
564aa60fec
|
removed dtrsm_kernel_LT_8x2_bulldozer.S
|
2013-08-03 15:40:51 +02:00 |
wernsaar
|
f645665dd6
|
fixed bug in dgemv_t_bulldozer.S
|
2013-08-03 12:19:29 +02:00 |
wernsaar
|
e45a347cd2
|
repaired trmm bug in sgemm_kernel_16x2_bulldozer.S
|
2013-08-03 11:43:25 +02:00 |
wernsaar
|
99727ac013
|
repaired trmm bug in cgemm_kernel_4x2_bulldozer.S
|
2013-08-03 10:32:51 +02:00 |
wernsaar
|
6e0a2fbc0c
|
repaired trmm bug in zgemm_kernel_2x2_bulldozer.S
|
2013-08-03 10:17:08 +02:00 |
wernsaar
|
0a22f99c58
|
repaired trmm bug in dgemm_kernel_8x2_bulldozer.S
|
2013-08-03 09:35:39 +02:00 |
wernsaar
|
cff70a666d
|
added generic trmm kernels and modified Makefile.L3
|
2013-07-30 20:18:57 +02:00 |
wernsaar
|
84bd0aabaa
|
added dtrsm_kernel_LT_8x2_bulldozer.S
|
2013-07-28 16:47:58 +02:00 |
Zhang Xianyi
|
72b1edaf1b
|
Merge branch 'develop' into bulldozer (conflicts: kernel/x86_64/KERNEL.BULLDOZER)
|
2013-07-28 06:38:25 +02:00 |
wangqian
|
1b3b9e841d
|
Fixed a computational error in zgemm_kernel_4x4_sandy.S file.
|
2013-07-18 20:23:21 +08:00 |
Zhang Xianyi
|
2ed0f6ab60
|
Fixed the typo.
|
2013-07-11 23:47:07 +08:00 |
Zhang Xianyi
|
886cbaf4e4
|
Support AMD Piledriver by bulldozer kernels.
|
2013-07-06 12:06:43 -03:00 |