Commit Graph

617 Commits

Author SHA1 Message Date
wernsaar
f1db386211 changes for compatibility with Pathscale compiler 2013-11-13 17:59:11 +01:00
wernsaar
6216ab8a7e removed obsolete gemm_kernels from haswell branch 2013-11-04 08:33:04 +01:00
wernsaar
afe44b0241 tests and code cleanup of gemm_kernels for HASWELL 2013-10-28 14:23:48 +01:00
wernsaar
a77c71eaf5 added highly optimized dgemm_kernel for HASWELL 2013-10-28 10:23:47 +01:00
wernsaar
fe8c5666f9 optimized dgemm_kernel for HASWELL 2013-10-20 16:52:26 +02:00
wernsaar
f6b50057e2 corrected and testet FMA3 Code 2013-10-19 10:52:20 +02:00
Zhang Xianyi
f51a849d91 Merge pull request #278 from wernsaar/haswell
Merge wernsaar's Haswell gemm kernels.
2013-08-17 08:24:37 -07:00
wernsaar
44ef70420c added cgemm_kernel_8x2_haswell.S 2013-08-16 18:54:56 +02:00
wernsaar
d488b1b1aa added zgemm_kernel_4x2_haswell.S 2013-08-16 10:29:47 +02:00
wernsaar
4070d9a123 added dgemm_kernel_16x2_haswell.S 2013-08-15 19:17:20 +02:00
wernsaar
0b90c0ec64 added sgemm_kernel_16x4_haswell.S 2013-08-15 18:46:14 +02:00
wernsaar
2b8ab8f55b sgemm_kernel_16x4_haswell.S minor changes 2013-08-14 01:44:41 +02:00
wernsaar
1cb9579cd0 added zgemm_kernel_4x2_haswell.S and fixed a bug in sgemm_kernel_16x4_haswell.S 2013-08-14 01:23:15 +02:00
Zhang Xianyi
2638370844 Init code base for Intel Haswell. 2013-08-13 00:54:59 +08:00
wernsaar
89637f87c8 added sgemm- and dgemm-kernel for HASWELL processor 2013-08-12 18:04:10 +02:00
Zhang Xianyi
c0b1e41bec Merge branch 'bulldozer' into develop 2013-08-12 23:22:10 +08:00
Zhang Xianyi
49faee1a51 Fixed #276. Merge branch 'wernsaar-develop' into bulldozer 2013-08-09 10:50:06 +08:00
Zhang Xianyi
c0159d44a3 Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop 2013-08-09 10:48:46 +08:00
wernsaar
c17a850c1c modified KERNEL.BULLDOZER 2013-08-08 17:49:30 +02:00
wernsaar
099853fff6 added dtrsm_kernel_RN_8x2_bulldozer.S 2013-08-08 07:14:08 +02:00
wernsaar
44d23881b5 dtrsm_kernel_LT_8x2_bulldozer.S performance optimization 2013-08-05 11:27:16 +02:00
Zhang Xianyi
2905042c6a Refs #270 #268. Merge branch 'wernsaar-develop' into bulldozer 2013-08-05 16:17:28 +08:00
Zhang Xianyi
32fb6b9bb2 Merge branch 'develop' of https://github.com/wernsaar/OpenBLAS into wernsaar-develop 2013-08-05 16:09:47 +08:00
Zhang Xianyi
673e453b3f Enable bulldozer kernels. 2013-08-05 16:07:54 +08:00
Zhang Xianyi
143cca4dd5 Merge branch 'develop' into bulldozer 2013-08-05 15:51:53 +08:00
wernsaar
aaeb8eaecd modified dtrsm_kernel_LT_8x2_bulldozer.S 2013-08-04 12:16:12 +02:00
wernsaar
8aeec32ea0 modified dtrsm_kernel_LT_8x2_bulldozer.S 2013-08-04 10:15:33 +02:00
wernsaar
87fc9de572 added dtrsm_kernel_LT_8x2_bulldozer.S 2013-08-04 09:54:40 +02:00
wernsaar
564aa60fec removed dtrsm_kernel_LT_8x2_bulldozer.S 2013-08-03 15:40:51 +02:00
wernsaar
f645665dd6 fixed bug in dgemv_t_bulldozer.S 2013-08-03 12:19:29 +02:00
wernsaar
e45a347cd2 repaired trmm bug in sgemm_kernel_16x2_bulldozer.S 2013-08-03 11:43:25 +02:00
wernsaar
99727ac013 repaired trmm bug in cgemm_kernel_4x2_bulldozer.S 2013-08-03 10:32:51 +02:00
wernsaar
6e0a2fbc0c repaired trmm bug in zgemm_kernel_2x2_bulldozer.S 2013-08-03 10:17:08 +02:00
wernsaar
0a22f99c58 repaired trmm bug in dgemm_kernel_8x2_bulldozer.S 2013-08-03 09:35:39 +02:00
Zhang Xianyi
79ba52115d Merge branch 'hotfix-v0.2.8' into develop 2013-08-01 23:57:19 +08:00
Zhang Xianyi
b736aa8110 Update the doc for 0.2.8 version. 2013-08-01 23:52:43 +08:00
Zhang Xianyi
ae521ecc3e OpenBLAS 0.2.8 rc1. 2013-07-31 14:49:16 +08:00
Zhang Xianyi
36adfe8d64 Merge branch 'hotfix-v0.2.8' into develop 2013-07-31 14:46:56 +08:00
Zhang Xianyi
a07cc39571 Refs #266. Fixed the compiling bug with Open64 5.0. 2013-07-31 14:41:39 +08:00
wernsaar
cff70a666d added generic trmm kernels and modified Makefile.L3 2013-07-30 20:18:57 +02:00
Zhang Xianyi
b5c2ac4fd6 Fixed #264 the memory leak bug in dtrtri_U. 2013-07-29 23:21:10 +08:00
Zhang Xianyi
749f45ffc8 Fixed the FMA3 detection bug. 2013-07-29 16:48:53 +08:00
Zhang Xianyi
534c5ec919 Fixed #261. Use strncmp instead of a comparing trick. 2013-07-29 16:48:35 +08:00
Zhang Xianyi
bd2da90e13 Fixed typo in getarch_2nd.c. 2013-07-29 15:42:00 +08:00
wernsaar
84bd0aabaa added dtrsm_kernel_LT_8x2_bulldozer.S 2013-07-28 16:47:58 +02:00
Zhang Xianyi
5b504d6c23 Refs #263. Rollback bulldozer and piledriver kernels to barcelona kernels. 2013-07-28 17:39:24 +08:00
Zhang Xianyi
72b1edaf1b Merge branch 'develop' into bulldozer
Conflicts:
	kernel/x86_64/KERNEL.BULLDOZER
2013-07-28 06:38:25 +02:00
Zhang Xianyi
a2930664f4 Refs #262. Added executable stack markings. 2013-07-28 00:09:40 +08:00
Zhang Xianyi
6e0db36373 Merge branch 'sfabbro-ldflags' into develop 2013-07-27 23:03:07 +08:00
Zhang Xianyi
1e1250b703 Fixed #260. Fixed generating 32-bit shared library on previous commit. 2013-07-27 23:01:36 +08:00