Commit Graph

  • aba0751d4b Optimizations for APM's xgene-1 (aarch64). Benedikt Huber 2014-10-09 06:52:10 -0700
  • 2987bc7b40 refs #464. Fixed the bug of detecting L2 associative on x86. Zhang Xianyi 2014-11-10 17:15:34 +0800
  • 695e0fa649 #463 fixed a compiling bug on AIX. Zhang Xianyi 2014-11-10 14:39:56 +0800
  • cbb23c46c2 Merge pull request #459 from tkelman/symbol-rename Zhang Xianyi 2014-10-25 19:49:03 +0800
  • 0b4602b753 add SYMBOLPREFIX and SYMBOLSUFFIX makefile options Tony Kelman 2014-10-24 22:27:00 -0700
  • ac5a7e1c1b Update dot to 0.2.12 version. Zhang Xianyi 2014-10-13 17:10:12 +0800
  • f1b9a4a1ca Ref #454: fixed bug in common_param.h wernsaar 2014-09-23 11:34:29 +0200
  • ae6b7caf32 Merge pull request #453 from wernsaar/develop Zhang Xianyi 2014-09-22 16:47:54 +0800
  • f446d2368a updated cblas.h and cblas_noconst.h wernsaar 2014-09-21 13:39:15 +0200
  • dab4edd069 added benchmark for gemm3m functions wernsaar 2014-09-21 12:00:41 +0200
  • 9d7057366d bugfix for GEMM3M functions wernsaar 2014-09-21 11:41:43 +0200
  • 7f234f8ed1 added GEMM3M tests wernsaar 2014-09-21 10:55:08 +0200
  • 9e829ce98f enabled cblas gemm3m functions wernsaar 2014-09-20 17:20:02 +0200
  • d49fd33885 disabled SYMM3M and HEMM3M functions because segment violations wernsaar 2014-09-20 15:27:40 +0200
  • f0f9b25bb6 added test for CGEMM3M function wernsaar 2014-09-20 14:53:30 +0200
  • 7aae4a62e7 enabled use of GEMM3M functions wernsaar 2014-09-20 14:27:10 +0200
  • 7a911569b8 added test for GEMM3M functions wernsaar 2014-09-20 14:21:42 +0200
  • 466bfb8b86 updated README.md wernsaar 2014-09-17 16:01:07 +0200
  • 70d1ba09b2 Update the doc for target list. Zhang Xianyi 2014-09-17 14:29:21 +0800
  • d293b78b64 Merge pull request #451 from eshelman/patch-1 Zhang Xianyi 2014-09-17 14:20:06 +0800
  • 9912dbbcf9 Add HASWELL to TargetList.txt Eliot Eshelman 2014-09-16 18:26:45 -0400
  • 01bc462e8e Merge pull request #449 from wernsaar/develop Zhang Xianyi 2014-09-16 14:33:48 +0800
  • 3300f5ebff optimized multithreading lower limits wernsaar 2014-09-15 11:38:25 +0200
  • 59e2c20557 Merge pull request #448 from wernsaar/develop Zhang Xianyi 2014-09-15 13:12:14 +0800
  • b7c9566eea removed obsolete gemv kernel files wernsaar 2014-09-14 11:00:53 +0200
  • 6df1b0be81 optimized zgemv_n_microk_sandy-4.c wernsaar 2014-09-14 10:21:22 +0200
  • 2ac1e076c1 added optimized zgemv_n kernel for sandybridge wernsaar 2014-09-14 09:02:05 +0200
  • 9908b6031c bugfix in KERNEL.PILEDRIVER wernsaar 2014-09-13 16:26:53 +0200
  • 8f100a14f2 optimized cgemv_t kernel for haswell wernsaar 2014-09-13 16:13:27 +0200
  • 53b5726b04 added optimized cgemv_t kernel for haswell wernsaar 2014-09-13 15:14:12 +0200
  • 1a352b24e6 updated KERNEL.HASWELL wernsaar 2014-09-13 12:23:27 +0200
  • 5194818d4b updated zgemv_t_4.c wernsaar 2014-09-13 09:48:34 +0200
  • 8a39cdb1c1 added optimized zgemv_t kernel for haswell wernsaar 2014-09-13 09:47:07 +0200
  • fd2478c9e2 optimized interface/zgemv.c for multithreading wernsaar 2014-09-12 19:18:23 +0200
  • 0a1390f2d8 enabled optimized zgemv_t kernel for bulldozer wernsaar 2014-09-12 17:43:47 +0200
  • a8b0812feb optimized zgemv_t for bulldozer wernsaar 2014-09-12 17:42:25 +0200
  • a0fb68ab42 added optimized zgemv_t kernel for bulldozer wernsaar 2014-09-12 17:04:22 +0200
  • 6544d30e42 Fix segfault when gemm is called immediately after set_num_threads. Dan Luu 2014-09-12 08:55:23 -0500
  • 44c11165d5 bugfix in cgemv_t_4.c wernsaar 2014-09-12 14:12:24 +0200
  • 564be4eb72 added optimized cgemv_t kernel wernsaar 2014-09-12 13:38:01 +0200
  • 107c3ea7d5 added optimized zgemv_t routine wernsaar 2014-09-12 12:35:20 +0200
  • bb8d698335 optimized zgemv_n_microk_haswell-4.c for small size wernsaar 2014-09-11 13:44:55 +0200
  • e0192a6914 bugfix in zgemv_n_4.c wernsaar 2014-09-11 13:18:00 +0200
  • bced4594bb added optimized zgemv_n kernel wernsaar 2014-09-11 12:34:57 +0200
  • cafba99b6b bugfix in cgemv_n_microk_haswell-4.c wernsaar 2014-09-11 11:12:44 +0200
  • ac8f232b2a more optimizations wernsaar 2014-09-11 10:25:48 +0200
  • f98e1244c4 optimized cgemv_n_4.c wernsaar 2014-09-10 19:26:14 +0200
  • be95700b30 added optimized cgemv_kernel for haswell wernsaar 2014-09-10 14:11:24 +0200
  • 4aa534ae93 added cgemv_n kernel, optimized for small sizes wernsaar 2014-09-10 13:45:13 +0200
  • 1cba8e7b11 Merge pull request #446 from grisuthedragon/cblas_matcopy Zhang Xianyi 2014-09-10 16:31:31 +0800
  • d13e92f07e Merge pull request #445 from wernsaar/develop Zhang Xianyi 2014-09-10 16:28:14 +0800
  • baa46e4fba added and tested optimized dgemv_n kernel for haswell wernsaar 2014-09-09 16:17:45 +0200
  • faab7a181d added optimized dgemv_n kernel for haswell wernsaar 2014-09-09 15:32:32 +0200
  • 8109d8232c optimized dgemv_t kernel for haswell wernsaar 2014-09-09 14:38:08 +0200
  • debc6d1a05 bugfix in KERNEL.HASWELL wernsaar 2014-09-09 14:04:44 +0200
  • e73a0113ec added optimized gemv kernels wernsaar 2014-09-09 13:54:55 +0200
  • 44f2bf9bae added optimized dgemv_t kernel for haswell wernsaar 2014-09-09 13:34:22 +0200
  • a057e5434d add CBLAS interface for s/d/c/zimatcopy Martin Koehler 2014-09-09 09:52:13 +0200
  • cd34e9701b removed obsolete files wernsaar 2014-09-08 19:15:31 +0200
  • 7794766d3c Add cblas_(s/d/c/z)omatcopy in order to have cblas interface for them. Martin Köhler 2014-09-08 17:57:44 +0200
  • 658939faaa optimized dgemv_n kernel for small sizes wernsaar 2014-09-08 15:22:35 +0200
  • f511807fc0 modified multithreading threshold wernsaar 2014-09-08 12:27:32 +0200
  • c4d9d4e5f8 added haswell optimized kernel wernsaar 2014-09-08 12:25:16 +0200
  • 7c0a94ff47 bugfix in sgemv_n_microk_haswell-4.c wernsaar 2014-09-08 10:54:33 +0200
  • cbbc80aad3 added optimized sgemv_t kernel for haswell wernsaar 2014-09-08 10:13:39 +0200
  • 2be5c7a640 bugfix for windows wernsaar 2014-09-07 21:48:42 +0200
  • 80f7786875 enabled optimized sgemv kernels for piledriver wernsaar 2014-09-07 21:13:57 +0200
  • 553e275407 optimized sgemv_n kernel for sandybridge wernsaar 2014-09-07 20:53:30 +0200
  • 7b3932b3f3 optimized sgemv_n kernel for nehalem wernsaar 2014-09-07 19:20:08 +0200
  • 75207b1148 optimized sgemv_n for very small size of m wernsaar 2014-09-07 18:23:48 +0200
  • 274828fa50 optimizations for very small sizes wernsaar 2014-09-07 13:45:03 +0200
  • 5ae1731fe6 better optimizations for sgemv_t kernel wernsaar 2014-09-06 21:28:57 +0200
  • c8eaf3ae2d optimized sgemv_t_4 kernel for very small sizes wernsaar 2014-09-06 19:41:57 +0200
  • 3a7ab47ee9 optimized sgemv_t wernsaar 2014-09-06 18:34:25 +0200
  • cf5544b417 optimization for small size wernsaar 2014-09-06 13:17:56 +0200
  • d143f84dd2 added optimized sgemv_n kernel for haswell wernsaar 2014-09-06 12:08:48 +0200
  • 7794237475 undef WHEREAMI wernsaar 2014-09-06 11:01:42 +0200
  • a64fe9bcc9 added optimized sgemv_n kernel for sandybridge wernsaar 2014-09-06 08:41:53 +0200
  • 2021d0f9d6 experimentally removed expensive function calls wernsaar 2014-09-05 15:05:53 +0200
  • 6df7a88930 optimized sgemv_t for sandybridge wernsaar 2014-09-05 10:22:50 +0200
  • 53de943690 bugfix for sgemv_n_4.c wernsaar 2014-09-04 18:55:52 +0200
  • 7f910010a0 optimized sgemv_n kernel for small sizes wernsaar 2014-09-04 13:09:27 +0200
  • 3a5d8dbff9 optimized sgemv_n_4.c wernsaar 2014-09-03 15:34:30 +0200
  • 2a60c6d4b0 optimized sgemv_n for small sizes wernsaar 2014-09-03 14:48:45 +0200
  • 0fc560ba23 bugfix for buffer overflow wernsaar 2014-09-03 10:13:47 +0200
  • d1800397f5 optimized interface/gemv.c for multithreading wernsaar 2014-09-02 17:36:07 +0200
  • f4ff889491 updated interface/gemv.c for multithreading wernsaar 2014-09-02 16:30:04 +0200
  • 210bec9111 added plot-header to compare multithreading wernsaar 2014-09-02 14:11:42 +0200
  • f3b50dcf5b removed obsolete instructions from sgemv_t_4.c wernsaar 2014-09-02 13:35:41 +0200
  • 93eaba959d optimized sgemv_t for bulldozer wernsaar 2014-09-02 12:42:36 +0200
  • 9570e56965 optimized sgemv_t_4.c for small sizes wernsaar 2014-09-01 15:11:37 +0200
  • d7f91f8b4f extended gemv.c benchmark wernsaar 2014-09-01 15:07:36 +0200
  • 53f1277b6b modified benchmark/gemv.c wernsaar 2014-08-31 15:38:18 +0200
  • bc99faef1b optimized sgemv_t_4.c for uneven sizes wernsaar 2014-08-31 14:33:15 +0200
  • 848c0f16f7 optimized sgemv_t_4.c for small size wernsaar 2014-08-31 13:23:44 +0200
  • e2fc8c8c2c changed 1 test value (bug in lapack-testing?) wernsaar 2014-08-30 13:58:02 +0200
  • 53e6dbf6ca optimized sgemv_t kernel for small sizes wernsaar 2014-08-30 13:36:27 +0200
  • 868f8a8756 Merge pull request #443 from idunham/fix Zhang Xianyi 2014-08-29 13:31:06 +0800
  • db7e6366cd Workaround PIC limitations in cpuid. Isaac Dunham 2014-08-28 13:05:07 -0700
  • 2702323f7d Merge pull request #440 from wernsaar/develop Zhang Xianyi 2014-08-28 12:43:54 +0800