Commit Graph

  • 8d50a9fd1a Fixed #35 a build bug with NO_LAPACK=1 & DYNAMIC_ARCH=1. Xianyi Zhang 2011-06-09 11:38:59 +0800
  • 1496383224 Print the wall time (cycles) when FUNCTION_PROFILE is enabled. Xianyi Zhang 2011-06-09 10:40:15 +0800
  • 4335bca2f7 Fixed #33 ztrmm bug on Nehalem. Wang Qian 2011-06-07 12:53:25 +0800
  • 31040e4d80 Fixed #32 a SEGFAULT bug with gcc-4.6. According to the i386 calling convention, the called function should remove the hidden return value address from the stack. Xianyi 2011-06-03 13:19:54 +0800
  • 3d7e62eb8b Fixed #31 Shared library placement on Mac. Thanks to Mr. Viral B. Shah for this patch. Xianyi Zhang 2011-05-30 12:42:17 +0800
  • 88d94d0ec8 Fixed #30 strmm computational error on Loongson3A. traz 2011-05-28 09:48:34 +0000
  • af40551c9f Fixed the makefile bug about openblas_set_num_threads. Xianyi Zhang 2011-05-27 21:15:30 +0800
  • c30c22a76c Fixed a bug about detecting underscore prefix in c_check. Xianyi Zhang 2011-05-27 18:16:19 +0800
  • cc09e6ef3a Ignore *.obj files in git. Xianyi Zhang 2011-05-27 18:12:45 +0800
  • fc84909115 Modify single precision compiler conditions, increasing single precision kernel code on Loongson3a. traz 2011-05-27 09:47:17 +0000
  • 5ca4e51df0 Remove the useless code, modify code comments and format. traz 2011-05-18 10:54:51 +0000
  • fcb5ce011b Fixed #28. Convert the result to double precision in MIPS64 dsdot_k kernel. Xianyi Zhang 2011-05-17 21:24:00 +0000
  • a9320f896e Fixed #25 dtrmm and dtrsm computational error on Loongson3A. traz 2011-05-14 22:00:57 +0000
  • 830a823be1 Added missed testing codes for dsdot. Xianyi Zhang 2011-05-13 02:41:39 +0800
  • b206fc7075 Fixed #28. Convert the result to double precision at the end of the dsdot kernel. Xianyi Zhang 2011-05-13 02:34:30 +0800
  • 1d60510959 Added the unit testcase for dsdot. Xianyi Zhang 2011-05-13 02:19:55 +0800
  • 03272a606d Added the unit test for drotmg. Xianyi Zhang 2011-05-13 01:21:39 +0800
  • 0dc9eca36f Merge branch 'hotfix-readme_about_branches' into develop Xianyi Zhang 2011-05-12 19:06:31 +0800
  • 8cc628a953 Merge branch 'hotfix-readme_about_branches' Xianyi Zhang 2011-05-12 19:06:02 +0800
  • bbc517292a Added the spec of git branches about this project. Xianyi Zhang 2011-05-12 19:05:20 +0800
  • 29dce62b8f Finish dtrsm_kernel_Rx.S on Loongson3A. traz 2011-05-11 10:44:23 +0000
  • fa8e4fd879 Fixed #26 the wrong result of rotmg. Used fabs() instead of abs(). Xianyi Zhang 2011-05-11 01:12:32 +0800
  • 432c309f63 Finish dtrsm_kernel_Lx.S on Loongson3A. traz 2011-05-10 12:48:43 +0000
  • d2f351d819 Modify dtrsm compiler options traz 2011-05-09 17:31:58 +0000
  • 5a991b7149 Fixed #24 drmm error on Loongson3A traz 2011-05-09 17:28:20 +0000
  • 417b8ec792 Added openblas_set_num_threads for Fortran. Xianyi Zhang 2011-05-06 17:03:35 +0800
  • 7dcf4eeee7 Fixed #23. Fixed a bug of f_check script about generating link flags. Xianyi Zhang 2011-05-04 13:03:10 +0800
  • 1acf5ace29 Fixed a bug when detecting Intel CPU. Xianyi Zhang 2011-05-03 17:19:36 +0800
  • fcf9b82f14 Fixed a build bug with NO_LAPACK=1 and SANITY_CHECK=1. traits 2011-05-03 14:42:11 +0800
  • 2aab238c61 Fixed #16. Print a user-friendly message when CPU detection fails. Xianyi Zhang 2011-04-22 22:14:06 +0800
  • b8d93812f0 Added docs for make TARGET=your_cpu_target. Xianyi Zhang 2011-04-22 22:07:46 +0800
  • ff6ae89d3e Fixed #19. Provided an error msg when the arch is not supported. Xianyi Zhang 2011-04-22 20:21:42 +0800
  • 0a45e5495f Fixed #21. Added extern C to support C++. Thanks to Tasio for the patch. Xianyi Zhang 2011-04-20 13:41:38 +0800
  • 9320933520 Completed the dtrmm function. traz 2011-04-17 20:26:49 +0000
  • 921caefa56 Added handling of the trmm part, without edge handling. Test sizes (M and N) must be multiples of 4. traz 2011-04-15 21:56:25 +0000
  • ecd4c1f3d9 Modify prefetching C. traz 2011-04-11 22:46:36 +0000
  • ab9e4ce351 Adjust kc size from 112 to 116. traz 2011-04-11 22:17:57 +0000
  • 921e040b15 Changed default page size to 16KB on Loongson 3A. Xianyi Zhang 2011-04-11 21:46:48 +0000
  • 00ef0cd434 Supported goto_set_num_threads & openblas_set_num_threads functions when USE_OPENMP=1. Xianyi Zhang 2011-04-07 14:52:35 +0800
  • 989c6f8b06 Fixed #14 the SEGFAULT bug on 64 cores. On SMP server, the number of CPUs or cores should be less than or equal to 64. Xianyi Zhang 2011-03-28 10:58:39 +0800
  • 552f31dbbd Fixed #13. Fixed blasint undefined bug in <cblas.h> file. Xianyi Zhang 2011-03-25 01:16:12 +0800
  • 5452ba3850 Updated the developing version to v0.1 alpha2. Xianyi Zhang 2011-03-20 23:35:31 +0800
  • 54745902b8 Init Changelog file for next release version(v0.1alpha2). Xianyi Zhang 2011-03-20 23:30:09 +0800
  • 1aa9a298e1 Change BLOCK SIZE of LOONGSON3A TARGET. traz 2011-04-06 10:39:31 +0000
  • 782205a693 Add dgemm compiler Options in KERNEL.LOONGSON3A. traz 2011-04-06 10:38:34 +0000
  • ac494c0d04 New kernel in LOONGSON3A. traz 2011-04-06 10:36:44 +0000
  • 85f99d4769 Fixed #14 the SEGFAULT bug on 64 cores. On SMP server, the number of CPUs or cores should be less than or equal to 64. Xianyi Zhang 2011-03-28 10:58:39 +0800
  • 5e7f29b19e Fixed #13. Fixed blasint undefined bug in <cblas.h> file. Xianyi Zhang 2011-03-25 01:16:12 +0800
  • 141091f528 Merge branch 'master' of github.com:xianyi/OpenBLAS into x86 Xianyi Zhang 2011-03-22 14:16:18 +0800
  • e4bb6f2482 Fixed the detection bug on Intel Core i5. Thanks to ggl329 for the patch. v0.1alpha1 Xianyi Zhang 2011-03-22 14:09:47 +0800
  • 0edcdd470e Updated the developing version to v0.1 alpha2. Xianyi Zhang 2011-03-20 23:35:31 +0800
  • d672491122 Init Changelog file for next release version(v0.1alpha2). Xianyi Zhang 2011-03-20 23:30:09 +0800
  • 972062903c OpenBLAS 0.1 alpha version 1. Xianyi Zhang 2011-03-20 22:44:57 +0800
  • d9aa359e69 Merge remote branch 'origin/loongson3a' into x86 Xianyi Zhang 2011-03-20 21:57:58 +0800
  • 04769bdf54 Merge remote branch 'origin/loongson3a' into x86 Xianyi Zhang 2011-03-20 21:57:09 +0800
  • 6f058487ab Detect Intel Core Clarkdale & Arrandale Xianyi Zhang 2011-03-20 21:56:40 +0800
  • f405b5bcc5 Fixed the bug about Loongson3A gsLQC1 & gsSQC1 instructions in daxpy kernel. Now daxpy is correct. Xianyi Zhang 2011-03-18 23:05:56 +0000
  • 2b8643e0de Merge branch 'loongson3a' of github.com:xianyi/OpenBLAS into loongson3a Xianyi Zhang 2011-03-18 01:20:15 +0000
  • c84f8be453 Supported detecting new kernel(2.6.36) & new Loongson3A03 CPU. Xianyi Zhang 2011-03-18 01:10:58 +0000
  • d5cffd506a Modified the default kernel makefile in MIPS64 arch. Wang Qian 2011-03-07 11:22:32 +0000
  • 5838f12995 Support unaligned addresses in daxpy with Loongson3A SIMD. Xianyi Zhang 2011-03-05 10:17:10 +0800
  • 5444a3f8f7 Unroll to 16 in daxpy on loongson3a. Xianyi Zhang 2011-03-04 17:50:17 +0800
  • 88cbfcc5b5 Merge commit 'origin/x86' into loongson3a Xianyi Zhang 2011-03-04 14:11:52 +0000
  • ce78abe37e Merge branch 'x86' of github.com:xianyi/OpenBLAS into x86 Xianyi Zhang 2011-03-04 11:53:04 +0800
  • 8f1090d32a Support NO_LAPACK=1 to build the lib without LAPACK functions. Xianyi Zhang 2011-03-04 11:51:32 +0800
  • 272f62a2b6 Changed movlps macro name in capital in x86/zdot_sse2.S file. Xianyi 2011-03-03 00:46:39 +0800
  • 36016fe349 On 32-bit x86, gcc 4.4.3 generated wrong code (movsd) from movlps in zdot_sse2.S line 191. This would cause zdotu & zdotc failures. Instead, use movlpd to work around it. Fixed #8. Fixed #9. Xianyi 2011-03-02 18:45:30 +0800
  • 44acb7503e Added zdotu with x & y offset=1 test case. Xianyi Zhang 2011-03-02 18:03:40 +0800
  • 6eb02bbb9c Merge remote branch 'origin/x86' into loongson3a Xianyi Zhang 2011-03-02 13:52:05 +0800
  • 0e782b9bd3 Updated the changelog. Xianyi Zhang 2011-03-02 13:40:55 +0800
  • 588737210d Fixed a random SEGFAULT when nodemask==NULL on Linux kernels above 2.6.34. Fixed #12. Thanks to Mr. Ei-ji Nakama for providing this patch. Xianyi Zhang 2011-03-02 13:38:32 +0800
  • cdf33edac3 Added Changelog. Fixed #11. Xianyi Zhang 2011-02-26 12:27:56 +0800
  • f7a5e049e2 Enable Debug flags in memory alloc and init functions. Xianyi Zhang 2011-02-26 11:51:39 +0800
  • 1b97ec1a7c Added DEBUG option in Makefile.rule. Fixed DEBUG typos. Xianyi Zhang 2011-02-26 11:19:54 +0800
  • 36b3a730d3 Merge branch 'x86' of github.com:xianyi/OpenBLAS into x86 Xianyi Zhang 2011-02-24 17:02:52 +0800
  • 128418f49b Fixed #10. Supported GOTO_NUM_THREADS & GOTO_THREADS_TIMEOUT environment variables. Xianyi Zhang 2011-02-24 15:16:21 +0800
  • 12214e1d0f Fixed #7. Modified the axpy kernel code to avoid loop unrolling when incx==0 or incy==0 on the 32-bit x86 arch. Xianyi 2011-02-23 20:08:34 +0800
  • cd2cbabecc Added unit test case (zdotu, N=1). Xianyi Zhang 2011-02-22 14:16:46 +0800
  • 854137e0fd Supported building debug version. Xianyi Zhang 2011-02-22 13:40:40 +0800
  • afbe3c9791 Improved the code quality of the unit tests. Thanks to José Luis García Pallero. Xianyi Zhang 2011-02-21 00:42:46 +0800
  • 0cfd29a819 Fixed #7. 1) Disabled multi-threading and 2) modified the kernel code to avoid loop unrolling in the axpy function when incx==0 or incy==0. Xianyi Zhang 2011-02-21 00:24:21 +0800
  • 109b86d00e Added axpy unit test with incx==0 and incy==0. Xianyi Zhang 2011-02-21 00:17:33 +0800
  • 78da0e0a0c Fixed #6. Disable multi-thread swap when incx==0 or incy==0. Xianyi Zhang 2011-02-20 17:14:38 +0800
  • 8dd3fd7f26 Added swap unit test with incx==0 and incy==0. Xianyi Zhang 2011-02-20 17:13:12 +0800
  • 51454082c6 Updated readme file. Xianyi Zhang 2011-02-19 00:18:17 +0800
  • e51364edb4 Fixed #5 Detected Intel Westmere (using Nehalem codes) in build and dynamic arch build. Thanks to Cao He from Dawning for providing the Intel Xeon 5660 testbed. Xianyi Zhang 2011-02-18 22:08:10 +0800
  • bfaa80c316 Fixed #4 csrot & drot returned the wrong result when incx==incy==0 on i686 arch. Xianyi 2011-02-18 03:00:58 +0800
  • bd7a74234f Disable quad and x precision objs in reference. Xianyi 2011-02-18 02:50:32 +0800
  • 029d5d16d0 Merge branch 'master' into loongson3a Xianyi Zhang 2011-02-17 00:39:09 +0800
  • c84315782c Merge branch 'x86' of github.com:xianyi/OpenBLAS into x86 Xianyi Zhang 2011-02-16 23:41:15 +0800
  • c5852d4e30 Fixed #4 csrot returned the wrong result when incx==incy==0. Xianyi Zhang 2011-02-16 23:39:43 +0800
  • c79696cc61 Added rot testcase when incx==incy==1. Xianyi Zhang 2011-02-16 23:32:13 +0800
  • 84ba64e65b Fixed a bug in drot when incx or incy equals zero. Xianyi Zhang 2011-02-16 00:18:45 +0800
  • e3e7547712 Merge branch 'master' into x86 Xianyi Zhang 2011-02-16 17:42:12 +0800
  • 1dd1bba66c Updated gitignore. Xianyi Zhang 2011-02-16 17:37:48 +0800
  • fbf95688d6 Added utest frame using CUnit(http://cunit.sourceforge.net/). Xianyi Zhang 2011-02-16 17:33:06 +0800
  • b8b27bec5c Fixed a bug in drot when incx or incy equals zero. Xianyi Zhang 2011-02-16 00:18:45 +0800
  • 1e671b49f3 Experimented with Loongson 3A 128-bit load & store instructions. Xianyi Zhang 2011-01-29 03:05:27 +0800
  • 77b7020d69 Changed prefetch order. Xianyi Zhang 2011-01-29 03:03:34 +0800
  • e003b811ab Load x & y contiguously in axpy. Xianyi Zhang 2011-01-28 11:18:50 +0800