Commit Graph

46 Commits

Author SHA1 Message Date
Zhang Xianyi bfaaa975e6 Added BULLDOZER target. So far it uses barcelona kernels. 2012-12-07 00:53:31 +08:00
Zhang Xianyi b7c0fa6bd2 Init AMD Bulldozer codebase. 2012-12-06 07:29:54 -05:00
Zhang Xianyi 6751f7b9a7 Fixed #157. Only detect the number of physical CPU cores on Mac OSX. 2012-11-13 15:48:57 +08:00
Zhang Xianyi 538c764d2b Refs #153. Restore the original CPU affinity when calling openblas_set_num_threads(1).
Please read the issue on github.com for the detail.
2012-11-06 18:21:46 +08:00
Zhang Xianyi 6c5899dff5 Don't use xgetbv instruction when NO_AVX=1 2012-10-09 14:52:35 +08:00
Zhang Xianyi 735ca38b8f Refs #139. Check OS supporting AVX on runtime. 2012-09-18 15:46:20 +08:00
Zhang Xianyi f76a384841 Refs #139. Added NO_AVX flag to use old Nehalem kernels on Sandy Bridge.
For example, make NO_AVX=1 or make DYNAMIC_ARCH=1 NO_AVX=1
2012-09-17 23:25:46 +08:00
Jameson Nash d0e731e8b8 provide support for passing CFLAGS, FFLAGS, PFLAGS, FPFLAGS to make on the command line 2012-08-21 00:31:12 -04:00
Zhang Xianyi fe4ab95cd5 Refs #136. Fixed a bug about controlling the number of threads on Windows. 2012-08-19 23:50:54 +08:00
Xianyi Zhang 801383effe Fixed a hang bug when shutdown blas threads server on Windows. Added the feature about dynamic changing the number of threads on Windows. 2012-08-14 18:34:32 +08:00
Zhang Xianyi 54cd65e47f Use sandy bridge kernel when DYNAMIC_ARCH=1. 2012-08-13 15:25:08 +08:00
Zhang Xianyi a55821a2ec Refs #132. Kill the threads when unload the library. 2012-08-11 21:33:15 +08:00
Zhang Xianyi d007cca61d Refs #134. Fixed the building bug on IBM Power. 2012-08-10 11:54:21 +08:00
Xianyi Zhang 25f1a573fd Fixed the build bug when DYNAMIC_ARCH=0. 2012-07-07 12:12:24 +08:00
Sylvestre Ledru 3692b4d631 Improve the detection of sparc 2012-07-02 02:51:38 +02:00
Xianyi Zhang a507b56ab1 Refs #119 #118. Fixed disabling hyper threading bug. 2012-06-29 15:53:24 +08:00
Xianyi Zhang 853d16ed7e Added openblas_set_num_threads dummy function on Windows. We plan to implement this feature in next version. 2012-06-23 13:07:38 +08:00
Zhang Xianyi 422359d09a Export openblas_set_num_threads in shared library. 2012-06-23 11:32:43 +08:00
Zhang Xianyi d3b67d0bd8 Refs #113. Fixed the typo BOBCATE -> BOBCAT 2012-05-31 22:40:15 +08:00
Zhang Xianyi d6cab3f37e Refs #113. Support AMD Bobcate using Barcelona kernel codes. Replace 3DNow! with MMX. 2012-05-31 18:17:45 +08:00
Zhang Xianyi 90d6ad569d Merge branch 'sandybridge' into develop
Just copy the kernel codes from Nehalem. The optimization is ongoing.
2012-05-31 12:44:55 +08:00
Xianyi Zhang a6adbb299d Refs #112. Improved setting thread affinity in Linux. Remove the limit (64) about the number of CPU cores. 2012-05-29 15:23:52 +08:00
Xianyi Zhang a53c6e2440 Merge branch 'develop' into sandybridge 2012-05-25 23:16:44 +08:00
Zaheer Chothia a431042475 Fix inconsistent case for OS_* macros (Refs pull request #111) 2012-05-23 00:01:14 +02:00
Mike Nolta 4e29b6ffc0 FreeBSD: fix OS_FreeBSD -> OS_FREEBSD typos 2012-05-21 16:57:19 -04:00
Xianyi Zhang 19a48b82cf Init Sandybridge codes based on Nehalem. 2012-03-30 20:01:03 +08:00
Xianyi Zhang 0b89a7a92d Ref #82. Disable outputing debug information in alloc_mmap. 2012-03-23 18:17:12 +08:00
Wang Qian 8163ab7e55 Change the block size on Loongson 3B. 2011-11-23 18:41:49 +00:00
Xianyi Zhang ef6f7f32ae Fixed mbind bug on Loongson 3B. Check the return value of my_mbind function. 2011-11-23 17:17:41 +00:00
Xianyi Zhang b95ad4cfaf Support detecting ICT Loongson-3B CPU. 2011-11-09 19:29:50 +00:00
traz 831858b883 Modify aligned address of sa and sb to improve the performance of multi-threads. 2011-09-23 20:59:48 +00:00
Xianyi Zhang 16fc083322 Refs #47. Fixed the seting parameter bug on Loongson 3A single thread version. 2011-09-08 16:39:34 +00:00
Xianyi Zhang 3c856c0c1a Check the return value of pthread_create. Update the docs with known issue on Loongson 3A. 2011-09-06 18:27:33 +00:00
Xianyi Zhang 4727fe8abf Refs #47. On Loongson 3A, set DGEMM_R parameter depending on different number of threads. It would improve double precision BLAS3 on multi-threads. 2011-09-05 15:13:52 +00:00
Xianyi Zhang 82f5274828 Refs #39. It's unnecessary to include sys/mman.h file in blas_server_omp.c. 2011-06-22 01:52:20 +08:00
Xianyi Zhang 1496383224 Print the wall time (cycles) with enabling FUNCTION_PROFILE. 2011-06-09 10:40:15 +08:00
Xianyi Zhang af40551c9f Fixed the makefile bug about openblas_set_num_threads. 2011-05-27 21:15:30 +08:00
Xianyi Zhang 417b8ec792 Added openblas_set_num_threads for Fortran. 2011-05-06 17:03:35 +08:00
Xianyi Zhang 989c6f8b06 Fixed #14 the SEGFAULT bug on 64 cores. On SMP server, the number of CPUs or cores should be less than or equal to 64. 2011-04-07 14:48:10 +08:00
Xianyi Zhang e4bb6f2482 Fixed the detecting bug on Intel Core i5. Thank ggl329 for the patch. 2011-03-22 14:09:47 +08:00
Xianyi Zhang f7a5e049e2 Enable Debug flags in memory alloc and init functions. 2011-02-26 11:51:39 +08:00
Xianyi Zhang 128418f49b Fixed #10. Supported GOTO_NUM_THREADS & GOTO_THREADS_TIMEOUT environment variables. 2011-02-24 16:32:13 +08:00
Xianyi Zhang e51364edb4 Fixed #5 Detected Intel Westmere (using Nehalem codes) in build and dynamic arch build.
Thanks Cao He from Dawning supporting Intel Xeon 5660 testbed.
2011-02-19 00:03:50 +08:00
Xianyi Zhang e6c13e2b3c changed library name to openblas and modified environment variable. 2011-01-24 17:58:05 +00:00
Xianyi Zhang 5c9f1ebbf9 Fixed a bug when compiling dynamic ARCH x86 in GotoBLAS2. 2011-01-24 16:04:17 +00:00
Xianyi Zhang 342bbc3871 Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00