Zhang Xianyi
|
bfaaa975e6
|
Added BULLDOZER target. So far it uses barcelona kernels.
|
2012-12-07 00:53:31 +08:00 |
Zhang Xianyi
|
b7c0fa6bd2
|
Init AMD Bulldozer codebase.
|
2012-12-06 07:29:54 -05:00 |
Zhang Xianyi
|
6751f7b9a7
|
Fixed #157. Only detect the number of physical CPU cores on Mac OSX.
|
2012-11-13 15:48:57 +08:00 |
Zhang Xianyi
|
538c764d2b
|
Refs #153. Restore the original CPU affinity when calling openblas_set_num_threads(1).
Please read the issue on github.com for details.
|
2012-11-06 18:21:46 +08:00 |
Zhang Xianyi
|
6c5899dff5
|
Don't use xgetbv instruction when NO_AVX=1
|
2012-10-09 14:52:35 +08:00 |
Zhang Xianyi
|
735ca38b8f
|
Refs #139. Check at runtime whether the OS supports AVX.
|
2012-09-18 15:46:20 +08:00 |
Zhang Xianyi
|
f76a384841
|
Refs #139. Added NO_AVX flag to use old Nehalem kernels on Sandy Bridge.
For example, make NO_AVX=1 or make DYNAMIC_ARCH=1 NO_AVX=1
|
2012-09-17 23:25:46 +08:00 |
Jameson Nash
|
d0e731e8b8
|
provide support for passing CFLAGS, FFLAGS, PFLAGS, FPFLAGS to make on the command line
|
2012-08-21 00:31:12 -04:00 |
Zhang Xianyi
|
fe4ab95cd5
|
Refs #136. Fixed a bug in controlling the number of threads on Windows.
|
2012-08-19 23:50:54 +08:00 |
Xianyi Zhang
|
801383effe
|
Fixed a hang when shutting down the BLAS thread server on Windows. Added support for dynamically changing the number of threads on Windows.
|
2012-08-14 18:34:32 +08:00 |
Zhang Xianyi
|
54cd65e47f
|
Use sandy bridge kernel when DYNAMIC_ARCH=1.
|
2012-08-13 15:25:08 +08:00 |
Zhang Xianyi
|
a55821a2ec
|
Refs #132. Kill the threads when unloading the library.
|
2012-08-11 21:33:15 +08:00 |
Zhang Xianyi
|
d007cca61d
|
Refs #134. Fixed the build bug on IBM Power.
|
2012-08-10 11:54:21 +08:00 |
Xianyi Zhang
|
25f1a573fd
|
Fixed the build bug when DYNAMIC_ARCH=0.
|
2012-07-07 12:12:24 +08:00 |
Sylvestre Ledru
|
3692b4d631
|
Improve the detection of sparc
|
2012-07-02 02:51:38 +02:00 |
Xianyi Zhang
|
a507b56ab1
|
Refs #119 #118. Fixed the bug in disabling hyper-threading.
|
2012-06-29 15:53:24 +08:00 |
Xianyi Zhang
|
853d16ed7e
|
Added a dummy openblas_set_num_threads function on Windows. We plan to implement this feature in the next version.
|
2012-06-23 13:07:38 +08:00 |
Zhang Xianyi
|
422359d09a
|
Export openblas_set_num_threads in shared library.
|
2012-06-23 11:32:43 +08:00 |
Zhang Xianyi
|
d3b67d0bd8
|
Refs #113. Fixed the typo BOBCATE -> BOBCAT
|
2012-05-31 22:40:15 +08:00 |
Zhang Xianyi
|
d6cab3f37e
|
Refs #113. Support AMD Bobcat using the Barcelona kernel code. Replace 3DNow! with MMX.
|
2012-05-31 18:17:45 +08:00 |
Zhang Xianyi
|
90d6ad569d
|
Merge branch 'sandybridge' into develop
Just copy the kernel codes from Nehalem. The optimization is ongoing.
|
2012-05-31 12:44:55 +08:00 |
Xianyi Zhang
|
a6adbb299d
|
Refs #112. Improved setting thread affinity on Linux. Removed the limit (64) on the number of CPU cores.
|
2012-05-29 15:23:52 +08:00 |
Xianyi Zhang
|
a53c6e2440
|
Merge branch 'develop' into sandybridge
|
2012-05-25 23:16:44 +08:00 |
Zaheer Chothia
|
a431042475
|
Fix inconsistent case for OS_* macros (Refs pull request #111)
|
2012-05-23 00:01:14 +02:00 |
Mike Nolta
|
4e29b6ffc0
|
FreeBSD: fix OS_FreeBSD -> OS_FREEBSD typos
|
2012-05-21 16:57:19 -04:00 |
Xianyi Zhang
|
19a48b82cf
|
Init Sandybridge codes based on Nehalem.
|
2012-03-30 20:01:03 +08:00 |
Xianyi Zhang
|
0b89a7a92d
|
Ref #82. Disable outputting debug information in alloc_mmap.
|
2012-03-23 18:17:12 +08:00 |
Wang Qian
|
8163ab7e55
|
Change the block size on Loongson 3B.
|
2011-11-23 18:41:49 +00:00 |
Xianyi Zhang
|
ef6f7f32ae
|
Fixed mbind bug on Loongson 3B. Check the return value of my_mbind function.
|
2011-11-23 17:17:41 +00:00 |
Xianyi Zhang
|
b95ad4cfaf
|
Support detecting ICT Loongson-3B CPU.
|
2011-11-09 19:29:50 +00:00 |
traz
|
831858b883
|
Modify the aligned addresses of sa and sb to improve multi-threaded performance.
|
2011-09-23 20:59:48 +00:00 |
Xianyi Zhang
|
16fc083322
|
Refs #47. Fixed the parameter-setting bug in the Loongson 3A single-thread version.
|
2011-09-08 16:39:34 +00:00 |
Xianyi Zhang
|
3c856c0c1a
|
Check the return value of pthread_create. Update the docs with known issue on Loongson 3A.
|
2011-09-06 18:27:33 +00:00 |
Xianyi Zhang
|
4727fe8abf
|
Refs #47. On Loongson 3A, set the DGEMM_R parameter depending on the number of threads. This should improve double-precision BLAS3 performance with multiple threads.
|
2011-09-05 15:13:52 +00:00 |
Xianyi Zhang
|
82f5274828
|
Refs #39. It's unnecessary to include sys/mman.h file in blas_server_omp.c.
|
2011-06-22 01:52:20 +08:00 |
Xianyi Zhang
|
1496383224
|
Print the wall time (cycles) when FUNCTION_PROFILE is enabled.
|
2011-06-09 10:40:15 +08:00 |
Xianyi Zhang
|
af40551c9f
|
Fixed the makefile bug affecting openblas_set_num_threads.
|
2011-05-27 21:15:30 +08:00 |
Xianyi Zhang
|
417b8ec792
|
Added openblas_set_num_threads for Fortran.
|
2011-05-06 17:03:35 +08:00 |
Xianyi Zhang
|
989c6f8b06
|
Fixed #14 the SEGFAULT bug on 64 cores. On SMP server, the number of CPUs or cores should be less than or equal to 64.
|
2011-04-07 14:48:10 +08:00 |
Xianyi Zhang
|
e4bb6f2482
|
Fixed the detection bug on Intel Core i5. Thanks to ggl329 for the patch.
|
2011-03-22 14:09:47 +08:00 |
Xianyi Zhang
|
f7a5e049e2
|
Enable Debug flags in memory alloc and init functions.
|
2011-02-26 11:51:39 +08:00 |
Xianyi Zhang
|
128418f49b
|
Fixed #10. Supported GOTO_NUM_THREADS & GOTO_THREADS_TIMEOUT environment variables.
|
2011-02-24 16:32:13 +08:00 |
Xianyi Zhang
|
e51364edb4
|
Fixed #5. Detect Intel Westmere (using Nehalem code) in the regular build and the dynamic arch build.
Thanks to Cao He from Dawning for providing the Intel Xeon 5660 testbed.
|
2011-02-19 00:03:50 +08:00 |
Xianyi Zhang
|
e6c13e2b3c
|
Changed the library name to openblas and modified the environment variable.
|
2011-01-24 17:58:05 +00:00 |
Xianyi Zhang
|
5c9f1ebbf9
|
Fixed a bug when compiling dynamic ARCH x86 in GotoBLAS2.
|
2011-01-24 16:04:17 +00:00 |
Xianyi Zhang
|
342bbc3871
|
Import the GotoBLAS2 1.13 BSD version code.
|
2011-01-24 14:54:24 +00:00 |