Sebastien Fabbro 9f0fb6e662 Respect user's LDFLAGS 2013-07-25 14:08:37 -07:00
benchmark Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
ctest Respect user's LDFLAGS 2013-07-25 14:08:37 -07:00
driver Merge branch 'loongson3a' into develop 2013-07-20 22:33:17 +08:00
exports Respect user's LDFLAGS 2013-07-25 14:08:37 -07:00
interface Fix xianyi/OpenBLAS#256 2013-07-22 17:02:06 -07:00
kernel Fixed a computational error in zgemm_kernel_4x4_sandy.S file. 2013-07-18 20:23:21 +08:00
lapack Refs #191. A walk around for dtrtri_U single thread bug. 2013-07-14 22:16:30 +08:00
lapack-netlib Changed makefile for lapack. 2013-07-14 10:41:54 +08:00
reference Fixed a build bug with NO_LAPACK=1 and SANNITY_CHECK=1. 2011-05-03 14:42:11 +08:00
test Respect user's LDFLAGS 2013-07-25 14:08:37 -07:00
utest Adde the mising test_amax.c file. 2012-04-26 16:40:44 +08:00
.gitignore Refs #247. Included lapack source codes. Avoid downloading tar.gz from netlib.org 2013-07-09 18:13:48 +08:00
.travis.yml Updated travis. 2013-07-12 21:41:12 +08:00
CONTRIBUTORS.md Update CONTRIBUTORS.md 2013-07-20 23:32:23 +08:00
Changelog.txt Fixed #253. Update doc for v0.2.7 version. 2013-07-20 23:05:12 +08:00
GotoBLAS_00License.txt rename documents in GotoBLAS. 2011-01-24 15:57:23 +00:00
GotoBLAS_01Readme.txt rename documents in GotoBLAS. 2011-01-24 15:57:23 +00:00
GotoBLAS_02QuickInstall.txt rename documents in GotoBLAS. 2011-01-24 15:57:23 +00:00
GotoBLAS_03FAQ.txt Refs #85 #104. Use patch instead of git to apply this segfaults.patch. 2012-05-08 23:50:46 +08:00
GotoBLAS_04FAQ.txt rename documents in GotoBLAS. 2011-01-24 15:57:23 +00:00
GotoBLAS_05LargePage.txt rename documents in GotoBLAS. 2011-01-24 15:57:23 +00:00
GotoBLAS_06WeirdPerformance.txt rename documents in GotoBLAS. 2011-01-24 15:57:23 +00:00
LICENSE Fixed a typo in license file. 2012-03-27 14:17:13 +08:00
Makefile Modified Makefile to avoid redundant echo. 2013-07-16 22:44:27 +08:00
Makefile.alpha Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
Makefile.generic Respect user's LDFLAGS 2013-07-25 14:08:37 -07:00
Makefile.ia64 Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
Makefile.install Modified Makefile.install 2013-07-16 17:45:00 +08:00
Makefile.mips64 Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
Makefile.power Respect user's LDFLAGS 2013-07-25 14:08:37 -07:00
Makefile.prebuild Refs #187. Use perl to generate cblas_noconst.h instead of sed. 2013-01-22 00:29:54 +08:00
Makefile.rule Fixed #253. Update doc for v0.2.7 version. 2013-07-20 23:05:12 +08:00
Makefile.sparc Respect user's LDFLAGS 2013-07-25 14:08:37 -07:00
Makefile.system Merge branch 'loongson3a' into develop 2013-07-20 22:33:17 +08:00
Makefile.tail provide support for passing CFLAGS, FFLAGS, PFLAGS, FPFLAGS to make on the command line 2012-08-21 00:31:12 -04:00
Makefile.x86 Respect user's LDFLAGS 2013-07-25 14:08:37 -07:00
Makefile.x86_64 Respect user's LDFLAGS 2013-07-25 14:08:37 -07:00
README.md Merge branch 'loongson3a' into develop 2013-07-20 22:33:17 +08:00
TargetList.txt TargetList.txt: minor re-ordering 2013-03-17 23:03:05 +08:00
c_check Refs #248. Support LAPACK and LAPACKE with lsbcc. 2013-07-10 16:02:27 +08:00
cblas.h create openblas_get_parallel to retrieve information which 2013-07-11 21:39:19 +08:00
common.h Refs #214, #221, #246. Fixed the getrf overflow bug on Windows. 2013-07-11 03:20:02 +08:00
common_alpha.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
common_c.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
common_d.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
common_ia64.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
common_interface.h Fixed #141. make f77blas.h compatible with compilers which lack C99 complex number. 2012-10-08 12:48:20 +08:00
common_lapack.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
common_level1.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
common_level2.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
common_level3.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
common_linux.h Refs #248. Fixed the LSB compatiable issue for BLAS only. 2013-07-09 15:38:03 +08:00
common_macro.h Refs #47. On Loongson 3A, set DGEMM_R parameter depending on different number of threads. It would improve double precision BLAS3 on multi-threads. 2011-09-05 15:13:52 +00:00
common_mips64.h Fixed the SEGFAULT bug with Loongcc and Loongson3. 2013-04-11 15:33:43 +08:00
common_param.h refs #55. Added DTB_ENTRIES into dynamic arch setting parameters. Now, it can read DTB_ENTRIES on runtime. 2011-09-05 17:37:07 +08:00
common_power.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
common_q.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
common_reference.h Added the test case for samax. 2012-04-26 16:17:17 +08:00
common_s.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
common_sparc.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
common_thread.h Fixed noisy warning with Clang 2012-06-21 00:17:28 +02:00
common_x.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
common_x86.h Support AMD Piledriver by bulldozer kernels. 2013-07-06 12:06:43 -03:00
common_x86_64.h Support AMD Piledriver by bulldozer kernels. 2013-07-06 12:06:43 -03:00
common_z.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
cpuid.S Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
cpuid.h Support AMD Piledriver by bulldozer kernels. 2013-07-06 12:06:43 -03:00
cpuid_alpha.c refs #55. Added DTB_ENTRIES into dynamic arch setting parameters. Now, it can read DTB_ENTRIES on runtime. 2011-09-05 17:37:07 +08:00
cpuid_ia64.c refs #55. Added DTB_ENTRIES into dynamic arch setting parameters. Now, it can read DTB_ENTRIES on runtime. 2011-09-05 17:37:07 +08:00
cpuid_mips.c Fixed the detection bug on Loongson 3A server. 2012-09-21 10:14:07 +00:00
cpuid_power.c Refs #220. Support Power7 by old Power6 kernels. 2013-05-21 22:59:45 +08:00
cpuid_sparc.c refs #55. Added DTB_ENTRIES into dynamic arch setting parameters. Now, it can read DTB_ENTRIES on runtime. 2011-09-05 17:37:07 +08:00
cpuid_x86.c Support AMD Piledriver by bulldozer kernels. 2013-07-06 12:06:43 -03:00
ctest.c Refs #248. Support LAPACK and LAPACKE with lsbcc. 2013-07-10 16:02:27 +08:00
ctest1.c Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
ctest2.c Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
f_check Refs #255. Didn't use f77 compiler. 2013-07-22 11:34:43 +08:00
ftest.f Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
ftest2.f Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
getarch.c Fixed the typo in getarch.c 2013-07-09 16:26:59 +08:00
getarch_2nd.c Added UNROLL values for 3M to getarch_2nd.c, Makefile.system and Makefile.L3 2013-06-09 17:26:42 +02:00
l1param.h Added BULLDOZER target. So far it uses barcelona kernels. 2012-12-07 00:53:31 +08:00
l2param.h Support AMD Piledriver by bulldozer kernels. 2013-07-06 12:06:43 -03:00
make.inc Refs #176. Fixed make.inc overriding RANLIB bug when cross-compiling LAPACK. 2013-01-03 01:47:31 +08:00
openblas_config_template.h Fixed #217 openblas_config.h bug on Windows 64. 2013-07-01 00:35:14 +08:00
param.h Support AMD Piledriver by bulldozer kernels. 2013-07-06 12:06:43 -03:00
patch.for_lapack-3.1.1 Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
patch.for_lapack-3.4.0 Refs #88. Fixed the build bug about LAPACKE C Interface to LAPACKE. 2012-04-13 23:12:06 +08:00
patch.for_lapack-3.4.1 Refs #130 Prevent reading ipiv array beyond the bound in ?laswp. Use laswp instead of laswp_oncopy in getrf. 2012-08-09 20:06:51 +08:00
patch.for_lapack-3.4.2 Added the patch for lapacke example. 2012-11-13 00:53:26 +08:00
quickbuild.32bit Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
quickbuild.64bit Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
quickbuild.win32 Added the tip for Windows. 2012-08-09 20:37:55 +08:00
quickbuild.win64 Added the tip for Windows. 2012-08-09 20:37:55 +08:00
segfaults.patch Refs #85 #104. Use patch instead of git to apply this segfaults.patch. 2012-05-08 23:50:46 +08:00
symcopy.h Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
version.h changed library name to openblas and modified environment variable. 2011-01-24 17:58:05 +00:00

README.md

OpenBLAS

Introduction

OpenBLAS is an optimized BLAS library based on the GotoBLAS2 1.13 BSD version.

Please read the documentation on the OpenBLAS wiki: http://github.com/xianyi/OpenBLAS/wiki.

Binary Packages

We provide binary packages for the following platform:

  • Windows x86/x86_64

You can download them from the file hosting on sourceforge.net.

Installation from Source

Download the source from the project homepage: http://xianyi.github.com/OpenBLAS/

Or check out the code from git://github.com/xianyi/OpenBLAS.git

Normal compile

  • Type "make" to detect the CPU automatically, or
  • Type "make TARGET=xxx" to set the target CPU, e.g. "make TARGET=NEHALEM". The full target list is in the file TargetList.txt.

Cross compile

Set CC and FC to the C and Fortran compilers of the cross toolchain, set HOSTCC to your host C compiler, and set TARGET explicitly.

Examples:

On an x86 box, compile the library for the Loongson 3A CPU:

make BINARY=64 CC=mips64el-unknown-linux-gnu-gcc FC=mips64el-unknown-linux-gnu-gfortran HOSTCC=gcc TARGET=LOONGSON3A

On an x86 box, compile the library for the Loongson 3A CPU with the loongcc compiler (based on Open64):

make CC=loongcc FC=loongf95 HOSTCC=gcc TARGET=LOONGSON3A CROSS=1 CROSS_SUFFIX=mips64el-st-linux-gnu- NO_LAPACKE=1 NO_SHARED=1 BINARY=32

Debug version

make DEBUG=1

Install to a directory (optional)

Example:

make install PREFIX=your_installation_directory

The default installation directory is /opt/OpenBLAS.

Supported CPUs & OS

Please read GotoBLAS_01Readme.txt.

Additional supported CPUs:

x86/x86-64:

  • Intel Xeon 56xx (Westmere): Uses the GotoBLAS2 Nehalem code.
  • Intel Sandy Bridge: Optimized Level-3 BLAS with AVX on x86-64.
  • Intel Haswell: Optimized Level-3 BLAS with AVX on x86-64 (identical to Sandy Bridge).
  • AMD Bobcat: Uses the GotoBLAS2 Barcelona code.
  • AMD Bulldozer: x86-64 S/DGEMM AVX kernels. (Thanks to Werner Saar.)
  • AMD Piledriver: Uses the Bulldozer code.

MIPS64:

  • ICT Loongson 3A: Optimized Level-3 BLAS and part of Level-1 and Level-2 BLAS.
  • ICT Loongson 3B: Experimental.

Supported OS:

Usage

Link with libopenblas.a for the static library, or with -lopenblas for the shared library.

Set the number of threads with environment variables.

Examples:

export OPENBLAS_NUM_THREADS=4

or

export GOTO_NUM_THREADS=4

or

export OMP_NUM_THREADS=4

The order of precedence is OPENBLAS_NUM_THREADS > GOTO_NUM_THREADS > OMP_NUM_THREADS.

If you compile the library with USE_OPENMP=1, set the OMP_NUM_THREADS environment variable instead. With USE_OPENMP=1, OpenBLAS ignores OPENBLAS_NUM_THREADS and GOTO_NUM_THREADS.
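The precedence rules above can be sketched in a few lines of C. This is only an illustration of the lookup order, not the code OpenBLAS itself uses: the helper names `env_thread_count` and `resolve_num_threads` are hypothetical, and falling through to the next variable when a value is unset or not a positive integer is an assumption of this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parse a positive thread count from one environment
   variable. Returns 0 when the variable is unset, empty, or not a positive
   integer that fits in an int. */
static int env_thread_count(const char *name) {
    assert(name != NULL);
    const char *value = getenv(name);
    if (value == NULL || *value == '\0') {
        return 0;
    }
    char *end = NULL;
    long n = strtol(value, &end, 10);
    if (*end != '\0' || n <= 0 || n > INT_MAX) {
        return 0;
    }
    return (int)n;
}

/* Hypothetical helper: apply the documented precedence
   OPENBLAS_NUM_THREADS > GOTO_NUM_THREADS > OMP_NUM_THREADS.
   When use_openmp is nonzero (modelling a USE_OPENMP=1 build), only
   OMP_NUM_THREADS is consulted. Returns 0 if no variable gives a count. */
int resolve_num_threads(int use_openmp) {
    static const char *const names[] = {
        "OPENBLAS_NUM_THREADS", "GOTO_NUM_THREADS", "OMP_NUM_THREADS"
    };
    const size_t count = sizeof names / sizeof names[0];
    /* Skip the first two names for an OpenMP build. */
    size_t first = use_openmp ? count - 1 : 0;
    for (size_t i = first; i < count; ++i) {
        int n = env_thread_count(names[i]);
        if (n > 0) {
            return n;
        }
    }
    return 0;
}
```

With OPENBLAS_NUM_THREADS=4 and OMP_NUM_THREADS=2 both exported, `resolve_num_threads(0)` yields 4 and `resolve_num_threads(1)` yields 2, which mirrors the two cases described above.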

Set the number of threads at runtime.

The following functions control the number of threads at runtime:

void goto_set_num_threads(int num_threads);

void openblas_set_num_threads(int num_threads);

If you compile the library with USE_OPENMP=1, you should use these functions as well.

Report Bugs

Please open an issue at https://github.com/xianyi/OpenBLAS/issues

Contact

ChangeLog

Please see Changelog.txt for the differences from the GotoBLAS2 1.13 BSD version.

Troubleshooting

  • Please read the FAQ first.
  • Please use gcc version 4.6 or above to compile the Sandy Bridge AVX kernels on Linux/MinGW/BSD.
  • Please use Clang version 3.1 or above to compile the library on the Sandy Bridge microarchitecture. Clang 3.0 generates incorrect AVX binary code.
  • The number of CPUs/cores should be less than or equal to 256.
  • On Linux, OpenBLAS sets the processor affinity by default. This may cause a conflict with R parallel. To avoid it, build the library with NO_AFFINITY=1.
  • On Loongson 3A, make test may fail because of a pthread_create error. The error code is EAGAIN. However, the same test case passes when you run it directly from the shell.
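To check a machine against the 256 CPU/core limit mentioned in the list above, a small C sketch along the following lines may help. The function names are hypothetical, and `_SC_NPROCESSORS_CONF` is a widely available `sysconf` extension (Linux, BSD, macOS) rather than something this README documents, so treat it as an assumption about your platform.

```c
#include <assert.h>
#include <unistd.h>

/* Limit quoted in the troubleshooting list above. */
#define README_MAX_CPUS 256L

/* Hypothetical helper: number of configured processors as reported by the
   OS, or -1 if the query is not supported. */
long configured_cpu_count(void) {
    return sysconf(_SC_NPROCESSORS_CONF);
}

/* Hypothetical helper: returns 1 when `cpus` is a valid count that does not
   exceed `limit`, and 0 otherwise. */
int cpu_count_within_limit(long cpus, long limit) {
    assert(limit > 0);
    return cpus >= 1 && cpus <= limit;
}
```

A caller could pass `configured_cpu_count()` and `README_MAX_CPUS` to `cpu_count_within_limit` before building or running, and print a warning when the result is 0.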

Contributing

  1. Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug.
  2. Fork the OpenBLAS repository to start making your changes.
  3. Write a test which shows that the bug was fixed or that the feature works as expected.
  4. Send a pull request. Make sure to add yourself to CONTRIBUTORS.md.