
OpenBLAS

Join the chat at https://gitter.im/xianyi/OpenBLAS

Introduction

OpenBLAS is an optimized BLAS (Basic Linear Algebra Subprograms) library based on the BSD-licensed version 1.13 of GotoBLAS2.

Please read the documentation on the OpenBLAS wiki pages: https://github.com/xianyi/OpenBLAS/wiki.

For a general introduction to the BLAS routines, please refer to the extensive documentation of their reference implementation hosted at netlib: https://www.netlib.org/blas. On that site you will also find documentation for the reference implementation of the higher-level library LAPACK, the Linear Algebra Package that comes included with OpenBLAS. If you are looking for a general primer or refresher on linear algebra, the set of six 20-minute lecture videos by Prof. Gilbert Strang may be helpful. They are available on MIT OpenCourseWare, https://ocw.mit.edu/resources/res-18-010-a-2020-vision-of-linear-algebra-spring-2020/, and on YouTube, https://www.youtube.com/playlist?list=PLUl4u3cNGP61iQEFiWLE21EJCxwmWvvek.

Binary Packages

We provide official binary packages for the following platform:

  • Windows x86/x86_64

You can download them from file hosting on sourceforge.net or from the Releases section of the github project page, https://github.com/xianyi/OpenBLAS/releases.

Installation from Source

Download the source from the project homepage, https://xianyi.github.com/OpenBLAS/, or check out the code using Git from https://github.com/xianyi/OpenBLAS.git. (If you want the most up-to-date version, be sure to use the develop branch; master is several years out of date due to a change of maintainership.) Build-time parameters can be set in Makefile.rule, which contains a short description of each option. Most can also be given directly on the make or cmake command line.
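
As a minimal sketch of the Git route, the shell function below clones the repository and selects the develop branch in one step. The function name fetch_openblas and the default directory name are illustrative assumptions, not part of OpenBLAS.

```shell
# Hypothetical helper (not part of OpenBLAS): clone the develop branch.
# $1 is the destination directory; it defaults to ./OpenBLAS.
fetch_openblas() {
    git clone --branch develop https://github.com/xianyi/OpenBLAS.git "${1:-OpenBLAS}"
}
```

Calling fetch_openblas without an argument clones into OpenBLAS; pass a directory name to clone elsewhere.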

Dependencies

Building OpenBLAS requires the following to be installed:

  • GNU Make
  • A C compiler, e.g. GCC or Clang
  • A Fortran compiler (optional, for LAPACK)
  • IBM MASS (optional, see below)

Normal compile

Simply invoking make (or gmake on BSD) will detect the CPU automatically. To set a specific target CPU, use make TARGET=xxx, e.g. make TARGET=NEHALEM. The full target list is in the file TargetList.txt. Other build options are documented in Makefile.rule and can either be set there (typically by removing the comment character from the respective line) or given on the make command line.

Note that when you run make install after building, you need to repeat all command line options you provided to make in the build step. Some settings, such as the maximum number of supported threads, are otherwise derived automatically from the build host, which might not be what you want.

For building with cmake, the usual conventions apply: create a build directory, either underneath the top-level OpenBLAS source directory or separate from it, and invoke cmake there with the path to the source tree and any build options you plan to set.
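
The cmake convention can be sketched as a small shell function. This is an illustration only: the function name cmake_build_openblas and the choice of a build directory directly under the source tree are assumptions, and CMAKE_BUILD_TYPE in the usage note is a standard CMake variable rather than an OpenBLAS-specific option.

```shell
# Hypothetical helper (not part of OpenBLAS): out-of-tree CMake build.
# $1 = path to the OpenBLAS source tree; remaining arguments go to cmake.
cmake_build_openblas() {
    src_dir=$1
    shift
    mkdir -p "$src_dir/build" || return 1
    # Run in a subshell so the caller's working directory is unchanged.
    (
        cd "$src_dir/build" &&
        # Configure: ".." is the source tree, "$@" carries the build options.
        cmake .. "$@" &&
        # Compile inside the build directory.
        cmake --build .
    )
}
```

For example, cmake_build_openblas ./OpenBLAS -DCMAKE_BUILD_TYPE=Release configures and builds in ./OpenBLAS/build.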

Cross compile

Set CC and FC to point to the cross toolchains, and set HOSTCC to your host C compiler. The target must be specified explicitly when cross compiling.

Examples:

  • On an x86 box, compile this library for a Loongson 3A CPU:

    make BINARY=64 CC=mips64el-unknown-linux-gnu-gcc FC=mips64el-unknown-linux-gnu-gfortran HOSTCC=gcc TARGET=LOONGSON3A
    

    or, with the newer MIPS cross compiler released by Loongson, which defaults to the 32-bit ABI:

    make HOSTCC=gcc CC='/opt/mips-loongson-gcc7.3-linux-gnu/2019.06-29/bin/mips-linux-gnu-gcc -mabi=64' FC='/opt/mips-loongson-gcc7.3-linux-gnu/2019.06-29/bin/mips-linux-gnu-gfortran -mabi=64' TARGET=LOONGSON3A
    
  • On an x86 box, compile this library for a Loongson 3A CPU with the loongcc (Open64-based) compiler:

    make CC=loongcc FC=loongf95 HOSTCC=gcc TARGET=LOONGSON3A CROSS=1 CROSS_SUFFIX=mips64el-st-linux-gnu-   NO_LAPACKE=1 NO_SHARED=1 BINARY=32
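
Because the target CPU is not detected automatically when cross compiling, a small guard can catch a forgotten TARGET before make starts. The sketch below is an assumption for illustration; cross_make is a hypothetical name and not part of OpenBLAS.

```shell
# Hypothetical guard (not part of OpenBLAS): refuse to cross compile
# unless TARGET=... was given, then pass all arguments on to make.
cross_make() {
    for arg in "$@"; do
        case $arg in
            # A non-empty TARGET was supplied: run make and return its status.
            TARGET=?*) make "$@"; return ;;
        esac
    done
    echo "cross_make: TARGET=<cpu> is required when cross compiling" >&2
    return 1
}
```

With this in place, cross_make CC=mips64el-unknown-linux-gnu-gcc HOSTCC=gcc fails immediately, while adding TARGET=LOONGSON3A runs make with the same arguments.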
    

Debug version

A debug version can be built using make DEBUG=1.

Compile with MASS support on Power CPU (optional)

The IBM MASS library consists of a set of mathematical functions for C, C++, and Fortran applications that are tuned for optimum performance on POWER architectures. OpenBLAS with MASS requires a 64-bit, little-endian OS on POWER. The library can be installed as shown:

  • On Ubuntu:

    wget -q http://public.dhe.ibm.com/software/server/POWER/Linux/xl-compiler/eval/ppc64le/ubuntu/public.gpg -O- | sudo apt-key add -
    echo "deb http://public.dhe.ibm.com/software/server/POWER/Linux/xl-compiler/eval/ppc64le/ubuntu/ trusty main" | sudo tee /etc/apt/sources.list.d/ibm-xl-compiler-eval.list
    sudo apt-get update
    sudo apt-get install libxlmass-devel.8.1.5
    
  • On RHEL/CentOS:

    wget http://public.dhe.ibm.com/software/server/POWER/Linux/xl-compiler/eval/ppc64le/rhel7/repodata/repomd.xml.key
    sudo rpm --import repomd.xml.key
    wget http://public.dhe.ibm.com/software/server/POWER/Linux/xl-compiler/eval/ppc64le/rhel7/ibm-xl-compiler-eval.repo
    sudo cp ibm-xl-compiler-eval.repo /etc/yum.repos.d/
    sudo yum install libxlmass-devel.8.1.5
    

After installing the MASS library, compile OpenBLAS with USE_MASS=1. For example, to compile on Power8 with MASS support: make USE_MASS=1 TARGET=POWER8.

Install to a specific directory (optional)

Use PREFIX= when invoking make, for example:

make install PREFIX=your_installation_directory

Repeat all options you added on the make command line in the preceding build step. The default installation directory is /opt/OpenBLAS.
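Because the install step needs the same options as the build step, it can help to keep them in one variable. A minimal sketch, assuming the illustrative options `TARGET=NEHALEM NUM_THREADS=64` and a hypothetical prefix under `$HOME`; the commands are only echoed, so the snippet is safe to try anywhere:

```sh
#!/bin/sh
# Options shared by the build and install steps (illustrative values).
OPENBLAS_OPTS="TARGET=NEHALEM NUM_THREADS=64"
# Hypothetical installation directory; the default would be /opt/OpenBLAS.
PREFIX_DIR="$HOME/openblas"

build_cmd="make $OPENBLAS_OPTS"
install_cmd="make install $OPENBLAS_OPTS PREFIX=$PREFIX_DIR"

# Print the two invocations; run them from the OpenBLAS source directory.
echo "$build_cmd"
echo "$install_cmd"
```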

Supported CPUs and Operating Systems

Please read GotoBLAS_01Readme.txt for older CPU models already supported by the 2010 GotoBLAS.

Additional supported CPUs

x86/x86-64

  • Intel Xeon 56xx (Westmere): Uses GotoBLAS2 Nehalem codes.
  • Intel Sandy Bridge: Optimized Level-3 and Level-2 BLAS with AVX on x86-64.
  • Intel Haswell: Optimized Level-3 and Level-2 BLAS with AVX2 and FMA on x86-64.
  • Intel Skylake-X: Optimized Level-3 and Level-2 BLAS with AVX512 and FMA on x86-64.
  • Intel Cooper Lake: as Skylake-X with improved BFLOAT16 support.
  • AMD Bobcat: Uses GotoBLAS2 Barcelona codes.
  • AMD Bulldozer: x86-64 ?GEMM FMA4 kernels. (Thanks to Werner Saar)
  • AMD Piledriver: Uses Bulldozer codes with some optimizations.
  • AMD Steamroller: Uses Bulldozer codes with some optimizations.
  • AMD Zen: Uses Haswell codes with some optimizations for Zen 2/3 (use SkylakeX for Zen 4).

MIPS32

  • MIPS 1004K: uses P5600 codes
  • MIPS 24K: uses P5600 codes

MIPS64

  • ICT Loongson 3A: Optimized Level-3 BLAS and parts of Level-1,2.
  • ICT Loongson 3B: Experimental

ARM

  • ARMv6: Optimized BLAS for vfpv2 and vfpv3-d16 (e.g. BCM2835, Cortex M0+)
  • ARMv7: Optimized BLAS for vfpv3-d32 (e.g. Cortex A8, A9 and A15)

ARM64

  • ARMv8: Basic ARMV8 with small caches, optimized Level-3 and Level-2 BLAS
  • Cortex-A53: same as ARMV8 (different cpu specifications)
  • Cortex-A55: same as ARMV8 (different cpu specifications)
  • Cortex-A57: Optimized Level-3 and Level-2 functions
  • Cortex-A72: same as A57 (different cpu specifications)
  • Cortex-A73: same as A57 (different cpu specifications)
  • Falkor: same as A57 (different cpu specifications)
  • ThunderX: Optimized some Level-1 functions
  • ThunderX2T99: Optimized Level-3 BLAS and parts of Levels 1 and 2
  • ThunderX3T110
  • TSV110: Optimized some Level-3 helper functions
  • EMAG 8180: preliminary support based on A57
  • Neoverse N1: (AWS Graviton2) preliminary support
  • Neoverse V1: (AWS Graviton3) optimized Level-3 BLAS
  • Apple Vortex: preliminary support based on ThunderX2/3
  • A64FX: preliminary support, optimized Level-3 BLAS
  • ARMV8SVE: any ARMV8 cpu with SVE extensions

PPC/PPC64

  • POWER8: Optimized BLAS, only for PPC64LE (Little Endian), only with USE_OPENMP=1

  • POWER9: Optimized Level-3 BLAS (real) and some Level-1,2. PPC64LE with OpenMP only.

  • POWER10: Optimized Level-3 BLAS including SBGEMM and some Level-1,2.

  • AIX: Dynamic architecture with OpenXL and OpenMP.

    make CC=ibm-clang_r FC=xlf TARGET=POWER7 BINARY=64 USE_OPENMP=1 INTERFACE64=1 DYNAMIC_ARCH=1 USE_THREAD=1
    

IBM zEnterprise System

  • Z13: Optimized Level-3 BLAS and Level-1,2
  • Z14: Optimized Level-3 BLAS and (single precision) Level-1,2

RISC-V

  • C910V: Optimized Level-3 BLAS (real) and Level-1,2 using the RISC-V Vector extension 0.7.1.

    make HOSTCC=gcc TARGET=C910V CC=riscv64-unknown-linux-gnu-gcc FC=riscv64-unknown-linux-gnu-gfortran
    

    (also known to work on C906 as long as you use only single-precision functions - its instruction set support appears to be incomplete in double precision)

  • x280: Level-3 BLAS and Level-1,2 are optimized using the RISC-V Vector extension 1.0.

    make HOSTCC=gcc TARGET=x280 NUM_THREADS=8 CC=riscv64-unknown-linux-gnu-clang FC=riscv64-unknown-linux-gnu-gfortran
    
  • ZVL???B: Level-3 BLAS and Level-1,2, including vectorised kernels, for generic RISC-V cores with vector support and vector registers of at least the corresponding width. ZVL128B and ZVL256B are available, e.g.:

    make TARGET=RISCV64_ZVL256B CFLAGS="-DTARGET=RISCV64_ZVL256B" \
      BINARY=64 ARCH=riscv64 CC='clang -target riscv64-unknown-linux-gnu' \
      AR=riscv64-unknown-linux-gnu-ar AS=riscv64-unknown-linux-gnu-gcc \
      LD=riscv64-unknown-linux-gnu-gcc FC=riscv64-unknown-linux-gnu-gfortran \
      HOSTCC=gcc HOSTFC=gfortran -j


### Support for multiple targets in a single library

OpenBLAS can be built for multiple targets with runtime detection of the target cpu by specifying `DYNAMIC_ARCH=1` in Makefile.rule, on the gmake command line or as `-DDYNAMIC_ARCH=TRUE` in cmake.

For **x86_64**, the list of targets this activates contains Prescott, Core2, Nehalem, Barcelona, Sandybridge, Bulldozer, Piledriver, Steamroller, Excavator, Haswell, Zen, SkylakeX, Cooper Lake, Sapphire Rapids. For cpu generations not included in this list, the corresponding older model is used. If you also specify `DYNAMIC_OLDER=1`, specific support for Penryn, Dunnington, Opteron, Opteron/SSE3, Bobcat, Atom and Nano is added. Finally, the option `DYNAMIC_LIST` lets you specify an individual list of targets to include instead of the default.

`DYNAMIC_ARCH` is also supported on **x86**, where it translates to Katmai, Coppermine, Northwood, Prescott, Banias,
Core2, Penryn, Dunnington, Nehalem, Athlon, Opteron, Opteron_SSE3, Barcelona, Bobcat, Atom and Nano.

On **ARMV8**, it enables support for CortexA53, CortexA57, CortexA72, CortexA73, Falkor, ThunderX, ThunderX2T99, TSV110 as well as generic ARMV8 cpus. If compiler support for SVE is available at build time, support for NeoverseN2, NeoverseV1 as well as generic ArmV8SVE targets is also enabled.

For **POWER**, the list encompasses POWER6, POWER8 and POWER9. POWER10 is additionally available if a sufficiently recent compiler is used for the build.

On **ZARCH**, it comprises Z13 and Z14 as well as generic zarch support.

The `TARGET` option can be used in conjunction with `DYNAMIC_ARCH=1` to specify which cpu model should be assumed for all the
common code in the library; usually you will want to set this to the oldest model you expect to encounter.
Please note that it is not possible to combine support for different architectures, so no combined 32 and 64 bit or x86_64 and arm64 in the same library.
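
As an illustration of combining `DYNAMIC_ARCH=1` with a baseline `TARGET`, the sketch below composes the corresponding make invocation. The helper name `dynamic_arch_cmd` is hypothetical, and the baseline targets are arbitrary examples; the helper only prints the command line rather than running it.

```sh
#!/bin/sh
# Compose a DYNAMIC_ARCH make invocation for a given baseline TARGET.
# Any further arguments are passed through unchanged (e.g. DYNAMIC_OLDER=1).
dynamic_arch_cmd() {
    baseline=$1
    shift
    # sed strips the trailing space left when no extra options are given.
    echo "make DYNAMIC_ARCH=1 TARGET=$baseline $*" | sed 's/ *$//'
}

# Baseline set to the oldest cpu model expected at runtime.
dynamic_arch_cmd NEHALEM
# The same, with the additional older x86_64 kernels enabled.
dynamic_arch_cmd CORE2 DYNAMIC_OLDER=1
```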

### Supported OS

- **GNU/Linux**
- **MinGW or Visual Studio (CMake)/Windows**: Please read <https://github.com/xianyi/OpenBLAS/wiki/How-to-use-OpenBLAS-in-Microsoft-Visual-Studio>.
- **Darwin/macOS/OSX/iOS**: Experimental. Although GotoBLAS2 already supports Darwin, we are not OSX/iOS experts.
- **FreeBSD**: Supported by the community. We don't actively test the library on this OS.
- **OpenBSD**: Supported by the community. We don't actively test the library on this OS.
- **NetBSD**: Supported by the community. We don't actively test the library on this OS.
- **DragonFly BSD**: Supported by the community. We don't actively test the library on this OS.
- **Android**: Supported by the community. Please read <https://github.com/xianyi/OpenBLAS/wiki/How-to-build-OpenBLAS-for-Android>.
- **AIX**: Supported on PPC up to POWER10
- **Haiku**: Supported by the community. We don't actively test the library on this OS.
- **SunOS**: Supported by the community. We don't actively test the library on this OS.
- **Cortex-M**: Supported by the community. Please read <https://github.com/xianyi/OpenBLAS/wiki/How-to-use-OpenBLAS-on-Cortex-M>.

## Usage

Statically link with `libopenblas.a` or dynamically link with `-lopenblas` if OpenBLAS was
compiled as a shared library.

### Setting the number of threads using environment variables

Environment variables are used to specify a maximum number of threads.
For example,

```sh
export OPENBLAS_NUM_THREADS=4
export GOTO_NUM_THREADS=4
export OMP_NUM_THREADS=4
```

The priorities are OPENBLAS_NUM_THREADS > GOTO_NUM_THREADS > OMP_NUM_THREADS.

If you compile this library with USE_OPENMP=1, you should set the OMP_NUM_THREADS environment variable; OpenBLAS ignores OPENBLAS_NUM_THREADS and GOTO_NUM_THREADS when compiled with USE_OPENMP=1.
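
The priority order for a build without OpenMP can be mimicked with nested shell defaults. This is only an illustration of the documented ordering, not the code OpenBLAS uses; the function name `effective_threads` is hypothetical.

```sh
#!/bin/sh
# Mimic the documented priority:
# OPENBLAS_NUM_THREADS > GOTO_NUM_THREADS > OMP_NUM_THREADS.
# Prints the value that would take effect, or an empty line if none is set.
effective_threads() {
    echo "${OPENBLAS_NUM_THREADS:-${GOTO_NUM_THREADS:-${OMP_NUM_THREADS:-}}}"
}

unset OPENBLAS_NUM_THREADS GOTO_NUM_THREADS
export OMP_NUM_THREADS=8
effective_threads        # prints 8

export GOTO_NUM_THREADS=4
effective_threads        # prints 4

export OPENBLAS_NUM_THREADS=2
effective_threads        # prints 2
```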

Setting the number of threads at runtime

We provide the following functions to control the number of threads at runtime:

void goto_set_num_threads(int num_threads);
void openblas_set_num_threads(int num_threads);

Note that these are only used once at library initialization, and are not available for fine-tuning thread numbers in individual BLAS calls. If you compile this library with USE_OPENMP=1, you should use the above functions too.

Reporting bugs

Please submit an issue in https://github.com/xianyi/OpenBLAS/issues.

Contact

Change log

Please see Changelog.txt to view the differences between OpenBLAS and GotoBLAS2 1.13 BSD version.

Troubleshooting

  • Please read the FAQ first.
  • Please use GCC version 4.6 and above to compile Sandy Bridge AVX kernels on Linux/MinGW/BSD.
  • Please use Clang version 3.1 and above to compile the library on Sandy Bridge microarchitecture. Clang 3.0 will generate the wrong AVX binary code.
  • Please use GCC version 6 or LLVM version 6 and above to compile Skylake AVX512 kernels.
  • The number of CPUs/cores should be less than or equal to 256. On Linux x86_64 (amd64), there is experimental support for up to 1024 CPUs/cores and 128 numa nodes if you build the library with BIGNUMA=1.
  • OpenBLAS does not set processor affinity by default. On Linux, you can enable processor affinity by commenting out the line NO_AFFINITY=1 in Makefile.rule. However, note that this may cause a conflict with R parallel.
  • On Loongson 3A, make test may fail with a pthread_create error (EAGAIN). However, it will be okay when you run the same test case on the shell.
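
The CPU count limit mentioned above can be checked before building. A minimal sketch, assuming a system that provides `getconf _NPROCESSORS_ONLN`; the helper name `extra_opts_for_cpus` is hypothetical. It only applies the documented threshold of 256; it does not check that the host is Linux x86_64, where BIGNUMA=1 is supported.

```sh
#!/bin/sh
# Suggest extra make options for a given CPU count, following the
# documented limit of 256 CPUs/cores for a default build.
extra_opts_for_cpus() {
    if [ "$1" -gt 256 ]; then
        echo "BIGNUMA=1"
    fi
}

# Number of online CPUs; falls back to 1 if getconf is unavailable.
cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "make $(extra_opts_for_cpus "$cpus")"
```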

Contributing

  1. Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug.
  2. Fork the OpenBLAS repository to start making your changes.
  3. Write a test which shows that the bug was fixed or that the feature works as expected.
  4. Send a pull request. Make sure to add yourself to CONTRIBUTORS.md.

Donation

Please read this wiki page.
