<!doctype html>
<html lang="en" class="no-js">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1">
<link rel="canonical" href="https://openblas.net/docs/user_manual/">
<link rel="prev" href="../install/">
<link rel="next" href="../extensions/">
<link rel="icon" href="../logo.svg">
<meta name="generator" content="mkdocs-1.6.0, mkdocs-material-9.5.25">
<title>User manual - OpenBLAS</title>
<link rel="stylesheet" href="../assets/stylesheets/main.6543a935.min.css">
<link rel="stylesheet" href="../assets/stylesheets/palette.06af60db.min.css">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto:300,300i,400,400i,700,700i%7CRoboto+Mono:400,400i,700,700i&display=fallback">
<style>:root{--md-text-font:"Roboto";--md-code-font:"Roboto Mono"}</style>
<script>__md_scope=new URL("..",location),__md_hash=e=>[...e].reduce((e,_)=>(e<<5)-e+_.charCodeAt(0),0),__md_get=(e,_=localStorage,t=__md_scope)=>JSON.parse(_.getItem(t.pathname+"."+e)),__md_set=(e,_,t=localStorage,a=__md_scope)=>{try{t.setItem(a.pathname+"."+e,JSON.stringify(_))}catch(e){}}</script>
</head>
<body dir="ltr" data-md-color-scheme="default" data-md-color-primary="grey" data-md-color-accent="indigo">
<input class="md-toggle" data-md-toggle="drawer" type="checkbox" id="__drawer" autocomplete="off">
<input class="md-toggle" data-md-toggle="search" type="checkbox" id="__search" autocomplete="off">
<label class="md-overlay" for="__drawer"></label>
<div data-md-component="skip">
<a href="#compile-the-library" class="md-skip">
Skip to content
</a>
</div>
<div data-md-component="announce">
</div>
<header class="md-header md-header--shadow" data-md-component="header">
<nav class="md-header__inner md-grid" aria-label="Header">
<a href=".." title="OpenBLAS" class="md-header__button md-logo" aria-label="OpenBLAS" data-md-component="logo">
<img src="../logo.svg" alt="logo">
</a>
<label class="md-header__button md-icon" for="__drawer">
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M3 6h18v2H3V6m0 5h18v2H3v-2m0 5h18v2H3v-2Z"/></svg>
</label>
<div class="md-header__title" data-md-component="header-title">
<div class="md-header__ellipsis">
<div class="md-header__topic">
<span class="md-ellipsis">
OpenBLAS
</span>
</div>
<div class="md-header__topic" data-md-component="header-topic">
<span class="md-ellipsis">
User manual
</span>
</div>
</div>
</div>
<label class="md-header__button md-icon" for="__search">
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M9.5 3A6.5 6.5 0 0 1 16 9.5c0 1.61-.59 3.09-1.56 4.23l.27.27h.79l5 5-1.5 1.5-5-5v-.79l-.27-.27A6.516 6.516 0 0 1 9.5 16 6.5 6.5 0 0 1 3 9.5 6.5 6.5 0 0 1 9.5 3m0 2C7 5 5 7 5 9.5S7 14 9.5 14 14 12 14 9.5 12 5 9.5 5Z"/></svg>
</label>
<div class="md-search" data-md-component="search" role="dialog">
<label class="md-search__overlay" for="__search"></label>
<div class="md-search__inner" role="search">
<form class="md-search__form" name="search">
<input type="text" class="md-search__input" name="query" aria-label="Search" placeholder="Search" autocapitalize="off" autocorrect="off" autocomplete="off" spellcheck="false" data-md-component="search-query" required>
<label class="md-search__icon md-icon" for="__search">
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M9.5 3A6.5 6.5 0 0 1 16 9.5c0 1.61-.59 3.09-1.56 4.23l.27.27h.79l5 5-1.5 1.5-5-5v-.79l-.27-.27A6.516 6.516 0 0 1 9.5 16 6.5 6.5 0 0 1 3 9.5 6.5 6.5 0 0 1 9.5 3m0 2C7 5 5 7 5 9.5S7 14 9.5 14 14 12 14 9.5 12 5 9.5 5Z"/></svg>
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M20 11v2H8l5.5 5.5-1.42 1.42L4.16 12l7.92-7.92L13.5 5.5 8 11h12Z"/></svg>
</label>
<nav class="md-search__options" aria-label="Search">
<button type="reset" class="md-search__icon md-icon" title="Clear" aria-label="Clear" tabindex="-1">
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M19 6.41 17.59 5 12 10.59 6.41 5 5 6.41 10.59 12 5 17.59 6.41 19 12 13.41 17.59 19 19 17.59 13.41 12 19 6.41Z"/></svg>
</button>
</nav>
</form>
<div class="md-search__output">
<div class="md-search__scrollwrap" data-md-scrollfix>
<div class="md-search-result" data-md-component="search-result">
<div class="md-search-result__meta">
Initializing search
</div>
<ol class="md-search-result__list" role="presentation"></ol>
</div>
</div>
</div>
</div>
</div>
</nav>
</header>
<div class="md-container" data-md-component="container">
<main class="md-main" data-md-component="main">
<div class="md-main__inner md-grid">
<div class="md-sidebar md-sidebar--primary" data-md-component="sidebar" data-md-type="navigation" >
<div class="md-sidebar__scrollwrap">
<div class="md-sidebar__inner">
<nav class="md-nav md-nav--primary" aria-label="Navigation" data-md-level="0">
<label class="md-nav__title" for="__drawer">
<a href=".." title="OpenBLAS" class="md-nav__button md-logo" aria-label="OpenBLAS" data-md-component="logo">
<img src="../logo.svg" alt="logo">
</a>
OpenBLAS
</label>
<ul class="md-nav__list" data-md-scrollfix>
<li class="md-nav__item">
<a href=".." class="md-nav__link">
<span class="md-ellipsis">
Home
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../install/" class="md-nav__link">
<span class="md-ellipsis">
Install OpenBLAS
</span>
</a>
</li>
<li class="md-nav__item md-nav__item--active">
<input class="md-nav__toggle md-toggle" type="checkbox" id="__toc">
<label class="md-nav__link md-nav__link--active" for="__toc">
<span class="md-ellipsis">
User manual
</span>
<span class="md-nav__icon md-icon"></span>
</label>
<a href="./" class="md-nav__link md-nav__link--active">
<span class="md-ellipsis">
User manual
</span>
</a>
<nav class="md-nav md-nav--secondary" aria-label="Table of contents">
<label class="md-nav__title" for="__toc">
<span class="md-nav__icon md-icon"></span>
Table of contents
</label>
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
<li class="md-nav__item">
<a href="#compile-the-library" class="md-nav__link">
<span class="md-ellipsis">
Compile the library
</span>
</a>
<nav class="md-nav" aria-label="Compile the library">
<ul class="md-nav__list">
<li class="md-nav__item">
<a href="#normal-compile" class="md-nav__link">
<span class="md-ellipsis">
Normal compile
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#cross-compile" class="md-nav__link">
<span class="md-ellipsis">
Cross compile
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#debug-version" class="md-nav__link">
<span class="md-ellipsis">
Debug version
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#install-to-the-directory-optional" class="md-nav__link">
<span class="md-ellipsis">
Install to the directory (optional)
</span>
</a>
</li>
</ul>
</nav>
</li>
<li class="md-nav__item">
<a href="#link-the-library" class="md-nav__link">
<span class="md-ellipsis">
Link the library
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#code-examples" class="md-nav__link">
<span class="md-ellipsis">
Code examples
</span>
</a>
<nav class="md-nav" aria-label="Code examples">
<ul class="md-nav__list">
<li class="md-nav__item">
<a href="#call-cblas-interface" class="md-nav__link">
<span class="md-ellipsis">
Call CBLAS interface
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#call-blas-fortran-interface" class="md-nav__link">
<span class="md-ellipsis">
Call BLAS Fortran interface
</span>
</a>
</li>
</ul>
</nav>
</li>
<li class="md-nav__item">
<a href="#troubleshooting" class="md-nav__link">
<span class="md-ellipsis">
Troubleshooting
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#blas-reference-manual" class="md-nav__link">
<span class="md-ellipsis">
BLAS reference manual
</span>
</a>
</li>
</ul>
</nav>
</li>
<li class="md-nav__item">
<a href="../extensions/" class="md-nav__link">
<span class="md-ellipsis">
Extensions
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../developers/" class="md-nav__link">
<span class="md-ellipsis">
Developer manual
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../build_system/" class="md-nav__link">
<span class="md-ellipsis">
Build system
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../distributing/" class="md-nav__link">
<span class="md-ellipsis">
Redistributing OpenBLAS
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../ci/" class="md-nav__link">
<span class="md-ellipsis">
CI jobs
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../about/" class="md-nav__link">
<span class="md-ellipsis">
About
</span>
</a>
</li>
<li class="md-nav__item">
<a href="../faq/" class="md-nav__link">
<span class="md-ellipsis">
FAQ
</span>
</a>
</li>
</ul>
</nav>
</div>
</div>
</div>
<div class="md-sidebar md-sidebar--secondary" data-md-component="sidebar" data-md-type="toc" >
<div class="md-sidebar__scrollwrap">
<div class="md-sidebar__inner">
<nav class="md-nav md-nav--secondary" aria-label="Table of contents">
<label class="md-nav__title" for="__toc">
<span class="md-nav__icon md-icon"></span>
Table of contents
</label>
<ul class="md-nav__list" data-md-component="toc" data-md-scrollfix>
<li class="md-nav__item">
<a href="#compile-the-library" class="md-nav__link">
<span class="md-ellipsis">
Compile the library
</span>
</a>
<nav class="md-nav" aria-label="Compile the library">
<ul class="md-nav__list">
<li class="md-nav__item">
<a href="#normal-compile" class="md-nav__link">
<span class="md-ellipsis">
Normal compile
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#cross-compile" class="md-nav__link">
<span class="md-ellipsis">
Cross compile
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#debug-version" class="md-nav__link">
<span class="md-ellipsis">
Debug version
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#install-to-the-directory-optional" class="md-nav__link">
<span class="md-ellipsis">
Install to the directory (optional)
</span>
</a>
</li>
</ul>
</nav>
</li>
<li class="md-nav__item">
<a href="#link-the-library" class="md-nav__link">
<span class="md-ellipsis">
Link the library
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#code-examples" class="md-nav__link">
<span class="md-ellipsis">
Code examples
</span>
</a>
<nav class="md-nav" aria-label="Code examples">
<ul class="md-nav__list">
<li class="md-nav__item">
<a href="#call-cblas-interface" class="md-nav__link">
<span class="md-ellipsis">
Call CBLAS interface
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#call-blas-fortran-interface" class="md-nav__link">
<span class="md-ellipsis">
Call BLAS Fortran interface
</span>
</a>
</li>
</ul>
</nav>
</li>
<li class="md-nav__item">
<a href="#troubleshooting" class="md-nav__link">
<span class="md-ellipsis">
Troubleshooting
</span>
</a>
</li>
<li class="md-nav__item">
<a href="#blas-reference-manual" class="md-nav__link">
<span class="md-ellipsis">
BLAS reference manual
</span>
</a>
</li>
</ul>
</nav>
</div>
</div>
</div>
<div class="md-content" data-md-component="content">
<article class="md-content__inner md-typeset">
<h1>User manual</h1>
<h2 id="compile-the-library">Compile the library</h2>
<h3 id="normal-compile">Normal compile</h3>
<ul>
<li>Type <code>make</code> to detect the CPU automatically, or</li>
<li>type <code>make TARGET=xxx</code> to set the target CPU explicitly, e.g. <code>make TARGET=NEHALEM</code>. The full list of targets is in the file <code>TargetList.txt</code>.</li>
</ul>
<h3 id="cross-compile">Cross compile</h3>
<p>Set <code>CC</code> and <code>FC</code> to the compilers of the cross toolchain, set <code>HOSTCC</code> to your host C compiler, and finally set <code>TARGET</code> explicitly.</p>
<p>Examples:</p>
<ul>
<li>On an x86 box, compile the library for ARM Cortex-A9 Linux.</li>
</ul>
<p>Install only the gnueabihf versions of the cross toolchain. For details, see https://github.com/xianyi/OpenBLAS/issues/936#issuecomment-237596847</p>
<pre><code>make CC=arm-linux-gnueabihf-gcc FC=arm-linux-gnueabihf-gfortran HOSTCC=gcc TARGET=CORTEXA9
</code></pre>
<ul>
<li>On an x86 box, compile the library for a Loongson 3A CPU.</li>
</ul>
<div class="highlight"><pre><span></span><code>make BINARY=64 CC=mips64el-unknown-linux-gnu-gcc FC=mips64el-unknown-linux-gnu-gfortran HOSTCC=gcc TARGET=LOONGSON3A
</code></pre></div>
<ul>
<li>On an x86 box, compile the library for a Loongson 3A CPU with the loongcc (Open64-based) compiler.</li>
</ul>
<div class="highlight"><pre><span></span><code>make CC=loongcc FC=loongf95 HOSTCC=gcc TARGET=LOONGSON3A CROSS=1 CROSS_SUFFIX=mips64el-st-linux-gnu- NO_LAPACKE=1 NO_SHARED=1 BINARY=32
</code></pre></div>
<h3 id="debug-version">Debug version</h3>
<pre><code>make DEBUG=1
</code></pre>
<h3 id="install-to-the-directory-optional">Install to the directory (optional)</h3>
<p>Example:</p>
<pre><code>make install PREFIX=your_installation_directory
</code></pre>
<p>The default directory is <code>/opt/OpenBLAS</code>. Note that any flags passed to <code>make</code> during the build should also be passed to <code>make install</code> to avoid install errors, e.g. some headers not being copied over correctly.</p>
<p>For more information, see the <a href="../install/">Installation Guide</a>.</p>
<h2 id="link-the-library">Link the library</h2>
<ul>
<li>Link shared library</li>
</ul>
<div class="highlight"><pre><span></span><code>gcc -o test test.c -I/your_path/OpenBLAS/include/ -L/your_path/OpenBLAS/lib -Wl,-rpath,/your_path/OpenBLAS/lib -lopenblas
</code></pre></div>
<p>The <code>-Wl,-rpath,/your_path/OpenBLAS/lib</code> linker option can be omitted if you added <code>/your_path/OpenBLAS/lib</code> to <code>/etc/ld.so.conf</code> or to a file in <code>/etc/ld.so.conf.d</code> and ran <code>ldconfig</code> to update the linker cache, or if you installed OpenBLAS in a location that is part of the <code>ld.so</code> default search path (usually <code>/lib</code>, <code>/usr/lib</code> and <code>/usr/local/lib</code>). Alternatively, you can set the environment variable <code>LD_LIBRARY_PATH</code> to point to the folder that contains <code>libopenblas.so</code>. Otherwise, loading the library at runtime will fail with a message like <code>cannot open shared object file: No such file or directory</code>.</p>
<p>If the library is multithreaded, add <code>-lpthread</code>. If the library contains LAPACK functions, add <code>-lgfortran</code> or other Fortran libraries. However, if you only call LAPACKE routines, i.e. your code has <code>#include "lapacke.h"</code> and calls functions like <code>LAPACKE_dgeqrf</code>, <code>-lgfortran</code> is not needed.</p>
<ul>
<li>Link static library</li>
</ul>
<div class="highlight"><pre><span></span><code>gcc -o test test.c /your/path/libopenblas.a
</code></pre></div>
<p>You can download <code>test.c</code> from https://gist.github.com/xianyi/5780018 </p>
<h2 id="code-examples">Code examples</h2>
<h3 id="call-cblas-interface">Call CBLAS interface</h3>
<p>This example shows how to call <code>cblas_dgemm</code> from C. https://gist.github.com/xianyi/6930656</p>
<div class="highlight"><pre><span></span><code><span class="cp">#include</span><span class="w"> </span><span class="cpf">&lt;cblas.h&gt;</span>
<span class="cp">#include</span><span class="w"> </span><span class="cpf">&lt;stdio.h&gt;</span>
<span class="kt">int</span><span class="w"> </span><span class="nf">main</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
<span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span>
<span class="w"> </span><span class="kt">double</span><span class="w"> </span><span class="n">A</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="mf">1.0</span><span class="p">,</span><span class="mf">2.0</span><span class="p">,</span><span class="mf">1.0</span><span class="p">,</span><span class="mf">-3.0</span><span class="p">,</span><span class="mf">4.0</span><span class="p">,</span><span class="mf">-1.0</span><span class="p">};</span>
<span class="w"> </span><span class="kt">double</span><span class="w"> </span><span class="n">B</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="mf">1.0</span><span class="p">,</span><span class="mf">2.0</span><span class="p">,</span><span class="mf">1.0</span><span class="p">,</span><span class="mf">-3.0</span><span class="p">,</span><span class="mf">4.0</span><span class="p">,</span><span class="mf">-1.0</span><span class="p">};</span>
<span class="w"> </span><span class="kt">double</span><span class="w"> </span><span class="n">C</span><span class="p">[</span><span class="mi">9</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="mf">.5</span><span class="p">,</span><span class="mf">.5</span><span class="p">,</span><span class="mf">.5</span><span class="p">,</span><span class="mf">.5</span><span class="p">,</span><span class="mf">.5</span><span class="p">,</span><span class="mf">.5</span><span class="p">,</span><span class="mf">.5</span><span class="p">,</span><span class="mf">.5</span><span class="p">,</span><span class="mf">.5</span><span class="p">};</span>
<span class="w"> </span><span class="c1">// C = 1 * A * B^T + 2 * C, column-major; A and B are 3x2, C is 3x3</span>
<span class="w"> </span><span class="n">cblas_dgemm</span><span class="p">(</span><span class="n">CblasColMajor</span><span class="p">,</span><span class="w"> </span><span class="n">CblasNoTrans</span><span class="p">,</span><span class="w"> </span><span class="n">CblasTrans</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="n">A</span><span class="p">,</span><span class="w"> </span><span class="mi">3</span><span class="p">,</span><span class="w"> </span><span class="n">B</span><span class="p">,</span><span class="w"> </span><span class="mi">3</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="n">C</span><span class="p">,</span><span class="mi">3</span><span class="p">);</span>
<span class="w"> </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="o">&lt;</span><span class="mi">9</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="o">++</span><span class="p">)</span>
<span class="w">  </span><span class="n">printf</span><span class="p">(</span><span class="s">&quot;%lf &quot;</span><span class="p">,</span><span class="w"> </span><span class="n">C</span><span class="p">[</span><span class="n">i</span><span class="p">]);</span>
<span class="w"> </span><span class="n">printf</span><span class="p">(</span><span class="s">&quot;</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">);</span>
<span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>gcc -o test_cblas_open test_cblas_dgemm.c -I /your_path/OpenBLAS/include/ -L/your_path/OpenBLAS/lib -lopenblas -lpthread -lgfortran
</code></pre></div>
<h3 id="call-blas-fortran-interface">Call BLAS Fortran interface</h3>
<p>This example shows how to call the <code>dgemm</code> Fortran interface from C. https://gist.github.com/xianyi/5780018</p>
<div class="highlight"><pre><span></span><code><span class="cp">#include</span><span class="w"> </span><span class="cpf">&quot;stdio.h&quot;</span>
<span class="cp">#include</span><span class="w"> </span><span class="cpf">&quot;stdlib.h&quot;</span>
<span class="cp">#include</span><span class="w"> </span><span class="cpf">&quot;sys/time.h&quot;</span>
<span class="cp">#include</span><span class="w"> </span><span class="cpf">&quot;time.h&quot;</span>
<span class="k">extern</span><span class="w"> </span><span class="kt">void</span><span class="w"> </span><span class="nf">dgemm_</span><span class="p">(</span><span class="kt">char</span><span class="o">*</span><span class="p">,</span><span class="w"> </span><span class="kt">char</span><span class="o">*</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="o">*</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="o">*</span><span class="p">,</span><span class="kt">int</span><span class="o">*</span><span class="p">,</span><span class="w"> </span><span class="kt">double</span><span class="o">*</span><span class="p">,</span><span class="w"> </span><span class="kt">double</span><span class="o">*</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="o">*</span><span class="p">,</span><span class="w"> </span><span class="kt">double</span><span class="o">*</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="o">*</span><span class="p">,</span><span class="w"> </span><span class="kt">double</span><span class="o">*</span><span class="p">,</span><span class="w"> </span><span class="kt">double</span><span class="o">*</span><span class="p">,</span><span class="w"> </span><span class="kt">int</span><span class="o">*</span><span class="p">);</span>
<span class="kt">int</span><span class="w"> </span><span class="nf">main</span><span class="p">(</span><span class="kt">int</span><span class="w"> </span><span class="n">argc</span><span class="p">,</span><span class="w"> </span><span class="kt">char</span><span class="o">*</span><span class="w"> </span><span class="n">argv</span><span class="p">[])</span>
<span class="p">{</span>
<span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">i</span><span class="p">;</span>
<span class="w"> </span><span class="n">printf</span><span class="p">(</span><span class="s">&quot;test!</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">);</span>
<span class="w"> </span><span class="k">if</span><span class="p">(</span><span class="n">argc</span><span class="o">&lt;</span><span class="mi">4</span><span class="p">){</span>
<span class="w"> </span><span class="n">printf</span><span class="p">(</span><span class="s">&quot;Input Error</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">);</span>
<span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="mi">1</span><span class="p">;</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">m</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">atoi</span><span class="p">(</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">]);</span>
<span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">atoi</span><span class="p">(</span><span class="n">argv</span><span class="p">[</span><span class="mi">2</span><span class="p">]);</span>
<span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">k</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">atoi</span><span class="p">(</span><span class="n">argv</span><span class="p">[</span><span class="mi">3</span><span class="p">]);</span>
<span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">sizeofa</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">m</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">k</span><span class="p">;</span>
<span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">sizeofb</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">k</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">n</span><span class="p">;</span>
<span class="w"> </span><span class="kt">int</span><span class="w"> </span><span class="n">sizeofc</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">m</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">n</span><span class="p">;</span>
<span class="w"> </span><span class="kt">char</span><span class="w"> </span><span class="n">ta</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="sc">&#39;N&#39;</span><span class="p">;</span>
<span class="w"> </span><span class="kt">char</span><span class="w"> </span><span class="n">tb</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="sc">&#39;N&#39;</span><span class="p">;</span>
<span class="w"> </span><span class="kt">double</span><span class="w"> </span><span class="n">alpha</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mf">1.2</span><span class="p">;</span>
<span class="w"> </span><span class="kt">double</span><span class="w"> </span><span class="n">beta</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mf">0.001</span><span class="p">;</span>
<span class="w"> </span><span class="k">struct</span><span class="w"> </span><span class="nc">timeval</span><span class="w"> </span><span class="n">start</span><span class="p">,</span><span class="n">finish</span><span class="p">;</span>
<span class="w"> </span><span class="kt">double</span><span class="w"> </span><span class="n">duration</span><span class="p">;</span>
<span class="w"> </span><span class="kt">double</span><span class="o">*</span><span class="w"> </span><span class="n">A</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="kt">double</span><span class="o">*</span><span class="p">)</span><span class="n">malloc</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="kt">double</span><span class="p">)</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">sizeofa</span><span class="p">);</span>
<span class="w"> </span><span class="kt">double</span><span class="o">*</span><span class="w"> </span><span class="n">B</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="kt">double</span><span class="o">*</span><span class="p">)</span><span class="n">malloc</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="kt">double</span><span class="p">)</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">sizeofb</span><span class="p">);</span>
<span class="w"> </span><span class="kt">double</span><span class="o">*</span><span class="w"> </span><span class="n">C</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">(</span><span class="kt">double</span><span class="o">*</span><span class="p">)</span><span class="n">malloc</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="kt">double</span><span class="p">)</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">sizeofc</span><span class="p">);</span>
<span class="w"> </span><span class="n">srand</span><span class="p">((</span><span class="kt">unsigned</span><span class="p">)</span><span class="n">time</span><span class="p">(</span><span class="nb">NULL</span><span class="p">));</span>
<span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="o">&lt;</span><span class="n">sizeofa</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="o">++</span><span class="p">)</span>
<span class="w"> </span><span class="n">A</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">i</span><span class="o">%</span><span class="mi">3</span><span class="o">+</span><span class="mi">1</span><span class="p">;</span><span class="c1">//(rand()%100)/10.0;</span>
<span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="o">&lt;</span><span class="n">sizeofb</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="o">++</span><span class="p">)</span>
<span class="w"> </span><span class="n">B</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">i</span><span class="o">%</span><span class="mi">3</span><span class="o">+</span><span class="mi">1</span><span class="p">;</span><span class="c1">//(rand()%100)/10.0;</span>
<span class="w"> </span><span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="o">&lt;</span><span class="n">sizeofc</span><span class="p">;</span><span class="w"> </span><span class="n">i</span><span class="o">++</span><span class="p">)</span>
<span class="w"> </span><span class="n">C</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">i</span><span class="o">%</span><span class="mi">3</span><span class="o">+</span><span class="mi">1</span><span class="p">;</span><span class="c1">//(rand()%100)/10.0;</span>
<span class="w"> </span><span class="c1">//#if 0</span>
<span class="w"> </span><span class="n">printf</span><span class="p">(</span><span class="s">&quot;m=%d,n=%d,k=%d,alpha=%lf,beta=%lf,sizeofc=%d</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">,</span><span class="n">m</span><span class="p">,</span><span class="n">n</span><span class="p">,</span><span class="n">k</span><span class="p">,</span><span class="n">alpha</span><span class="p">,</span><span class="n">beta</span><span class="p">,</span><span class="n">sizeofc</span><span class="p">);</span>
<span class="w"> </span><span class="n">gettimeofday</span><span class="p">(</span><span class="o">&amp;</span><span class="n">start</span><span class="p">,</span><span class="w"> </span><span class="nb">NULL</span><span class="p">);</span>
<span class="w"> </span><span class="n">dgemm_</span><span class="p">(</span><span class="o">&amp;</span><span class="n">ta</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">tb</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">m</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">n</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">k</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">alpha</span><span class="p">,</span><span class="w"> </span><span class="n">A</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">m</span><span class="p">,</span><span class="w"> </span><span class="n">B</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">k</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">beta</span><span class="p">,</span><span class="w"> </span><span class="n">C</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">m</span><span class="p">);</span>
<span class="w"> </span><span class="n">gettimeofday</span><span class="p">(</span><span class="o">&amp;</span><span class="n">finish</span><span class="p">,</span><span class="w"> </span><span class="nb">NULL</span><span class="p">);</span>
<span class="w"> </span><span class="n">duration</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">((</span><span class="kt">double</span><span class="p">)(</span><span class="n">finish</span><span class="p">.</span><span class="n">tv_sec</span><span class="o">-</span><span class="n">start</span><span class="p">.</span><span class="n">tv_sec</span><span class="p">)</span><span class="o">*</span><span class="mi">1000000</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="p">(</span><span class="kt">double</span><span class="p">)(</span><span class="n">finish</span><span class="p">.</span><span class="n">tv_usec</span><span class="o">-</span><span class="n">start</span><span class="p">.</span><span class="n">tv_usec</span><span class="p">))</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="mi">1000000</span><span class="p">;</span>
<span class="w"> </span><span class="cm">/* A GEMM performs 2*m*n*k floating-point operations; scale the rate to MFLOPS. */</span>
<span class="w"> </span><span class="kt">double</span><span class="w"> </span><span class="n">mflops</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="mf">2.0</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">m</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">k</span><span class="p">;</span>
<span class="w"> </span><span class="n">mflops</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">mflops</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">duration</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="mf">1.0e-6</span><span class="p">;</span>
<span class="w"> </span><span class="kt">FILE</span><span class="w"> </span><span class="o">*</span><span class="n">fp</span><span class="p">;</span>
<span class="w"> </span><span class="n">fp</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">fopen</span><span class="p">(</span><span class="s">&quot;timeDGEMM.txt&quot;</span><span class="p">,</span><span class="w"> </span><span class="s">&quot;a&quot;</span><span class="p">);</span>
<span class="w"> </span><span class="cm">/* Skip logging if the results file cannot be opened. */</span>
<span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="n">fp</span><span class="w"> </span><span class="o">!=</span><span class="w"> </span><span class="nb">NULL</span><span class="p">)</span><span class="w"> </span><span class="p">{</span>
<span class="w">  </span><span class="n">fprintf</span><span class="p">(</span><span class="n">fp</span><span class="p">,</span><span class="w"> </span><span class="s">&quot;%dx%dx%d</span><span class="se">\t</span><span class="s">%lf s</span><span class="se">\t</span><span class="s">%lf MFLOPS</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">,</span><span class="w"> </span><span class="n">m</span><span class="p">,</span><span class="w"> </span><span class="n">n</span><span class="p">,</span><span class="w"> </span><span class="n">k</span><span class="p">,</span><span class="w"> </span><span class="n">duration</span><span class="p">,</span><span class="w"> </span><span class="n">mflops</span><span class="p">);</span>
<span class="w">  </span><span class="n">fclose</span><span class="p">(</span><span class="n">fp</span><span class="p">);</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="n">free</span><span class="p">(</span><span class="n">A</span><span class="p">);</span>
<span class="w"> </span><span class="n">free</span><span class="p">(</span><span class="n">B</span><span class="p">);</span>
<span class="w"> </span><span class="n">free</span><span class="p">(</span><span class="n">C</span><span class="p">);</span>
<span class="w"> </span><span class="k">return</span><span class="w"> </span><span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div>
<div class="highlight"><pre><span></span><code>gcc -o time_dgemm time_dgemm.c /your/path/libopenblas.a -lpthread
./time_dgemm &lt;m&gt; &lt;n&gt; &lt;k&gt;
</code></pre></div>
<h2 id="troubleshooting">Troubleshooting</h2>
<ul>
<li>Please read the <a href="../faq/">FAQ</a> first.</li>
<li>Use GCC 4.6 or later to compile the Sandy Bridge AVX kernels on Linux/MinGW/BSD.</li>
<li>Use Clang 3.1 or later to compile the library on the Sandy Bridge microarchitecture. Clang 3.0 generates incorrect AVX binary code.</li>
<li>The number of CPUs/cores should be less than or equal to 256. On Linux x86_64 (amd64), there is experimental support for up to 1024 CPUs/cores and 128 NUMA nodes if you build the library with BIGNUMA=1.</li>
<li>OpenBLAS does not set processor affinity by default. On Linux, you can enable processor affinity by commenting out the line NO_AFFINITY=1 in Makefile.rule. However, this may cause <a href="https://stat.ethz.ch/pipermail/r-sig-hpc/2012-April/001348.html">a conflict with R parallel</a>.</li>
<li>On Loongson 3A, make test may fail because of a pthread_create error (error code EAGAIN). However, the same test case passes when you run it directly from the shell.</li>
</ul>
<h2 id="blas-reference-manual">BLAS reference manual</h2>
<p>For a description of every BLAS function and its definition, see the <a href="https://software.intel.com/en-us/intel-mkl/documentation">Intel MKL reference manual</a> or <a href="http://netlib.org/blas/">netlib.org</a>.</p>
<p>OpenBLAS also provides its own <a href="../extensions/">extension functions</a>.</p>
</article>
</div>
<script>var target=document.getElementById(location.hash.slice(1));target&&target.name&&(target.checked=target.name.startsWith("__tabbed_"))</script>
</div>
</main>
</div>
<div class="md-dialog" data-md-component="dialog">
<div class="md-dialog__inner md-typeset"></div>
</div>
<script id="__config" type="application/json">{"base": "..", "features": [], "search": "../assets/javascripts/workers/search.b8dbb3d2.min.js", "translations": {"clipboard.copied": "Copied to clipboard", "clipboard.copy": "Copy to clipboard", "search.result.more.one": "1 more on this page", "search.result.more.other": "# more on this page", "search.result.none": "No matching documents", "search.result.one": "1 matching document", "search.result.other": "# matching documents", "search.result.placeholder": "Type to start searching", "search.result.term.missing": "Missing", "select.version": "Select version"}}</script>
<script src="../assets/javascripts/bundle.081f42fc.min.js"></script>
</body>
</html>