Commit Graph

77 Commits

Author SHA1 Message Date
Martin Kroeker a36eb19ae0 Update conditional for C11 atomics to use HAVE_C11 2020-07-18 17:13:24 +00:00
Rajalakshmi Srinivasaraghavan 9fe930f205 powerpc: Add support for future processor
This is the initial patch to support build infrastructure
for POWER10 architecture.
2020-06-11 15:47:20 -05:00
Rajalakshmi Srinivasaraghavan 67cc4b9e16 Fix warnings in clang and export symbol 2020-04-15 19:15:23 -05:00
Rajalakshmi Srinivasaraghavan 7eb55504b1 RFC : Add half precision gemm for bfloat16 in OpenBLAS
This patch adds support for bfloat16 data type matrix multiplication kernel.
For architectures that don't support bfloat16, it is defined as unsigned short
(2 bytes).  Default unroll sizes can be changed as per architecture as done for
SGEMM and for now 8 and 4 are used for M and N.  Size of ncopy/tcopy can be
changed as per architecture requirement and for now, size 2 is used.

Added shgemm in kernel/power/KERNEL.POWER9 and tested in powerpc64le and
powerpc64.  For reference, added a small test compare_sgemm_shgemm.c to compare
sgemm and shgemm output.

This patch does not cover OpenBLAS test, benchmark and lapack tests for shgemm.
Complex type implementation can be discussed and added once this is approved.
2020-04-14 14:55:08 -05:00
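The fallback storage type mentioned in the commit above can be illustrated with a minimal sketch. It assumes a 32-bit IEEE 754 `float` and converts by plain truncation. The names `demo_bfloat16`, `demo_float_to_bf16` and `demo_bf16_to_float` are hypothetical and are not taken from the patch, which may convert differently.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: no native bfloat16 type, so 2 bytes of storage stand in for it. */
typedef unsigned short demo_bfloat16;

/* Keep the sign bit, the 8 exponent bits and the top 7 mantissa bits. */
static demo_bfloat16 demo_float_to_bf16(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);   /* well-defined type punning */
    return (demo_bfloat16)(bits >> 16);   /* truncate, no rounding */
}

/* Widen back to float by zero-filling the dropped mantissa bits. */
static float demo_bf16_to_float(demo_bfloat16 half)
{
    uint32_t bits = (uint32_t)half << 16;
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Values with at most 7 significant mantissa bits, such as 1.0f or -2.0f, survive the round trip unchanged; other values lose their low-order mantissa bits.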
Martin Kroeker 79fd006c58 Expose the support_avx512 function provided in dynamic.c 2020-03-26 21:25:39 +01:00
Martin Kroeker d2cb610272 Add option USE_LOCKING for single-threaded build with locking support
for calling from concurrent threads
2019-05-15 23:18:43 +02:00
Jeff Baylor 40e53e52d6 snprintf define consolidated to common.h 2019-04-22 17:01:34 -07:00
Martin Kroeker 7c51cc8527 Merge branch 'develop' into develop 2019-03-29 19:36:29 +01:00
AbdelRauf 853a18bc17 power9 makefile. dgemm based on power8 kernel with following changes : 32x unrolled 16x4 kernel and 8x4 kernel using (lxv stxv butterfly rank1 update). improvement from 17 to 22-23gflops. dtrmm cases were added into dgemm itself 2019-03-29 15:49:40 +00:00
Erik M. Bray 1006ff8a7b Use POSIX getenv on Cygwin
The Windows-native GetEnvironmentVariable cannot be relied on, as
Cygwin does not always copy environment variables set through Cygwin
to the Windows environment block, particularly after fork().
2019-03-15 15:06:30 +01:00
Andrew 9531d0e175 lets fit it in one 4k page 2018-11-06 17:51:24 +00:00
Andrew 3fd41313fc add low bound for number of buffers 2018-11-06 09:40:13 +00:00
Steven G. Johnson 48610a4524 fix blasabs for windows
Bugfix in #1713 for Windows (LLP64), where `blasabs` needs to be `llabs` rather than `labs` for the 64-bit API.
2018-08-05 08:18:51 -04:00
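The selection described in the blasabs commits can be sketched as follows. This is an illustration only: `DEMO_INTERFACE64`, `demo_blasint`, `demo_blasabs` and `demo_stride_magnitude` are hypothetical names, `_WIN64` is used here as a stand-in for whatever platform test the real header applies, and the non-Windows branch assumes an LP64 target where `long` is 64 bits wide.

```c
#include <stdlib.h>

/* Hypothetical switch standing in for a build with the 64-bit integer API. */
#ifndef DEMO_INTERFACE64
#define DEMO_INTERFACE64 1
#endif

#if DEMO_INTERFACE64
#  if defined(_WIN64)
/* LLP64: long is only 32 bits, so the 64-bit index type needs llabs(). */
typedef long long demo_blasint;
#    define demo_blasabs(x) llabs(x)
#  else
/* LP64 assumption: long is 64 bits, so labs() covers the index type. */
typedef long demo_blasint;
#    define demo_blasabs(x) labs(x)
#  endif
#else
/* 32-bit integer API: plain abs() is sufficient. */
typedef int demo_blasint;
#  define demo_blasabs(x) abs(x)
#endif

/* Example caller: magnitude of a possibly negative stride. */
static demo_blasint demo_stride_magnitude(demo_blasint incx)
{
    return demo_blasabs(incx);
}
```

Picking the function through one macro keeps every call site unchanged when the integer width or the platform data model changes.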
Martin Kroeker 4a553e8678 Merge pull request #1713 from martin-frbg/issue1710
Introduce blasabs macro and use it to switch between abs and labs for INTERFACE64
2018-08-04 23:51:31 +02:00
Martin Kroeker 40c068a875 Introduce blasabs() to switch between abs() and labs() for INTERFACE64 2018-08-04 20:07:59 +02:00
Zoltán Mizsei 6463bffd59 Haiku supporting patches 2018-08-02 20:49:14 +02:00
Martin Kroeker de8fff671d Revert "Use usleep instead of sched_yield by default" 2018-06-11 17:05:27 +02:00
Martin Kroeker ed7c4a043b Use usleep instead of sched_yield by default
sched_yield only burns cpu cycles, fixes #900,  see also #923, #1560
2018-06-07 10:18:26 +02:00
Martin Kroeker 83da278093 Update common.h 2018-06-06 09:27:49 +02:00
Martin Kroeker 358d4df2bd Merge branch 'develop' into issue1593-2 2018-06-06 09:21:41 +02:00
Martin Kroeker 06d43760e4 Restore _Atomic define before stdatomic.h for old gcc
see #1593
2018-06-06 09:18:10 +02:00
Martin Kroeker 354a976a59 Fix inverted condition in _Atomic declaration
fixes #1593
2018-06-05 10:31:34 +02:00
zhiyong.dang 53457f222f move _Atomic define to common.h 2018-05-11 00:13:16 -07:00
Zhiyong Dang 1b83341d19 Fix race condition in blas_server_omp.c
Change-Id: Ic896276cd073d6b41930c7c5a29d66348cd1725d
2018-04-27 17:00:42 +08:00
Alex Arslan a41d241a0e Add support for DragonFly BSD 2018-04-03 16:39:29 -07:00
Alex Arslan 8da6b6ae52 Allow building on OpenBSD
With this change, OpenBLAS builds and all tests pass on OpenBSD 6.2
using Clang. Tested on x86-64 only, with and without DYNAMIC_ARCH=1.
2018-04-02 10:48:22 -07:00
Isuru Fernando eb98fdddfc typedefs only for c 2017-07-29 20:38:16 +05:30
Isuru Fernando ca17b4b75c Fix complex support for MSVC headers 2017-07-28 11:50:29 +05:30
Neil Shipp 34513be726 Add Microsoft Windows 10 UWP build support 2017-06-23 13:07:34 -07:00
Martin Kroeker ea26b00c06 Fix CREAL,CIMAG macros for PGI 2017-03-13 00:36:01 +01:00
Zhang Xianyi b678471d65 Merge branch 'z13' into develop
Conflicts:
	CONTRIBUTORS.md
2017-01-09 05:52:42 -05:00
Daniel Patrick Foose a94f2b7848 Change to allow compiling with USE_OPENMP on MSVC
MSVC treats the declaration of omp_in_parallel and omp_get_num_procs without the modifiers __declspec(dllimport) and __cdecl as a redefinition.
2016-06-14 14:37:28 -04:00
Werner Saar 6a2bde7a2d optimized dgemm and dgetrf for POWER8 2016-05-17 14:45:27 +02:00
Shivraj Patil 2c3dfe2bf3 MIPS P5600(32 bit) and I6400(64 bit) cores support added.
Separated mips and mips64 files.
Configurations support for mips 32 bit.

Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-04-22 14:03:18 +05:30
Zhang Xianyi dd43661cfd Init IBM z system (s390x) porting. 2016-04-15 18:02:24 -04:00
Werner Saar 2e6333f74e modified common.h for piledriver 2016-03-09 15:48:29 +01:00
Zhang Xianyi a1a96589aa Fixed #773 blas_quickdivide bug on CMake and Visual Studio x86 32-bit. 2016-02-04 15:23:32 -05:00
Jerome Robert 87a2ccc37c Factorize MAX_STACK_ALLOC code to common_stackalloc.h
Ref #727
2016-01-08 16:03:52 +01:00
Werner Saar 5f2fa15e04 include sched.h if OS is Android 2016-01-05 12:36:49 +01:00
Ashwin Sekhar T K 9742dba595 Fix compiler errors in common.h 2015-11-09 14:15:50 +05:30
Zhang Xianyi 63c56d3da9 Only include complex.h since Android 5.0 2015-10-27 10:47:55 -05:00
Zhang Xianyi 8fade093aa Fixed cmake bug on Visual Studio. 2015-10-20 14:37:22 -05:00
Zhang Xianyi 94b125255f Merge branch 'develop' into cmake
Conflicts:
	driver/others/memory.c
2015-10-13 04:46:08 +08:00
Zhang Xianyi 3684706a12 Include time.h. 2015-10-08 15:18:54 +00:00
buffer51 2297a2d989 Fixed error in common.h for Android compilation introduced by e12cf1123e 2015-09-03 20:54:21 -04:00
Grazvydas Ignotas 3efeaed0d8 correct a minor mistake 2015-08-16 20:12:04 +02:00
Grazvydas Ignotas 6b92204a7c add fallback blas_lock implementation
to be used on armv5 and new platforms
2015-08-16 18:59:17 +02:00
Grazvydas Ignotas e12cf1123e add fallback rpcc implementation
- use on arm, arm64 and any new platform
- use faster integer math instead of double
- use similar scale as rdtsc so that timeouts work
2015-08-16 18:59:16 +02:00
Zhang Xianyi f8eba3d548 Fixed cmake build bugs on Linux. 2015-08-11 16:25:16 -05:00
Zhang Xianyi f874465bb8 Use cmake to build OpenBLAS GENERIC Target on MSVC x86 64-bit.
Disable CBLAS and LAPACK.
2015-08-10 14:10:44 -05:00