Martin Kroeker
e9cd11768c
Enable parallel make on MS Windows by default
...
fixes #874
2018-06-09 17:54:36 +02:00
Arjan van de Ven
99c7bba8e4
Initial support for SkylakeX / AVX512
...
This patch adds the basic infrastructure for adding the SkylakeX (Intel Skylake server)
target. The SkylakeX target will use the AVX512 (AVX512VL level) instruction set,
which brings 2 basic things:
1) 512 bit wide SIMD (2x width of AVX2)
2) 32 SIMD registers (2x the number on AVX2)
This initial patch only contains a trivial transformation of the Haswell SGEMM kernel
to AVX512VL; more will follow later but this patch aims to get the infrastructure
in place for this "later".
Full performance tuning has not been done yet; with more registers and wider SIMD
it's in theory possible to retune the kernels but even without that there's an
interesting enough performance increase (30-40% range) with just this change.
2018-06-03 07:58:52 +00:00
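The two figures quoted in the SkylakeX commit message (2x the SIMD width, 2x the register count) can be turned into simple arithmetic. The sketch below is illustrative only and is not part of the patch; the helper names `lanes_per_register` and `register_file_bytes` are hypothetical, and the AVX2 register count of 16 is inferred from the message's "2x the number on AVX2".

```c
/* Illustrative arithmetic only; these helpers are hypothetical and do not
 * come from OpenBLAS. */

/* Number of elements of elem_bits width that fit in one SIMD register. */
unsigned lanes_per_register(unsigned register_bits, unsigned elem_bits)
{
    return register_bits / elem_bits;
}

/* Total capacity of the SIMD register file in bytes. */
unsigned register_file_bytes(unsigned register_bits, unsigned num_registers)
{
    return (register_bits / 8) * num_registers;
}
```

Under these assumptions, `lanes_per_register(512, 32)` gives 16 single-precision values per register against 8 for a 256-bit register, and `register_file_bytes(512, 32)` gives 2048 bytes against 512 for `register_file_bytes(256, 16)`. That extra room is what the message alludes to when it says the kernels could in theory be retuned.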
Alex Arslan
a41d241a0e
Add support for DragonFly BSD
2018-04-03 16:39:29 -07:00
Alex Arslan
8da6b6ae52
Allow building on OpenBSD
...
With this change, OpenBLAS builds and all tests pass on OpenBSD 6.2
using Clang. Tested on x86-64 only, with and without DYNAMIC_ARCH=1.
2018-04-02 10:48:22 -07:00
Martin Kroeker
efa84afd00
Use get_corename for SPARC as well
2018-02-01 18:20:38 +01:00
Shivraj Patil
e3d844b062
Added mips I6500 core
...
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2017-09-22 11:57:43 +05:30
Sébastien Villemot
7543e578a4
Add support for TARGET=ZARCH_GENERIC and TARGET=Z13
2017-09-19 12:16:42 +02:00
Martin Kroeker
166d64eb7c
Fix FORCE_ZEN option in getarch.c
2017-04-19 14:20:42 +02:00
Denis Steckelmacher
c9ff735da6
Add ZEN support (tested for auto-detected static backend)
2017-03-19 15:32:50 +01:00
Ashwin Sekhar T K
4b55fae337
ARM64: Add Cavium THUNDERX2T99 Target
2017-01-11 11:18:40 +05:30
Andrew Pinski
fb200c7245
ARM64: Add Cavium THUNDERX Target
2017-01-10 15:01:37 +05:30
Ashwin Sekhar T K
4713e7c47f
ARM64: Add the VULCAN Target
2017-01-10 15:01:17 +05:30
Zhang Xianyi
b678471d65
Merge branch 'z13' into develop
...
Conflicts:
CONTRIBUTORS.md
2017-01-09 05:52:42 -05:00
Martin Kroeker
570bc9afbd
Fix spurious define in openblas_config.h
...
TARGET as specified with make is already return-terminated when getarch reads it. This led to an empty line written to config_last.h that awk in Makefile.install then expanded to a spurious "#define OPENBLAS_" in openblas_config.h (as noted by "kmb" on the mailing list)
2016-11-06 17:29:33 +01:00
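The spurious define described in the commit above came from a value that already ended in a line terminator having another newline written after it. One general way to guard against that class of problem is to strip trailing line endings before writing a value out. The following is a minimal sketch of that idea, not the change made in the commit; `strip_line_ending` is a hypothetical helper name.

```c
#include <string.h>

/* Remove any trailing '\n' or '\r' characters from s in place.
 * Returns the new length of the string.
 * Hypothetical helper for illustration; not taken from getarch.c. */
size_t strip_line_ending(char *s)
{
    size_t len = strlen(s);
    while (len > 0 && (s[len - 1] == '\n' || s[len - 1] == '\r')) {
        s[--len] = '\0';
    }
    return len;
}
```

A caller would apply this to the target string before emitting it followed by exactly one newline, so that no empty line reaches a generated header.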
Howard Su
ff1da01476
Use NPROCESSOR_CONF instead of NPROCESSOR_ONLN
...
to determine the number of CPUs. On ARM platforms, the number of
online CPUs increases as the workload grows, whereas the number of
configured CPUs is the maximum number of CPUs.
2016-10-13 12:37:50 +00:00
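The distinction drawn in the commit above can be observed directly with `sysconf`. This sketch assumes the commit's shorthand names refer to the `_SC_NPROCESSORS_CONF` and `_SC_NPROCESSORS_ONLN` selectors, which are widely available on Linux and the BSDs but are not required by POSIX. The wrapper names `configured_cpus` and `online_cpus` are hypothetical.

```c
#include <unistd.h>

/* Number of processors configured in the system (the upper bound).
 * Assumes _SC_NPROCESSORS_CONF is provided by the platform's libc. */
long configured_cpus(void)
{
    return sysconf(_SC_NPROCESSORS_CONF);
}

/* Number of processors currently online; on some ARM systems this
 * can change at run time as cores are brought up or taken down. */
long online_cpus(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}
```

Sizing per-CPU resources from the configured count avoids under-allocating when the value is sampled while some cores happen to be offline.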
Shivraj Patil
beb1d076a4
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
...
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-07-15 18:38:25 +05:30
Shivraj Patil
2c3dfe2bf3
MIPS P5600(32 bit) and I6400(64 bit) cores support added.
...
Separated mips and mips64 files.
Configurations support for mips 32 bit.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-04-22 14:03:18 +05:30
Zhang Xianyi
dd43661cfd
Init IBM z system (s390x) porting.
2016-04-15 18:02:24 -04:00
Jerome Robert
7aac0aff8e
Allow forcing make to run without the -j argument
...
Close #828 (hopefully)
2016-03-31 23:03:52 +02:00
Werner Saar
b752858d6c
added dgemm-, dtrmm-, zgemm- and ztrmm-kernel for power8
2016-03-01 07:33:56 +01:00
Zhang Xianyi
3e8d6ea74f
Init POWER8 kernels by POWER6.
2015-11-03 12:34:23 +08:00
Zhang Xianyi
aaa8551c57
Merge pull request #749 from lotheac/illumos_fixes
...
illumos fixes
2016-01-26 08:42:20 -06:00
Lauri Tirkkonen
8635d425c1
make parallel make work on illumos
2016-01-22 18:55:48 +02:00
Jerome Robert
ba024fcfc0
Allow forcing the number of parallel make jobs
...
This is particularly useful when using distcc
2015-12-28 19:45:29 +01:00
Ashwin Sekhar T K
f2f8a0fe8b
Adding arm64 target CORTEXA57
...
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:50 +05:30
Zhang Xianyi
94b125255f
Merge branch 'develop' into cmake
...
Conflicts:
driver/others/memory.c
2015-10-13 04:46:08 +08:00
Zhang Xianyi
f27942a68a
Fixed make TARGET=CORTEXA9 and CORTEXA15 bug.
2015-09-26 14:42:44 +00:00
Grazvydas Ignotas
d38a1ddc7a
use real armv5 support
...
there is no more requirement for ARMv6 instructions,
and VFP on ARMv5 is uncommon
2015-08-16 18:59:18 +02:00
Fábio Perez
b8d64a856a
Add POWER7/POWER8 as targets
2015-08-05 11:02:39 -03:00
Zhang Xianyi
dcd5ba4443
Merge branch 'cmake' of https://github.com/hpanderson/OpenBLAS into hpanderson_cmake
2015-07-22 04:06:39 +08:00
Zhang Xianyi
51ff17d46e
Add AMD Excavator target.
2015-05-13 16:16:30 -05:00
Zhang Xianyi
c674fa32be
Add ARM targets.
2015-03-24 12:17:04 -05:00
Zhang Xianyi
229ce2ccd1
Add cortex-a9 and cortex-a15 targets.
2015-01-12 08:55:29 +00:00
Hank Anderson
1a41022e3e
Added MSVC defines to cpuid.h and getarch.c.
2015-01-01 21:01:28 -06:00
Werner Saar
4319769b79
added target processor STEAMROLLER
2014-12-28 20:16:46 +08:00
Zhang Xianyi
2fb02626da
Update organization info.
2014-11-25 15:28:58 +08:00
Benedikt Huber
58c90d5937
# The first commit's message is:
...
Optimizations for APM's xgene-1 (aarch64).
1) general system updates to support armv8 better. Make all did not work, one needed to supply TARGET=ARMV8.
2) sgemm 4x4 kernel in assembler using SIMD, and configuration changes to use it.
3) strmm 4x4 kernel in C. Since the sgemm kernel does 4x4, the trmm kernel must also do 4xN.
Added Dave Nuechterlein to the contributors list.
2014-11-11 22:19:23 +08:00
Zhang Xianyi
552119c484
Fixed #407. Support outputting the CPU corename at runtime.
...
The user can use char * openblas_get_config() or char * openblas_get_corename().
2014-07-08 12:48:08 +08:00
Timothy Gu
6c2ead30f0
Remove all trailing whitespace except lapack-netlib
...
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
wernsaar
43fbdb7a5a
added ARMV5 as reference platform
2014-05-13 17:25:19 +02:00
wernsaar
b3254eecaf
Merge remote branch 'origin/haswell' into develop
2013-12-01 18:09:12 +01:00
wernsaar
d13aa79d26
modified getarch.c
2013-12-01 17:31:22 +01:00
Zhang Xianyi
2638370844
Init code base for Intel Haswell.
2013-08-13 00:54:59 +08:00
Zhang Xianyi
673e453b3f
Enable bulldozer kernels.
2013-08-05 16:07:54 +08:00
Zhang Xianyi
5b504d6c23
Refs #263 . Rollback bulldozer and piledriver kernels to barcelona kernels.
2013-07-28 17:39:24 +08:00
Zhang Xianyi
fbb75e58b1
Fixed the typo in getarch.c
2013-07-09 16:26:59 +08:00
Zhang Xianyi
f54f5bac9e
Refs #248. Fixed the LSB compatibility issue for BLAS only.
...
For example, make CC=lsbcc NO_LAPACK=1.
2013-07-09 15:38:03 +08:00
Zhang Xianyi
886cbaf4e4
Support AMD Piledriver by bulldozer kernels.
2013-07-06 12:06:43 -03:00
Zhang Xianyi
48bdc1ad3b
Added NO_PARALLEL_MAKE flag to disable parallel make.
2013-04-15 21:37:30 +08:00
Explorer09
53588bc786
getarch.c: Minor re-ordering of architecture list
2013-03-17 23:09:23 +08:00