3327 Commits

Author SHA1 Message Date
Martin Kroeker
0b09516678 Fix missing parameter in popen call 2018-12-06 18:33:05 +01:00
Martin Kroeker
6ba30e270d Fix typo that broke CNRM2 on ARMV8 since 0.3.0
must have happened in my #1449
2018-12-06 13:42:25 +01:00
Martin Kroeker
bf23518e36 Merge pull request #1903 from rengolin/armv8
Fix two mistakes on Arm64 builds
2018-12-05 22:10:53 +01:00
Renato Golin
31a490ea88 Fix two mistakes on Arm64 builds
* Falkor is an ARMv8.0 with ARMv8.1 features, and choosing armv8.1-a for
   march generates instructions it cannot cope with. Reverting it back
   to armv8-a.
 * ThunderX2's build was left with a #define VULCAN, which made it miss
   the right compiler flags in Makefile.arm64, although it did create
   the right library in the end.
2018-12-05 18:51:38 +00:00
Martin Kroeker
701ea88347 Use p2align instead of align for OSX compatibility
fixes #1902
2018-12-03 13:06:43 +01:00
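The commit title above does not spell out why `align` is a problem on OSX. A likely reason, stated here as an assumption about this commit, is that the operand of `.align` is interpreted differently across assemblers (a byte count on some targets, a power-of-two exponent on others such as Mach-O), whereas `.p2align` always takes an exponent. The Python sketch below only illustrates that arithmetic; the function names are hypothetical and not part of OpenBLAS.

```python
def boundary_from_p2align(exponent):
    """.p2align N always requests alignment to 2**N bytes."""
    return 1 << exponent


def padding_needed(address, boundary):
    """Number of padding bytes required to advance address to the next boundary."""
    return (-address) % boundary


def describe_align_operand(operand):
    """Show the two readings of the same `.align` operand (byte count vs. exponent)."""
    return {"as_byte_count": operand, "as_power_of_two": 1 << operand}
```

For example, an operand of 4 means a 4-byte boundary under the byte-count reading but a 16-byte boundary under the exponent reading, which is why an unambiguous directive is preferable in code shared between platforms.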
Martin Kroeker
721c56c224 Merge pull request #1899 from brada4/fbsd12
Add mutually supported architecture mappings for FreeBSD12 ports
2018-12-03 12:50:27 +01:00
Martin Kroeker
c5f8aeff2d Merge branch 'develop' into fbsd12 2018-12-03 12:50:14 +01:00
Martin Kroeker
8278cbe7f8 Merge pull request #1894 from pkubaj/patch-2
Use correct ARCH name on BSD powerpc64
2018-12-03 12:48:53 +01:00
Martin Kroeker
ea6d1b96bd Update Makefile.system 2018-12-03 08:59:10 +01:00
Martin Kroeker
360374be62 Update with the changes from 0.3.4 2018-12-02 23:44:13 +01:00
Martin Kroeker
f5acaad8f0 Increment version to 0.3.5.dev 2018-12-02 23:43:15 +01:00
Martin Kroeker
93fa6b7b76 Increment version to 0.3.5.dev 2018-12-02 23:42:33 +01:00
Martin Kroeker
b028960aba Merge branch 'release-0.3.0' into develop 2018-12-02 23:38:49 +01:00
Martin Kroeker
3c9e3faedb fixup BSD naming of powerpc arch 2018-12-02 23:24:53 +01:00
Andrew
44c81fd135 oops 2018-12-02 20:27:53 +01:00
Andrew
26b3710485 Add architecture mappings for FreeBSD12 2018-12-02 12:07:41 +01:00
Andrew
84e614d0fd init 2018-12-02 12:05:15 +01:00
Martin Kroeker
dceff5542c Handle Android environments that identify as Linux (#1898)
* Handle Android environments that identify as Linux

termux terminal emulator does this, causing build failures through missed defines in common.h
2018-12-01 20:56:11 +01:00
Martin Kroeker
6c7b691083 Really revert xDOT changes from #1832
neglected to rebase #1892 on merging
2018-11-30 21:32:01 +01:00
Martin Kroeker
5f4c550c27 Merge pull request #1892 from martin-frbg/mipsdot
revert MIPS64 xDOT kernel changes from #1832
2018-11-30 21:28:21 +01:00
pkubaj
731b2722ba Fix build on POWER, remove DragonFly, add NetBSD
__asm is complete on its own

DBSD developers state they will only support amd64, but NetBSD supports POWER.
2018-11-30 21:12:05 +01:00
pkubaj
f85ce54d4a Use correct Makefile on powerpc64
FreeBSD uses powerpc64 name for POWER architecture. Use correct Makefile for this platform.
2018-11-30 16:05:49 +00:00
Andrew
2601cd58ab remove surplus locking code, only enabled with x86, disabled or never enabled on all others 2018-11-30 11:38:19 +01:00
Martin Kroeker
95a5542e3c Revert DOT kernel changes from #1834
as the failures seen on Loongson3A appear to be limited to DSDOT/SDSDOT (i.e. my hackish "fix" from #1684)
2018-11-30 11:16:24 +01:00
Martin Kroeker
7a2e1bc804 Use generic kernel for DSDOT/SDSDOT
as discussed in #1834
2018-11-30 10:57:09 +01:00
Martin Kroeker
35653e38b3 Merge pull request #1834 from fengrl/develop
register push/pop command change
2018-11-30 10:48:46 +01:00
Martin Kroeker
71e25ae42f Merge pull request #1890 from martin-frbg/issue1889
Include version number in openblas_get_config output
2018-11-29 15:47:35 +01:00
Martin Kroeker
97d7298973 call it OpenBLAS not just version 2018-11-29 11:52:08 +01:00
Martin Kroeker
de0d0ed52f Improve formatting of config output 2018-11-29 11:28:19 +01:00
Martin Kroeker
081ceb3e02 Propagate version number for openblas_get_config 2018-11-29 00:12:04 +01:00
Martin Kroeker
a29ec458c2 propagate version number for openblas_config_version 2018-11-29 00:10:49 +01:00
Martin Kroeker
816775e309 Add version information to openblas_get_config output 2018-11-29 00:06:44 +01:00
Martin Kroeker
b6363f4539 Merge pull request #1885 from brada4/freebsd
Fix freebsd clang compilation of skylakex
2018-11-25 22:20:13 +01:00
Andrew
19c4bdd8b3 Add return value so that freebsd system clang does not err out 2018-11-25 21:35:01 +01:00
Andrew
f049a4c84f init 2018-11-25 21:34:09 +01:00
Martin Kroeker
f72fdf525c Merge pull request #1875 from martin-frbg/issue1851
Serialize accesses to parallelized level3 functions from multiple cal…
2018-11-25 20:53:46 +01:00
Martin Kroeker
5393759a98 Merge pull request #1869 from martin-frbg/axpy0
Handle special case INCX=0,INCY=0 in the axpy interface
2018-11-25 20:52:49 +01:00
Martin Kroeker
5cf18e2875 Merge pull request #1878 from kiwifb/PGI_f_check
Correct link flags for PGI compiler.
2018-11-25 20:51:50 +01:00
Martin Kroeker
910050985a Merge pull request #1876 from rengolin/armv8-cleanup
Simplifying ARMv8 build parameters
2018-11-25 20:51:24 +01:00
François Bissey
0184713e1a Correct link flags for PGI compiler. 2018-11-21 14:24:56 +13:00
Martin Kroeker
45c3c459e1 Merge pull request #1868 from martin-frbg/aix_cpuid
Use prtconf to determine CPU type on AIX
2018-11-20 17:25:57 +01:00
Martin Kroeker
113cb00b95 fix missing parenthesis 2018-11-19 21:01:36 +01:00
Martin Kroeker
5192651706 Add CriticalSection handling instead of mutexes for Windows 2018-11-19 17:58:22 +01:00
Renato Golin
310ea55f29 Simplifying ARMv8 build parameters
ARMv8 builds were a bit mixed up, with ThunderX2 code in ARMv8 mode
(which is not right because TX2 is ARMv8.1) as well as requiring a few
redundancies in the defines, making it harder to maintain and understand
what core has what. A few other minor issues were also fixed.

Tests were made on the following cores: A53, A57, A72, Falkor, ThunderX,
ThunderX2, and XGene.

Tests were: OpenBLAS/test, OpenBLAS/benchmark, BLAS-Tester.

A summary:
 * Removed TX2 code from ARMv8 build, to make sure it is compatible with
   all ARMv8 cores, not just v8.1. Also, the TX2 code has actually
   harmed performance on big cores.
 * Commoned up ARMv8 architectures' defines in params.h, to make sure
   that all will benefit from ARMv8 settings, in addition to their own.
 * Adding a few more cores, using ARMv8's include strategy, to benefit
   from compiler optimisations using mtune. Also updated cache
   information from the manuals, making sure we set good conservative
   values by default. Removed Vulcan, as it's an alias to TX2.
 * Auto-detecting most of those cores, but also updating the forced
   compilation in getarch.c, to make sure the parameters are the same
   whether compiled natively or forced arch.

Benefits:
 * ARMv8 build is now guaranteed to work on all ARMv8 cores
 * Improved performance for ARMv8 builds on some cores (A72, Falkor,
   ThunderX1 and 2: up to 11%) over current develop
 * Improved performance for *all* cores compared to develop branch
   before TX2's patch (9% ~ 36%)
 * ThunderX1 builds are 14% faster than ARMv8 on TX1, 9% faster than
   current develop's branch and 8% faster than develop before TX2 patches

Issues:
 * Regression from current develop branch for A53 (-12%) and A57 (-3%)
   with ARMv8 builds, but still faster than before TX2's commit (+15%
   and +24% respectively). This can be improved with a simplification of
   TX2's code, to be done in future patches. At least the code is
   guaranteed to be ARMv8.0 now.

Comments:
 * CortexA57 builds are unchanged on A57 hardware from develop's branch,
   which makes sense, as it's untouched.
 * CortexA72 builds improve over A57 on A72 hardware, even if they're
   using the same includes, due to new compiler tuning in the makefile.
2018-11-19 16:41:49 +00:00
Martin Kroeker
2e6fae2aad Serialize accesses to parallelized level3 functions from multiple callers
for #1851
2018-11-19 14:02:50 +01:00
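The idea of serializing access to a parallelized routine from multiple callers can be sketched as follows. This is a minimal Python illustration assuming a single lock around the whole call; the class and method names are hypothetical, and the actual change in OpenBLAS (written in C) may differ in detail.

```python
import threading


class SerializedRoutine:
    """Wrap a routine so that only one caller at a time may execute it."""

    def __init__(self, routine):
        self._routine = routine
        self._lock = threading.Lock()  # guards the entire call
        self._active = 0               # callers currently inside the routine
        self.max_active = 0            # highest concurrency ever observed

    def __call__(self, *args):
        with self._lock:
            self._active += 1
            self.max_active = max(self.max_active, self._active)
            try:
                return self._routine(*args)
            finally:
                self._active -= 1


def run_callers(num_callers=8):
    """Start several threads that all call the same serialized routine."""
    square = SerializedRoutine(lambda v: v * v)
    results = [None] * num_callers

    def caller(i):
        results[i] = square(i)

    threads = [threading.Thread(target=caller, args=(i,)) for i in range(num_callers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, square.max_active
```

The trade-off of this design is that concurrent callers queue behind one another instead of sharing the routine's internal state, which favors correctness over throughput.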
Martin Kroeker
368d14f8c8 Fix harmless typo
fixes #1872
2018-11-16 14:58:28 +01:00
Martin Kroeker
42bc2a9202 Fix copy-paste errors (POWER8/9 and extraneous return) 2018-11-16 12:10:44 +01:00
fengruilin
43bb386b10 fix dot problem on 64bit mips 2018-11-15 11:11:59 +08:00
Martin Kroeker
c171b8ad13 Handle special case INCX=0,INCY=0 in the axpy interface 2018-11-13 13:57:18 +01:00
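To show why INCX=0, INCY=0 is a special case for axpy, here is a small Python sketch (not OpenBLAS code; function names are hypothetical). With both strides zero, every one of the n updates reads the first element of x and writes the first element of y, so the loop collapses into a single update. Whether the commit uses exactly this shortcut is an assumption.

```python
def axpy_reference(n, alpha, x, incx, y, incy):
    """Strided y := alpha*x + y as a plain loop (non-negative strides only)."""
    ix = iy = 0
    for _ in range(n):
        y[iy] += alpha * x[ix]
        ix += incx
        iy += incy
    return y


def axpy_with_shortcut(n, alpha, x, incx, y, incy):
    """Same result, with the zero-stride case resolved before the general loop."""
    if n <= 0:
        return y
    if incx == 0 and incy == 0:
        # All n updates hit y[0] with the same x[0], so they collapse into one.
        y[0] += n * alpha * x[0]
        return y
    return axpy_reference(n, alpha, x, incx, y, incy)
```

With exactly representable values both versions agree; in general floating-point arithmetic the collapsed form may differ from the repeated additions in the last bits.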
Martin Kroeker
2f04cf22ac Detect POWER9 as POWER8 on AIX and Linux
(already supported by the *BSD version)
2018-11-13 08:16:14 +01:00