Martin Kroeker
c3a2d407a0
Merge pull request #4048 from imzhuhl/spr_sbgemm_fix
...
Sapphire Rapids sbgemm fix
2023-06-17 20:47:09 +02:00
gxw
67d1e72e8b
LoongArch64: Add ABI detection for loongarch64
...
If lp64d ABI is supported, it is used; otherwise,
it falls back to the lp64 ABI.
2023-06-08 20:25:35 +08:00
Honglin Zhu
0b83088887
spr dynamic arch support
2023-05-19 10:48:18 +08:00
Martin Kroeker
ebe50458f3
Do not add a -tp to the flags of the nvc compiler if there is one already in CFLAGS
2023-02-09 09:29:27 +01:00
Martin Kroeker
3e64fa72c4
Settings from Makefile(_kernel).conf should be available to DYNAMIC_ARCH kernel builds
2022-12-29 23:05:22 +01:00
Martin Kroeker
ca3b5ae3f0
Pass NO_SVE if set
2022-12-25 12:19:20 +01:00
Martin Kroeker
d16261fbc6
SVE-enabled targets in ARM64 DYNAMIC_ARCH require a recent compiler
2022-12-25 10:19:02 +01:00
Martin Kroeker
57151b97aa
Fix INTERFACE64 builds on riscv and loongarch
2022-12-15 18:52:46 +01:00
Martin Kroeker
62341ac5e1
Fix missing parenthesis
2022-12-15 12:30:16 +01:00
Martin Kroeker
5a294b0c8a
Add -lm on any arm/arm64 BSD, not just FreeBSD
2022-12-15 10:35:47 +01:00
Martin Kroeker
ea6c5f3cf5
Add option RELAPACK_REPLACE
2022-10-30 12:55:23 +01:00
Martin Kroeker
bd30120ba7
Merge pull request #3720 from FlyGoat/mips64
...
Make it work on general MIPS64 processors
2022-08-19 20:24:27 +02:00
Jiaxun Yang
fae9368f14
Implement DYNAMIC_LIST for MIPS64
...
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
2022-08-12 13:13:31 +01:00
Jiaxun Yang
a50b29c540
Provide a fallback MIPS64_GENERIC target
...
It is really dangerous to fall back to a Loongson core on other
MIPS64 processors.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
2022-08-12 13:13:28 +01:00
Martin Kroeker
85fd3c4279
Support compilation with the Cray C and Fortran compilers (#3712)
...
* Add support for the Cray Fortran compiler
2022-08-04 20:42:18 +02:00
Martin Kroeker
d0ba257de0
Merge pull request #3704 from XiWeiGu/loongarch64_dynamic_arch
...
LoongArch64: Add DYNAMIC_ARCH support
2022-07-28 20:31:20 +02:00
Rajalakshmi Srinivasaraghavan
1d97405c02
POWER: Enable bfloat16 kernels by default
...
This patch enables bfloat16 kernels by default for POWER processors.
Tested on Linux POWER8, POWER9, POWER10 and AIX POWER10 systems.
2022-07-28 07:43:53 -05:00
gxw
fbfe1daf6e
LoongArch64: Add DYNAMIC_ARCH support
2022-07-28 14:28:45 +08:00
gxw
3573306a69
LoongArch64: Add core LOONGSON2K1000 and LOONGSONGENERIC
2022-07-25 16:04:56 +08:00
Martin Kroeker
407a1a242c
Merge pull request #3670 from martin-frbg/osxvermin
...
Increase MACOSX_DEPLOYMENT_TARGET to 11 on ARM macs
2022-06-29 08:31:04 +02:00
Martin Kroeker
be5500e704
Merge pull request #3669 from VFerrari/fix_small_matrix_kernel
...
POWER: fix issues with the small matrix kernel
2022-06-28 16:09:36 +02:00
Martin Kroeker
914c4d0fe8
Add C versions of the CBLAS test sources (#3656)
...
* Add C conversions of the CBLAS tests for NOFORTRAN=1 builds
* Enable CTEST without Fortran and fix passing of BUILD_vartype options to exports/gensymbol
2022-06-28 11:52:48 +02:00
Martin Kroeker
2857987ff6
Increase MACOSX_DEPLOYMENT_TARGET to 11 on ARM macs
2022-06-28 11:46:25 +02:00
VFerrari
2062280c6f
Power: Enable SMALL_MATRIX OPT as default for dynamic arch
2022-06-25 03:47:03 -03:00
Martin Kroeker
8f13ab94d2
Merge pull request #3613 from Rabenda/fix-riscv
...
Fix riscv64 detect
2022-05-04 07:22:47 +02:00
Martin Kroeker
24e99eca31
Avoid adding -lgfortran with NOFORTRAN
2022-04-27 20:31:42 +02:00
Han Gao
3fc52ebcfb
Fix other arch build in detect.
...
When CORE is empty, -march=loongson3a was used. Fix it.
Signed-off-by: Han Gao <gaohan@uniontech.com>
2022-04-27 01:34:55 +08:00
Niyas Sait
3f5d145cd4
build: minor fixes to build on windows with make
...
This patch contains the following fixes:
1. Fix to build without PIC flag
2. Define LAPACK_COMPLEX_STRUCTURE for windows. Builds are failing
without it and changes are consistent with the CMake rules defined
in system.cmake (line 576)
2022-04-25 00:01:12 +01:00
Martin Kroeker
b7873605d4
Use f2c translations of LAPACK when no Fortran compiler is available (#3539)
...
* Add C equivalents of the Fortran routines from Reference-LAPACK as fallbacks, and C_LAPACK variable to trigger their use
2022-04-09 22:38:58 +02:00
Martin Kroeker
499ae5e8f7
Merge pull request #3510 from martin-frbg/issue3505
...
Fix recent SkylakeX/DYNAMIC_ARCH DGEMM breakage
2022-01-09 14:50:51 +01:00
Martin Kroeker
f1ac59f200
Forward DYNAMIC_ARCH option to Makefile.prebuild
2022-01-08 23:48:58 +01:00
Sunita Nadampalli
19c8f615dc
OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics
2022-01-07 00:28:17 +00:00
Martin Kroeker
ed430cd963
Update -tp option for recent nvfortran on x86_64
2021-12-18 21:56:26 +01:00
kavanabhat
eee3381cbe
Fallback for Power kernels
2021-12-08 03:52:23 -06:00
Martin Kroeker
54d321d742
Merge pull request #3466 from rafaelcfsousa/rafael/small_matrix_p10
...
[POWER] Add small matrix for sgemm/dgemm on Power10
2021-12-03 12:12:20 +01:00
kavanabhat
9a45b5123f
Update Makefile.system
2021-12-02 13:29:38 +05:30
kavanabhat
7b5b93037d
Fix truncated assembler checks
2021-12-01 19:30:40 +05:30
Rafael Cardoso Fernandes Sousa
c78fdcc80d
[POWER] Add support for SMALL_MATRIX_OPT
2021-11-28 12:41:16 -06:00
Martin Kroeker
46947efb83
Ignore compiler support for MIPS MSA if the cpu lacks this capability
2021-11-13 23:32:26 +01:00
Martin Kroeker
9cc0098ce2
Fix potentially wrong HOSTARCH definition in cross-compilation
2021-11-10 22:27:14 +01:00
Martin Kroeker
a6fd497820
Fix nvidia HPC version checks
2021-10-30 17:31:19 +02:00
Martin Kroeker
bb01e26cfe
Adjust compiler options for nvidia hpc 21.9 (and fix a long-standing typo in dynamic_arch settings)
2021-10-29 16:39:03 +02:00
Wangyang Guo
3dc6052c7e
initial support for Sapphire Rapids platform
2021-10-12 01:30:40 -07:00
Martin Kroeker
8e4c209002
Merge pull request #3398 from kavanabhat/aix_p10_gnuas
...
Big Endian Changes for Power10 kernels
2021-10-05 18:59:47 +02:00
Martin Kroeker
04f3ecd026
Fix minor typo
2021-10-04 16:14:32 +02:00
kavanabhat
9cc95e5657
AIX changes for P10 with GNU Compiler
2021-10-01 05:18:35 -05:00
Alexandru Ardelean
b7bb2e36b8
Makefile.system: adjust mipsel/mips64el ARCH variables
...
When building for MIPS{64} little-endian variants, the included makefiles
should be the same as for the big-endian.
There are already some adjustments being done for some ARCH names.
This change adds the ones for the `mipsel` and `mips64el` names, so that
the Makefile.mips{64} files get included.
This comes as a result of: https://github.com/openwrt/packages/issues/16649
Signed-off-by: Alexandru Ardelean <ardeleanalex@gmail.com>
2021-09-26 12:20:16 +03:00
Wangyang Guo
76ea8db4da
Small Matrix: enable by default for x86_64 arch
...
If no customized GEMM_SMALL_M_PERMIT kernel is defined, it will just bypass to the normal path.
2021-08-05 02:59:36 +00:00
Xianyi Zhang
0a2077901c
Add small matrix optimization kernel interface.
...
make SMALL_MATRIX_OPT=1
2021-08-02 07:01:47 +00:00
gxw
34207bdf5b
Fixed typos about LOONGARCH64
2021-07-30 18:11:12 +08:00
gxw
af0a69f355
Add support for LOONGARCH64
2021-07-27 15:29:12 +08:00
User User-User
9335d42740
add gcc8 version matching
2021-06-19 22:21:39 +02:00
User User-User
b7da75e4fd
WiP CORTEX A55 support
2021-06-19 21:37:51 +02:00
MikaelUrankar
4fbc0777f4
Fix typo
2021-05-26 12:14:57 +02:00
Martin Kroeker
26ccf643a3
Add -lm for FreeBSD on ARM/ARM64
2021-05-16 13:04:38 +02:00
Martin Kroeker
3c356b1a1f
Support compilation with the NAG Fortran compiler
2021-03-11 11:51:09 +01:00
Martin Kroeker
20f492c298
Fix AMD AOCC compiler detection
2021-03-01 21:00:10 +01:00
Martin Kroeker
9b2d69aa80
Add DYNAMIC_LIST option for ARM64
2021-01-24 23:18:01 +01:00
Martin Kroeker
6bbe6d5b92
Make compile-time BUFFERSIZE setting actually reach the compiler/preprocessor
2021-01-13 22:36:04 +01:00
pkubaj
7aa1ff8ff6
Fix build on FreeBSD/powerpc64le
2021-01-01 21:19:57 +00:00
Martin Kroeker
75b1f3becc
Limit POWERPC DYNAMIC_CORE list to P8 and P9 for NVIDIA compilers
2020-12-19 23:17:40 +01:00
Martin Kroeker
b212a2fb9f
Add/modify "PGI" compiler options for NVIDIA SDK 20.11
2020-12-19 22:08:37 +01:00
Martin Kroeker
18d8a67485
Merge pull request #2994 from antonblanchard/power10-fixes
...
Power10 fixes
2020-12-11 23:37:30 +01:00
gxw
4b548857d6
Add msa support for loongson
...
1. Using core loongson3r3 and loongson3r4 for loongson
2. Add DYNAMIC_ARCH for loongson
Change-Id: I1c6b54dbeca3a0cc31d1222af36a7e9bd6ab54c1
2020-12-09 10:28:46 +08:00
Martin Kroeker
6c7d557a16
Fix compiler options for 32 and 64bit SPARC builds with SolarisStudio
2020-12-06 19:20:50 +01:00
Martin Kroeker
2e99e2699b
Add workaround for gcc 4.6 miscompiling assembly kernels with -mavx
2020-11-29 15:32:17 +01:00
Martin Kroeker
437702e0e1
Merge pull request #2965 from epsilon-0/develop
...
allow setting soname without suffix or prefix
2020-11-22 12:25:33 +01:00
Anton Blanchard
fdf71d66b3
POWER10: Fix ld version detection
...
LDVERSIONGTEQ35 needs to escape the '>' character.
LDVERSIONGTEQ35 is checking the system ld version which may be different
to the toolchain being used to compile OpenBLAS. We don't have a path
to the linker in our Makefiles, so (ab)use gcc -Wl,--version to get the
version of ld in our toolchain.
2020-11-19 20:50:42 +11:00
Xianyi Zhang
fc35b72ae1
Refs #2899
...
Merge branch 'openblas-open-910' of git://github.com/damonyu1989/OpenBLAS into damonyu1989-openblas-open-910
2020-11-10 09:38:04 +08:00
Xianyi Zhang
913cc9a4ca
Merge branch 'develop' into risc-v
2020-11-10 09:18:25 +08:00
Martin Kroeker
1c4cfdc139
Stay compatible with old gmake that did not support undefine
2020-11-08 00:12:55 +01:00
Martin Kroeker
f6a57d8f63
Update Makefile.system
2020-11-08 00:01:36 +01:00
Martin Kroeker
f4b7ba12b7
Update Makefile.system
2020-11-07 23:37:21 +01:00
Martin Kroeker
a04f532edf
Reset cpu property flags between build cycles in DYNAMIC_ARCH mode
2020-11-07 20:37:03 +01:00
Martin Kroeker
8cc73fee98
Export NO_EXPRECISION after overriding for DYNAMIC_ARCH with GENERIC target
2020-11-03 23:47:04 +01:00
Aisha Tammy
60997ddd73
allow setting soname without suffix or prefix
...
Allows creating a library with a different
SONAME without the need to add suffixes to symbols.
Backwards compatible and should have no effect
on the workflow of previous users.
Useful for allowing an INTERFACE64 library alongside
the standard library without file conflicts.
2020-11-02 13:04:53 +00:00
Martin Kroeker
40a93c232b
Disable EXPRECISION for DYNAMIC_ARCH in combination with TARGET=GENERIC
...
NO_EXPRECISION is disabled for the GENERIC_TARGET already, so prevent mixing with code parts that use a different float size by default
2020-11-01 21:58:26 +01:00
Chen, Guobing
c5e62dad69
Fix cooperlake compile issue
...
Add a missing macro which is required in Makefile.x86_64 due to a recent
cleanup; its absence causes a build failure on the cooperlake platform.
2020-10-29 03:37:59 +08:00
Martin Kroeker
878b6d1f41
Remove spurious expr in flang version check
2020-10-26 21:35:40 +01:00
Martin Kroeker
1a0f57c8f0
Fix missing backquotes
2020-10-20 08:37:53 +02:00
Martin Kroeker
bb8c3f6861
Add ld/binutils version check for POWER10 support
2020-10-20 01:04:20 +02:00
Zhang Xianyi
d7ba7679b6
Merge branch 'develop' into risc-v
2020-10-16 23:27:38 +08:00
damonyu
ef8e7d0279
Add the support for RISC-V Vector.
...
Change-Id: Iae7800a32f5af3903c330882cdf6f292d885f266
2020-10-15 16:09:02 +08:00
Martin Kroeker
2c552f1074
Change "HALF" and "sh" to "BFLOAT16" and "sb"
2020-10-12 00:11:31 +02:00
Martin Kroeker
ae8b0d257a
Set BUILD_ options to 1 instead of just defining them
2020-10-11 18:08:21 +02:00
Martin Kroeker
8c5e08076e
If none of the BUILD_ options is set, enable them all
2020-10-11 17:33:51 +02:00
Marius Hillenbrand
75d440caa0
s390x/DYNAMIC_ARCH: fixup broken merge and reapply simplification
...
An unrelated commit and merge inadvertently reverted our recent two
changes for simplifying DYNAMIC_ARCH on s390x. Simply reapply the
changes.
Simplify detection of which kernels we can compile on s390x. Instead of
decoding the gcc version in a complicated manner, just check if CC
supports a given -march=archXY flag. Together with the next patch, we
thereby gain support for builds with LLVM/clang with DYNAMIC_ARCH=1.
To enable builds with DYNAMIC_ARCH with older compiler releases, the
Makefile and drivers/other/dynamic_arch.c need a common view of the
architecture support built into the library.
We follow the notation from x86 when used with DYNAMIC_LIST, where
defines DYN_<ARCH NAME> denote support for a given generation to be
built in. Since there are far fewer architecture generations in OpenBLAS
for s390x, that does not bloat command lines too much.
Closes: #2842
Fixes: ba644378dc ("Copy BUILD_ options available to the compiler flags")
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-09-17 17:09:03 +02:00
Martin Kroeker
ba644378dc
Copy BUILD_ options available to the compiler flags
2020-09-14 00:03:33 +02:00
Marius Hillenbrand
4f34bcfb5e
s390x/DYNAMIC_ARCH: pass supported arch levels from Makefile to run-time code
...
... instead of duplicating the (old) mechanism from the Makefile that
aimed to derive supported architecture generations from the gcc
version.
To enable builds with DYNAMIC_ARCH with older compiler releases, the
Makefile and drivers/other/dynamic_arch.c need a common view of the
architecture support built into the library.
We follow the notation from x86 when used with DYNAMIC_LIST, where
defines DYN_<ARCH NAME> denote support for a given generation to be
built in. Since there are far fewer architecture generations in OpenBLAS
for s390x, that does not bloat command lines too much.
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-09-08 19:34:18 +02:00
Marius Hillenbrand
0629d8ebdb
s390x/DYNAMIC_ARCH: generalize detecting supported archs for clang
...
Simplify detection of which kernels we can compile on s390x. Instead of
decoding the gcc version in a complicated manner, just check if CC
supports a given -march=archXY flag. Together with the next patch, we
thereby gain support for builds with LLVM/clang with DYNAMIC_ARCH=1.
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-09-08 19:34:18 +02:00
pkubaj
48a1364e10
Add aliases for armv6, armv7
...
FreeBSD uses those names for 32-bit ARM variants.
2020-08-23 18:50:19 +00:00
Chen, Guobing
e740c4873d
Enable COOPERLAKE build target
...
Enable new build target platform -- COOPERLAKE. This target platform
supports all the SKYLAKEX supported ISAs + avx512bf16. So all the
SKYLAKEX specific kernels/drivers and related code are now extended
to be also active on COOPERLAKE. Besides, new BF16 related kernels
are active under this target.
2020-08-13 06:18:00 +08:00
Ashwin Sekhar T K
4e1be0e481
ARM64: Add THUNDERX3T110 Target
2020-07-26 23:32:24 -07:00
Rajalakshmi Srinivasaraghavan
9be2688c78
Fix to store results in correct order for POWER10 GEMM kernels
...
There is a recent compiler change in __builtin_mma_disassemble_acc() which
affects the order of storing result in POWER10. Also removing new LDFLAG
-mno-power10-stub as it is handled by linker automatically.
2020-07-24 23:08:11 -05:00
Martin Kroeker
9796e552ea
Avoid undefining NAME, CNAME etc. for pgcc as it makes it ignore the new definitions
2020-07-23 17:03:28 +02:00
Wileam Phan
9ae154ba89
Patch for building on Summit
2020-07-20 23:30:28 -04:00
Rajalakshmi Srinivasaraghavan
417c4e8af8
Add new linker option for POWER10
...
While building with DYNAMIC_ARCH on POWER9 with a POWER10-aware
toolchain, a new LDFLAG is needed to avoid POWER10
instructions on PLT calls.
2020-07-14 11:54:04 -05:00
Martin Kroeker
419b8686d1
Merge pull request #2682 from martin-frbg/aix
...
[WIP] fix compilation on AIX
2020-07-13 14:43:24 +02:00
Martin Kroeker
5865c7d4d6
Make 32bit POWER8 use POWER6 kernels for now
2020-07-12 18:59:01 +02:00
Rajalakshmi Srinivasaraghavan
af1e140e35
Change minimum gcc version for POWER10
...
As the MMA patches for POWER10 are backported to gcc10.2, changing
the minimum gcc version needed to build OpenBLAS for POWER10.
2020-07-09 21:46:06 -05:00