Commit Graph

35 Commits

Author SHA1 Message Date
Martin Kroeker affeef0b9c
Fix gmake build not always picking the right ARM64 arch options for clang (#4136)
* Fix gcc version checks erroneously excluding clang

* Avoid some mtune names not supported by (Apple)Clang
2023-07-13 08:38:03 +02:00
Chris Sidebottom f76e3de3a5 Remove SVE from Arm(R) Neoverse(TM) N1 CPU in Makefile
I incorrectly added `+sve` to the GCC parameters for the Neoverse(TM) N1
CPU, which doesn't support SVE. This results in failed builds when using
a compiler that doesn't support `-mtune=neoverse-n1`, an option which
appears to hide the mistake.
2022-12-06 21:23:07 +00:00
Chris Sidebottom fd4f52c797 Add SVE implementation for sdot/ddot
This adds an SVE implementation to sdot/ddot when available, falling back to the previous Advanced SIMD kernel where there's no SVE implementation for the kernel.

All the targets were essentially treating `dot_thunderx2t99.c` as the Advanced SIMD implementation so I've renamed it to better fit with the feature detection.
2022-12-01 12:07:50 +00:00
Martin Kroeker c957ad684e
Bump gcc requirement for NeoverseN2 and V1 to 10.4 2022-11-09 10:46:43 +01:00
Martin Kroeker ae3bcc8949
Drop NeoverseN2 to armv8.2-a on OSX to make it build with gcc11 too 2022-08-31 10:41:01 +02:00
Martin Kroeker 68277282df
Work around XCode assembler SVE bug 2022-08-30 22:26:16 +02:00
Martin Kroeker f8c5bdfbab
Treat Fujitsu fcc on Fugaku like clang 2022-07-25 19:48:59 +02:00
Honglin Zhu ec0d5c7a2a Add gfortran parameters 2022-06-29 10:17:05 +08:00
Honglin Zhu 55d686d41e neoverse n2 sbgemm:
implement ncopy tcopy kernel_8x4
2022-06-29 10:14:21 +08:00
Martin Kroeker 848722926c
CortexX1 is only ARMV8 2022-03-28 17:18:56 +02:00
Martin Kroeker 09b8545fc5
Add initial support for M1 on Linux, Phytium FT2xxx series, ARM Cortex 510/710/X1/X2 2022-03-27 15:24:40 +02:00
Sunita Nadampalli 19c8f615dc OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics 2022-01-07 00:28:17 +00:00
Bine Brank 7093372e32 add ARMV8SVE target 2021-11-01 22:53:21 +01:00
Martin Kroeker 6975cbe1f0
Enable SVE for A64FX 2021-10-19 23:23:40 +02:00
Martin Kroeker 22bf5c27ba
Add basic support for the Fujitsu A64FX (#3415)
* Add initial support for Fujitsu A64FX as generic ARMV8
2021-10-18 15:00:19 +02:00
Martin Kroeker b57acdf2d3
Add march/mtune flags for clang builds on ARM64 as well (#3414)
* Add march/mtune flags for clang as well
2021-10-18 00:26:14 +02:00
Martin Kroeker efbd7c7840
GCC did not support -mtune for ARM64 before 5.1 2021-07-23 13:42:52 +02:00
User User-User b7da75e4fd WiP CORTEX A55 support 2021-06-19 21:37:51 +02:00
Noan 32264ba496
Update Makefile.arm64
Added -march and -mtune flags for EMAG processors when using GCC 9 or later
2021-05-16 09:49:13 +00:00
Martin Kroeker 6726771645
Support compilation with NAG fortran 2021-03-13 20:16:18 +01:00
Martin Kroeker bff2b7c94d
Support compilation with NVIDIA HPC compilers (which do not take gcc-style arch options) 2021-01-12 16:34:18 +01:00
Martin Kroeker 17dca035de
rename SILICON to VORTEX 2020-09-03 08:38:08 +02:00
Martin Kroeker 4a4d1ca6e0
Add AppleSilicon cpu 2020-09-02 22:52:12 +02:00
Ashwin Sekhar T K 4e1be0e481 ARM64: Add THUNDERX3T110 Target 2020-07-26 23:32:24 -07:00
Ali Saidi c623a965f9 Add Neoverse-N1 core
The implementation is a hybrid of the ARMV8 one with some of the
improved TX2 routines along with specifying -march=v8.2-a
2020-02-29 03:22:04 +00:00
Martin Kroeker a4896b5538
Update DYNAMIC_ARCH support for ARM64 and PPC (#2332)
* Update DYNAMIC_ARCH list of ARM64 targets for gmake
* Update arm64 cpu list for runtime detection
* Update DYNAMIC_ARCH list of ARM64 targets for cmake and add POWERPC targets
2019-12-04 11:06:03 +01:00
maomao194313 53f482ee72
add TARGET support for HiSilicon tsv110 CPUs 2019-03-04 16:41:21 +08:00
Renato Golin 31a490ea88 Fix two mistakes on Arm64 builds
 * Falkor is an ARMv8.0 with ARMv8.1 features, and choosing armv8.1-a for
   march generates instructions it cannot cope with. Reverting it back
   to armv8-a.
 * ThunderX2's build was left with a #define VULCAN, which made it miss
   the right compiler flags in Makefile.arm64, although it did create
   the right library in the end.
2018-12-05 18:51:38 +00:00
Renato Golin 310ea55f29 Simplifying ARMv8 build parameters
ARMv8 builds were a bit mixed up, with ThunderX2 code in ARMv8 mode
(which is not right because TX2 is ARMv8.1) as well as requiring a few
redundancies in the defines, making it harder to maintain and understand
what core has what. A few other minor issues were also fixed.

Tests were made on the following cores: A53, A57, A72, Falkor, ThunderX,
ThunderX2, and XGene.

Tests were: OpenBLAS/test, OpenBLAS/benchmark, BLAS-Tester.

A summary:
 * Removed TX2 code from ARMv8 build, to make sure it is compatible with
   all ARMv8 cores, not just v8.1. Also, the TX2 code has actually
   harmed performance on big cores.
 * Commoned up ARMv8 architectures' defines in params.h, to make sure
   that all will benefit from ARMv8 settings, in addition to their own.
 * Adding a few more cores, using ARMv8's include strategy, to benefit
   from compiler optimisations using mtune. Also updated cache
   information from the manuals, making sure we set good conservative
   values by default. Removed Vulcan, as it's an alias to TX2.
 * Auto-detecting most of those cores, but also updating the forced
   compilation in getarch.c, to make sure the parameters are the same
   whether compiled natively or forced arch.

Benefits:
 * ARMv8 build is now guaranteed to work on all ARMv8 cores
 * Improved performance for ARMv8 builds on some cores (A72, Falkor,
   ThunderX1 and 2: up to 11%) over current develop
 * Improved performance for *all* cores compared to the develop branch
   before TX2's patch (9% ~ 36%)
 * ThunderX1 builds are 14% faster than ARMv8 on TX1, 9% faster than
   current develop's branch and 8% faster than develop before tx2 patches

Issues:
 * Regression from current develop branch for A53 (-12%) and A57 (-3%)
   with ARMv8 builds, but still faster than before TX2's commit (+15%
   and +24% respectively). This can be improved with a simplification of
   TX2's code, to be done in future patches. At least the code is
   guaranteed to be ARMv8.0 now.

Comments:
 * CortexA57 builds are unchanged on A57 hardware from develop's branch,
   which makes sense, as it's untouched.
 * CortexA72 builds improve over A57 on A72 hardware, even if they're
   using the same includes, due to new compiler tuning in the makefile.
2018-11-19 16:41:49 +00:00
Ashwin Sekhar T K ebf9e9dabe arm64: Change mtune/mcpu options for THUNDERX2T99 target 2017-07-01 11:17:10 -07:00
Ashwin Sekhar T K 4b55fae337 ARM64: Add Cavium THUNDERX2T99 Target 2017-01-11 11:18:40 +05:30
Andrew Pinski fb200c7245 ARM64: Add Cavium THUNDERX Target 2017-01-10 15:01:37 +05:30
Ashwin Sekhar T K 4713e7c47f ARM64: Add the VULCAN Target 2017-01-10 15:01:17 +05:30
Ashwin Sekhar T K f2f8a0fe8b Adding arm64 target CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:50 +05:30
wernsaar fe5f46c330 added experimental support for ARMV8 2013-11-24 15:47:00 +01:00