Commit Graph

36 Commits

Author SHA1 Message Date
Martin Kroeker f096a339e4
Use long value fields for cpu ident on OSX 2023-04-13 18:16:09 +02:00
Martin Kroeker caa2945138
Support Apple A15/M2 cpus through the existing VORTEX target 2023-04-11 00:04:09 +02:00
Martin Kroeker 9ecfa94744
Add part numbers for A715 and X3 aliased to A710/X2 2023-02-02 17:30:30 +01:00
Martin Kroeker 09b8545fc5
Add initial support for M1 on Linux, Phytium FT2xxx series, ARM Cortex 510/710/X1/X2 2022-03-27 15:24:40 +02:00
Sunita Nadampalli 19c8f615dc OpenBLAS: aarch64: Add neoverse-v1/n2 architecture specifics 2022-01-07 00:28:17 +00:00
Martin Kroeker 22bf5c27ba
Add basic support for the Fujitsu A64FX (#3415)
* Add initial support for Fujitsu A64FX as generic ARMV8
2021-10-18 15:00:19 +02:00
Martin Kroeker d7351deccf
Fix cache reporting for Apple M1 2021-10-04 17:58:29 +02:00
Martin Kroeker 1cce778585
Fix detection of Apple M1 "Vortex" 2021-10-04 16:46:41 +02:00
Martin Kroeker f0b822a709
Update cpuid_arm64.c 2021-06-23 10:11:01 +02:00
User User-User 130327e9af OK 2021-06-22 23:58:59 +02:00
User User-User b7da75e4fd WiP CORTEX A55 support 2021-06-19 21:37:51 +02:00
Martin Kroeker 2e7ee7c716
Fix naming of L2 cache size item reported for Vortex 2020-10-18 19:22:05 +02:00
Martin Kroeker be40440ec5
Change ifdef linux to __linux for C11 compatibility 2020-09-30 22:45:18 +02:00
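A minimal sketch of the kind of guard this commit title refers to. In strict ISO modes such as `-std=c11`, GCC does not predefine the plain `linux` macro because it sits in the user's namespace, while `__linux` and `__linux__` remain defined. The helper name `built_for_linux` is hypothetical and not taken from the OpenBLAS sources.

```c
/* Returns 1 when the translation unit is compiled for a Linux target,
 * 0 otherwise. The underscore-prefixed macros stay defined under
 * -std=c11, whereas the plain `linux` macro does not. */
static int built_for_linux(void)
{
#if defined(__linux) || defined(__linux__)
    return 1;
#else
    return 0;
#endif
}
```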
Martin Kroeker af5bc95503
Rename SILICON to VORTEX and fix duplicate numbering 2020-09-03 08:43:26 +02:00
Martin Kroeker 029fd01cfb
Detect AppleSilicon cpu on OSX 2020-09-02 22:47:38 +02:00
Martin Kroeker 3210a42734
Report cpu as ARMV8 instead of just giving up on non-Linux hosts 2020-08-31 20:03:21 +02:00
Ashwin Sekhar T K 4e1be0e481 ARM64: Add THUNDERX3T110 Target 2020-07-26 23:32:24 -07:00
Ali Saidi c623a965f9 Add Neoverse-N1 core
The implementation is a hybrid of the ARMV8 one with some of the
improved TX2 routines along with specifying -march=v8.2-a
2020-02-29 03:22:04 +00:00
Martin Kroeker e8d82c01d4
Recognize Ampere EMAG8180 2020-02-19 18:49:13 +01:00
Martin Kroeker 6b83079368
Count cpu cores on ARMV8 and use that to pick the GEMM_PQ parameters (#2267)
There is currently no simple way to query cache sizes on ARMV8, so this takes the number of cores as a rough indication of whether the target is a server-class device with a big cache, or just a single-board toy or smartphone.
2019-09-25 23:13:24 +02:00
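The heuristic described in this commit message could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: the struct `gemm_pq`, both function names, the threshold of 16 cores, and the block sizes are all hypothetical. Only `sysconf(_SC_NPROCESSORS_ONLN)` is a real call, available on Linux and most Unix-like systems.

```c
#include <unistd.h>

/* Hypothetical pair of blocking sizes; the real GEMM_P/GEMM_Q values
 * used by OpenBLAS are not reproduced here. */
struct gemm_pq {
    int p;
    int q;
};

/* Count the online cores; fall back to 1 if the query fails. */
static int count_online_cores(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return (n > 0) ? (int)n : 1;
}

/* Treat a high core count as a hint of a server-class part with a large
 * cache and choose bigger blocks. Threshold and sizes are illustrative. */
static struct gemm_pq pick_gemm_pq(int cores)
{
    struct gemm_pq small_cache = { 128, 256 };
    struct gemm_pq big_cache   = { 512, 512 };
    return (cores >= 16) ? big_cache : small_cache;
}
```

A caller would combine the two as `pick_gemm_pq(count_online_cores())`. The core count is only a proxy for cache size, so a real implementation would likely need more than one threshold.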
Martin Kroeker b7bbb02447
Silence two nuisance warnings from gcc 2019-08-11 12:46:05 +02:00
maomao194313 f074d7d146
make DYNAMIC_ARCH=1 package work on TSV110. 2019-03-12 16:05:19 +08:00
Renato Golin 31a490ea88 Fix two mistakes on Arm64 builds
 * Falkor is an ARMv8.0 with ARMv8.1 features, and choosing armv8.1-a for
   march generates instructions it cannot cope with. Reverting it back
   to armv8-a.
 * ThunderX2's build was left with a #define VULCAN, which made it miss
   the right compiler flags in Makefile.arm64, although it did create
   the right library in the end.
2018-12-05 18:51:38 +00:00
Renato Golin 310ea55f29 Simplifying ARMv8 build parameters
ARMv8 builds were a bit mixed up, with ThunderX2 code in ARMv8 mode
(which is not right because TX2 is ARMv8.1) as well as requiring a few
redundancies in the defines, making it harder to maintain and understand
what core has what. A few other minor issues were also fixed.

Tests were made on the following cores: A53, A57, A72, Falkor, ThunderX,
ThunderX2, and XGene.

Tests were: OpenBLAS/test, OpenBLAS/benchmark, BLAS-Tester.

A summary:
 * Removed TX2 code from ARMv8 build, to make sure it is compatible with
   all ARMv8 cores, not just v8.1. Also, the TX2 code has actually
   harmed performance on big cores.
 * Commoned up ARMv8 architectures' defines in params.h, to make sure
   that all will benefit from ARMv8 settings, in addition to their own.
 * Adding a few more cores, using ARMv8's include strategy, to benefit
   from compiler optimisations using mtune. Also updated cache
   information from the manuals, making sure we set good conservative
   values by default. Removed Vulcan, as it's an alias to TX2.
 * Auto-detecting most of those cores, but also updating the forced
   compilation in getarch.c, to make sure the parameters are the same
   whether compiled natively or forced arch.

Benefits:
 * ARMv8 build is now guaranteed to work on all ARMv8 cores
 * Improved performance for ARMv8 builds on some cores (A72, Falkor,
   ThunderX1 and 2: up to 11%) over current develop
 * Improved performance for *all* cores compared to the develop branch
   before TX2's patch (9% ~ 36%)
 * ThunderX1 builds are 14% faster than ARMv8 on TX1, 9% faster than
   current develop's branch and 8% faster than develop before tx2 patches

Issues:
 * Regression from current develop branch for A53 (-12%) and A57 (-3%)
   with ARMv8 builds, but still faster than before TX2's commit (+15%
   and +24% respectively). This can be improved with a simplification of
   TX2's code, to be done in future patches. At least the code is
   guaranteed to be ARMv8.0 now.

Comments:
 * CortexA57 builds are unchanged on A57 hardware from develop's branch,
   which makes sense, as it's untouched.
 * CortexA72 builds improve over A57 on A72 hardware, even if they're
   using the same includes, due to new compiler tuning in the makefile.
2018-11-19 16:41:49 +00:00
Renato Golin fb5b2177ca [Arm64] Revert A53 detection as A57
This patch reverts the decision of treating A53 like A57, which was
based on an analysis done on server class hardware and is not
representative of all A53s out there.

Fixes #1855.
2018-11-05 11:34:49 +00:00
Ashwin Sekhar T K af2837c392 ARM64: Remove #define ARMV8 for THUNDERX 2018-10-22 01:49:35 -07:00
Ashwin Sekhar T K 68a3c4fca6 ARM64: Enable Auto Detection of ThunderX2T99 2018-04-19 09:05:25 +00:00
Martin Kroeker 0ae5e14923
Detect CORTEX A53 and A72 as CORTEXA57 2018-02-06 11:38:18 +01:00
Dan Horák 1763e01567 fix detection of generic ARMv8 CPUs 2017-08-18 14:53:29 +02:00
Ashwin Sekhar T K 4b55fae337 ARM64: Add Cavium THUNDERX2T99 Target 2017-01-11 11:18:40 +05:30
Ashwin Sekhar T K 738d622feb ARM64: Fix auto detect of ARM64 cpus 2017-01-11 11:18:40 +05:30
Andrew Pinski fb200c7245 ARM64: Add Cavium THUNDERX Target 2017-01-10 15:01:37 +05:30
Ashwin Sekhar T K 4713e7c47f ARM64: Add the VULCAN Target 2017-01-10 15:01:17 +05:30
Zhang Xianyi 38593cd3a3 Fix compiling bug on ARM Cortex-A57. 2016-02-13 15:38:52 +00:00
Ashwin Sekhar T K f2f8a0fe8b Adding arm64 target CORTEXA57
Co-Authored-By: Ralph Campbell <ralph.campbell@broadcom.com>
2015-11-09 14:15:50 +05:30
Benedikt Huber 58c90d5937 Optimizations for APM's xgene-1 (aarch64).

1) general system updates to support armv8 better.  "make all" did not work; one needed to supply TARGET=ARMV8.
2) sgemm 4x4 kernel in assembler using SIMD, and configuration changes to use it.
3) strmm 4x4 kernel in C.  Since the sgemm kernel does 4x4, the trmm kernel must also do 4xN.

Added Dave Nuechterlein to the contributors list.
2014-11-11 22:19:23 +08:00