Commit Graph

5235 Commits

Author SHA1 Message Date
Martin Kroeker ccb9731c7b
Fix propagation of cpu properties to compiler options 2020-11-07 20:30:15 +01:00
Martin Kroeker a29338aaa6
Remove extraneous quotes that caused a cmake policy warning 2020-11-07 20:27:42 +01:00
Martin Kroeker 438a8e5624
Fix placement of getarch call and spurious cpu property accumulation in DYNAMIC_ARCH builds 2020-11-07 20:26:12 +01:00
Martin Kroeker e5967810b7
Merge pull request #110 from xianyi/develop
rebase
2020-11-07 20:22:41 +01:00
Martin Kroeker ff74319ea5
Merge pull request #2977 from martin-frbg/issue2976
Fix macro name used in ifdef for POWERPC/PGI
2020-11-07 14:41:34 +01:00
Martin Kroeker 28d2dfe2b3
Fix macro name used in ifdef 2020-11-07 12:17:49 +01:00
Martin Kroeker 60ab9c783f
Merge pull request #2966 from martin-frbg/issue2964
Ensure that EXPRECISION is disabled for DYNAMIC_ARCH with TARGET=GENERIC and fix CMAKE DYNAMIC_ARCH builds
2020-11-04 16:02:46 +01:00
Martin Kroeker 8cc73fee98
Export NO_EXPRECISION after overriding for DYNAMIC_ARCH with GENERIC target 2020-11-03 23:47:04 +01:00
Martin Kroeker 0155cd53a3
Add -msse3 where needed for DYNAMIC_ARCH builds 2020-11-03 23:45:49 +01:00
Martin Kroeker a9f9354296
Fix target test 2020-11-02 23:17:46 +01:00
Martin Kroeker b9bc76aec4
Add files via upload 2020-11-02 22:43:50 +01:00
Martin Kroeker f071245939
Merge pull request #2967 from RajalakshmiSR/dgemm88
POWER10: Change dgemm unroll factors
2020-11-02 18:54:36 +01:00
Martin Kroeker e5f8c2bf8a
typo fix 2020-11-01 22:25:43 +01:00
Martin Kroeker 6baf8af658
Disable EXPRECISION for the combination of DYNAMIC_CORE and GENERIC target 2020-11-01 22:11:48 +01:00
Martin Kroeker 40a93c232b
Disable EXPRECISION for DYNAMIC_ARCH in combination with TARGET=GENERIC
NO_EXPRECISION is disabled for the GENERIC_TARGET already, so prevent mixing with code parts that use a different float size by default
2020-11-01 21:58:26 +01:00
Martin Kroeker fab952bee4
Merge pull request #2962 from brada4/develop
add openbsd 68+ gfortran name
2020-11-01 14:24:40 +01:00
Martin Kroeker 1cf04a6f0e
Merge pull request #2963 from martin-frbg/issue2959
Reunify default BUFFER_SIZE on ARM64 to avoid crashes in DYNAMIC_ARCH mode
2020-11-01 09:14:54 +01:00
Rajalakshmi Srinivasaraghavan dd7a9cc5bf
POWER10: Change dgemm unroll factors
Changing the unroll factors for dgemm to 8 shows improved performance with
POWER10 MMA feature. Also made some minor changes in sgemm for edge cases.
2020-10-31 18:28:57 -05:00
Martin Kroeker 7f26be4802
Reunify BUFFERSIZE across arm64 platforms to avoid segfaults in DYNAMIC_ARCH 2020-11-01 00:00:43 +01:00
User User-User 9fab65e90a
add openbsd gfortran 2020-11-01 00:38:08 +02:00
Martin Kroeker 9efc3f0815
Merge pull request #109 from xianyi/develop
rebase
2020-10-31 22:33:52 +01:00
Martin Kroeker aa21cb5217
Merge pull request #2960 from thrasibule/avx2_detection
fix avx2 detection
2020-10-31 20:24:21 +01:00
Guillaume Horel 1f564d729b
fix avx2 detection
reword commits to make it clearer
2020-10-31 10:00:48 -04:00
Martin Kroeker 9349dcd206
Merge pull request #2956 from RajalakshmiSR/caxpy_p10
Optimize caxpy for POWER10
2020-10-30 08:54:10 +01:00
Rajalakshmi Srinivasaraghavan b435491885
Optimize caxpy for POWER10
This patch makes use of new POWER10 vector pair instructions for
loads and stores.
2020-10-29 14:57:51 -05:00
Martin Kroeker 9a058f2451
Merge pull request #2940 from Qiyu8/optimize-benchmark
Refactor the performance measurement system
2020-10-29 20:28:37 +01:00
Martin Kroeker 074927a7d0
Merge pull request #2954 from Guobing-Chen/BF16_gemv_support
Implementation of BF16 based gemv
2020-10-29 09:22:33 +01:00
Martin Kroeker 60b22e3462
Merge pull request #2955 from Guobing-Chen/Fix_cooperlake_build_issue
Fix cooperlake compile issue
2020-10-29 09:22:07 +01:00
Chen, Guobing c5e62dad69
Fix cooperlake compile issue
Add a missing macro which is required in Makefile.x86_64 due to recent
cleanup, which causes cooperlake platform build failure.
2020-10-29 03:37:59 +08:00
Chen, Guobing a7b1f9b1bb
Implementation of BF16 based gemv
1. Add a new API -- sbgemv to support bfloat16 based gemv
2. Implement a generic kernel for sbgemv
3. Implement an avx512-bf16 based kernel for sbgemv

Signed-off-by: Chen, Guobing <guobing.chen@intel.com>
2020-10-29 02:08:23 +08:00
Martin Kroeker 67f39ad813
Merge pull request #2939 from thrasibule/Makefile_cleanup
reuse variables defined in Makefile.system
2020-10-28 09:38:40 +01:00
Martin Kroeker 6e13a7e99e
Merge pull request #2951 from martin-frbg/cleanup_make
Minor Makefile cleanup
2020-10-28 09:37:56 +01:00
Martin Kroeker 2207a16235
Merge pull request #2952 from martin-frbg/issue2931
Try to read cpu ID from /sys/devices/.../cpu0 if HWCAP_CPUID fails
2020-10-28 09:37:32 +01:00
Martin Kroeker 5d643929dd
Merge pull request #2948 from martin-frbg/issue2947
Expressly enable neon for use with intrinsics if available
2020-10-28 09:37:09 +01:00
Martin Kroeker e8cbf0fc50
Output predefined HAVE_ entries to Makefile.conf for ARM with specified TARGET 2020-10-27 23:01:19 +01:00
Martin Kroeker b937d78a6d
Try to read cpu information from /sys/devices/system/cpu/cpu0 if HWCAP_CPUID fails 2020-10-27 17:51:32 +01:00
Martin Kroeker e2f9005db8
Merge pull request #2950 from RajalakshmiSR/saxpy
Optimize saxpy for POWER10
2020-10-27 00:02:18 +01:00
Martin Kroeker 6a1f3e40af
Remove debug printout of object list 2020-10-26 21:37:04 +01:00
Martin Kroeker 878b6d1f41
Remove spurious expr in flang version check 2020-10-26 21:35:40 +01:00
Rajalakshmi Srinivasaraghavan c24ba8b1dd
Optimize saxpy for POWER10
This patch makes use of new POWER10 vector pair instructions for
loads and stores.
2020-10-26 13:24:59 -05:00
Qiyu8 f917c26e83
Refactoring remaining benchmark cases. 2020-10-26 10:25:05 +08:00
Martin Kroeker 76203e2120
Merge pull request #2946 from martin-frbg/issue2945
Move definitions that are neither needed nor supported on Solaris
2020-10-26 00:43:44 +01:00
Martin Kroeker eec517af0e
Expressly enable neon for use with intrinsics if available 2020-10-26 00:21:56 +01:00
Martin Kroeker fd7da56965
Move definitions that are neither needed nor supported on SUNOS 2020-10-25 12:01:50 +01:00
Martin Kroeker 2f9fc9be30
Update version to 0.3.12.dev 2020-10-24 23:29:05 +02:00
Martin Kroeker 81fcfd5ed3
Update version to 0.3.12.dev 2020-10-24 23:28:29 +02:00
Martin Kroeker addf7593ae
Merge pull request #2944 from xianyi/release-0.3.0
Merge back 0.3.12 tag (and Changelog typo fixes) from release
2020-10-24 13:10:51 +02:00
Martin Kroeker c5f280a7f0
Fix typos 2020-10-24 13:03:28 +02:00
Martin Kroeker 6e3a05f2c9
Merge pull request #2943 from xianyi/develop
Merge from develop for 0.3.12 release
2020-10-24 12:52:59 +02:00
Martin Kroeker 89db73569b
Update Changelog with 0.3.12 changes 2020-10-24 12:50:04 +02:00