OpenBLAS

Author	SHA1	Message	Date
Martin Kroeker	3d8054fb16	add clobber list	2024-06-14 22:07:44 +02:00
Chris Sidebottom	60e66725e4	Use numeric labels to allow repeated inlining	2023-12-19 13:11:06 +00:00
Chris Sidebottom	7a4fef4f60	Tweak SVE dot kernel. This changes the SVE dot kernel to only predicate when necessary as well as streamlining the assembly a bit. The benchmarks seem to indicate this can improve performance by ~33%.	2023-12-19 12:08:54 +00:00
Martin Kroeker	3d31191b0f	Work around Clang failing to disambiguate SVE intrinsics and add AppleClang crossbuild to MacOS/arm64 DYNAMIC_ARCH in AzureCI (#4140): Add AppleClang crossbuild to MacOS/arm64 DYNAMIC_ARCH; add casts to disambiguate svwhilelt for clang	2023-07-14 11:06:48 +02:00
Chris Sidebottom	fd4f52c797	Add SVE implementation for sdot/ddot. This adds an SVE implementation to sdot/ddot when available, falling back to the previous Advanced SIMD kernel where there's no SVE implementation for the kernel. All the targets were essentially treating `dot_thunderx2t99.c` as the Advanced SIMD implementation, so I've renamed it to better fit with the feature detection.	2022-12-01 12:07:50 +00:00