Commit Graph

7518 Commits

Author SHA1 Message Date
Chip-Kerchner
d8e6e2b04d Merge branch 'develop' into dynamicDispatchAIXandClang 2023-11-01 14:22:06 -05:00
Martin Kroeker
0de786cfa6 Merge pull request #4278 from martin-frbg/issue4277
CirrusCI: Add FreeBSD clang/gfortran build with OpenMP
2023-11-01 19:45:09 +01:00
Martin Kroeker
9f7c35a4a8 Merge pull request #4279 from martin-frbg/issue4269
Increase the default GEMM buffer size on modern ARM server cpus
2023-10-31 15:41:25 +01:00
Martin Kroeker
728788f667 typo fix 2023-10-31 11:08:22 +01:00
Martin Kroeker
d003ad630b Increase the default GEMM buffer size on modern ARM server cpus 2023-10-31 10:26:38 +01:00
Martin Kroeker
dc1c880782 fix libgfortran path on bsd 2023-10-28 23:14:36 +02:00
Martin Kroeker
289a5f6d9b work around libgfortran install issue on FreeBSD 2023-10-28 18:44:58 +02:00
Martin Kroeker
1cec1c0fc7 Add FreeBSD clang/gfortran build with OpenMP 2023-10-28 14:43:19 +02:00
Martin Kroeker
9d425a5fe7 Merge pull request #4276 from martin-frbg/issue4275
Clarify make/make install in the README and update the TARGET list there
2023-10-28 14:34:17 +02:00
Martin Kroeker
f5e1f20f4d Update target list 2023-10-27 17:10:37 +02:00
Martin Kroeker
a7f73c764c Clarify "make" options and the need to repeat them in the install step 2023-10-27 16:48:47 +02:00
Chip Kerchner
badfb2e60f Merge branch 'develop' into XLC-AIX 2023-10-26 09:19:31 -05:00
Martin Kroeker
96f8bb1eb9 Merge pull request #4272 from RajalakshmiSR/AIX_AS
POWER: AIX: Make use of power10 optimization
2023-10-24 12:08:51 +02:00
Rajalakshmi Srinivasaraghavan
980f702f72 POWER: AIX: Make use of power10 optimization
POWER10 optimizations are disabled when using default AIX assembler.
As we have fixed many issues recently, enabling optimization path
for default assembler.
2023-10-19 18:48:19 -05:00
Martin Kroeker
68906a98c7 Merge pull request #4271 from rgommers/homebrew-nightly-on-main-repo
Run nightly Homebrew cron job only on the main repo, not on forks
2023-10-19 13:28:24 +02:00
Ralf Gommers
6b8379d6d9 Run nightly Homebrew cron job only on the main repo, not on forks
I noticed this because GitHub emailed me that it would disable the
nightly job because it hadn't changed for 3 months. It currently takes
30-50 minutes daily, and by default runs on all forks of the main
repository that have the relevant workflow yaml file. That serves little
purpose and wastes quite a bit of energy - so disable the runs outside
of the main repo.

This will not disable the runs on forks already made in the past that
contain this workflow file, but it does save 3 months worth of runs on
every new fork that is created.

[skip ci]
2023-10-19 11:38:26 +02:00
Martin Kroeker
0799b0d215 Merge pull request #4266 from martin-frbg/gh-mingw-ucrt
GH Workflows: Switch MINGW-W64 jobs to UCRT
2023-10-18 18:58:32 +02:00
Martin Kroeker
5c411ac7a8 Merge pull request #4268 from martin-frbg/issue4267
Fix unwanted "hard" fallback to Prescott in runtime detection of Intel cpus
2023-10-18 17:47:33 +02:00
Martin Kroeker
e12aaed13d Fix unwanted fallthrough from Intel Family 6 to 15 in case of identification failure 2023-10-18 16:28:54 +02:00
Martin Kroeker
f8c230c21c Switch MINGW-W64 jobs to UCRT 2023-10-18 11:58:54 +02:00
Martin Kroeker
c28d71c6fb Merge pull request #4265 from martin-frbg/issue4228
Fix compilation with (the fortran compiler from) Cray CCE
2023-10-17 15:08:30 +02:00
Martin Kroeker
b41cab0875 Need to use override to actually strip down the already defined FFLAGS for NAG and CCE Fortran 2023-10-16 22:20:59 +02:00
Martin Kroeker
301e2ecc49 Cray Fortran uses -O in combinations like -O omp so don't filter that out 2023-10-16 22:15:46 +02:00
Martin Kroeker
66c2c41e99 Merge pull request #4260 from RajalakshmiSR/AIX-M4
POWER: Increase macro size limit for AIX
2023-10-13 10:51:23 +02:00
Martin Kroeker
425bcc1f8b Merge pull request #4256 from ChipKerchner/fixBfloat16BitsStruct
Fix bfloat16_bits union so that it always the sizeof unsigned short for AIX.
2023-10-12 22:01:50 +02:00
Martin Kroeker
789cdcc94f Merge pull request #4259 from martin-frbg/azureosxclang
AzureCI: move OSX-Clang jobs to macOS-12 to resolve setup/build timeouts
2023-10-12 20:04:28 +02:00
Rajalakshmi Srinivasaraghavan
9f42570e33 POWER: Increase macro size limit for AIX
This patch increases the macro size limit from 4096 to 16384 to
allow compiling larger assembly files in AIX.
Tested with GCC and IBM Open XL C.
2023-10-12 12:37:40 -05:00
Martin Kroeker
9f49aef91b Merge pull request #4255 from RajalakshmiSR/AIX-P10
POWER10: Fix compilation issues with Open XL C
2023-10-12 18:59:17 +02:00
Martin Kroeker
fe75c88a2c AzureCI: move OSX-Clang jobs to macOS-12 to resolve setup/build timeouts 2023-10-12 18:20:09 +02:00
Chip-Kerchner
d46eba06a7 Pack structure only on AIX. 2023-10-12 09:41:33 -05:00
Martin Kroeker
90231bfc4e Merge pull request #4258 from martin-frbg/issue4257
Fix build on Fujitsu A64FX
2023-10-12 16:38:28 +02:00
Martin Kroeker
e7d05402e0 Fix up S/D GEMM copy function definitions after #4009 2023-10-12 14:24:53 +02:00
Chip-Kerchner
e98e3c4783 Fix float32_bits union so that it always the sizeof float. 2023-10-11 18:05:55 -05:00
Chip-Kerchner
97a61d0577 Fix bfloat16_bits union so that it always the sizeof unsigned short. 2023-10-11 17:36:43 -05:00
Rajalakshmi Srinivasaraghavan
71d733e5f7 POWER: Avoid m4 conversions for C files
This patch removes intermediate m4 conversions used in sbgemm
compilation as it is not needed for .c files.
Tested on AIX with gcc and IBM Open XL C.
2023-10-11 17:18:42 -05:00
Rajalakshmi Srinivasaraghavan
82fc29a57a POWER10: Fallback to POWER8 functions
As cgemm and zgemm kernels are not optimized for big endian falling
back to POWER8 versions.  Tested on AIX using gcc and Open XL C.
2023-10-11 17:04:42 -05:00
Martin Kroeker
bf3183d31d Merge pull request #4253 from martin-frbg/issue4239-2
Require "classic ld" with XCODE 15.x on Mac
2023-10-10 18:44:08 +02:00
Martin Kroeker
103d6f4e42 Require "classic ld" with XCODE 15.x on Mac 2023-10-10 16:15:52 +02:00
Martin Kroeker
4a0f86397b Merge pull request #4235 from angsch/develop
Fix division by zero in [z]rotg
2023-10-09 08:43:42 +02:00
Martin Kroeker
617294b9e4 Merge pull request #4251 from martin-frbg/4142-2
Correct function prototypes in f2c-converted sources (lapack-netlib and ctest)
2023-10-08 18:11:12 +02:00
Martin Kroeker
c5e7339c9e correct prototypes for INTERFACE64 builds 2023-10-08 16:13:37 +02:00
Martin Kroeker
d8126c76e7 fix prototype 2023-10-08 13:38:39 +02:00
Martin Kroeker
769a58e9d1 fix prototypes of stest and itest1 for INTERFACE64 2023-10-08 12:51:41 +02:00
Martin Kroeker
c30b530878 fix prototypes of ctest and itest for INTERFACE64 2023-10-08 11:59:19 +02:00
Martin Kroeker
2b865da730 fix prototypes of stest and ctest for INTERFACE64 2023-10-08 11:55:10 +02:00
Martin Kroeker
65bfe1a06d fix prototype of itest1 for INTERFACE64 2023-10-08 11:36:06 +02:00
Martin Kroeker
1806cfecbc fix function prototypes in f2c-converted files 2023-10-07 22:38:30 +02:00
Martin Kroeker
281f1e4432 fix function prototypes in f2c-converted files 2023-10-07 22:36:29 +02:00
Martin Kroeker
4041b7fb42 fix function prototypes in f2c-converted files 2023-10-07 22:33:08 +02:00
Martin Kroeker
b626544ca3 complete function prototypes and remove unused functions 2023-10-07 22:31:03 +02:00