Commit Graph

7433 Commits

Author SHA1 Message Date
Martin Kroeker 8d0f7f0176
Revert accidental change of generic ARMV8 DGEMM parameters from #3425 2022-03-27 13:10:47 +02:00
Martin Kroeker 153f8bc8da
Merge pull request #3584 from martin-frbg/ctestskip
Add a (CMAKE) option to skip the LAPACK testsuite and use it in Azure CI
2022-03-25 14:35:15 +01:00
Markus Mützel aeb561d234 Add support for Intel Fortran compilers.
Port changes from upstream Reference-LAPACK.
2022-03-25 13:37:15 +01:00
Martin Kroeker 6c3842a891
Disable the LAPACK testsuite for the Windows clang/flang build as it takes too long 2022-03-24 21:25:16 +01:00
Martin Kroeker 4199ca728e
Add LAPACK-like option to omit the LAPACK testsuite 2022-03-24 21:23:28 +01:00
Larson, Eric 8fe3555792 ILP support
longs on Windows are 4 bytes (MSVS, Intel compilers). Use int64_t and int32_t
to ensure 8-byte integers for the ILP interface.

Support the 8-byte integer flag for the Intel ifort compiler
2022-03-24 19:09:23 +01:00
Aisha Tammy 3efbf968f1 create INDEX64 target 2022-03-24 19:09:23 +01:00
Martin Kroeker 34ecd967a5
Merge pull request #3580 from martin-frbg/dynx86_sbgemm
Remove extraneous (and wrong) definition of sbgemm_r on x86_64
2022-03-24 11:33:00 +01:00
Martin Kroeker 2519c9d93f
Merge pull request #3579 from martin-frbg/issue3557-2
Fix malfunctioning AVX512 check
2022-03-24 08:28:37 +01:00
Martin Kroeker 40302558ed
Remove extraneous (and wrong) definition of sbgemm_r on x86_64 2022-03-23 20:05:32 +01:00
Martin Kroeker b79b99d695
Merge branch 'xianyi:develop' into issue3557-2 2022-03-23 19:13:54 +01:00
Martin Kroeker c87a4dbf35
Fix checks for AVX512 and atomics 2022-03-23 15:48:58 +01:00
Martin Kroeker 93a81856ae
Revert AVX512 capability check from PR #1980 (moved to build) 2022-03-23 15:22:13 +01:00
Martin Kroeker 9fbeb88fb8
Utilize compiler AVX512 capability info from c_check when building getarch 2022-03-23 15:19:55 +01:00
Martin Kroeker 4cb302a596
Merge pull request #3561 from AlessioZanga/patch-msvc
Remove MSVC limitation
2022-03-23 11:28:13 +01:00
Martin Kroeker f67977a323
Merge pull request #3576 from martin-frbg/cmaketestbom
Skip BLAS tests if Windows powershell added a BOM
2022-03-23 07:19:15 +01:00
Martin Kroeker 0ee2d15fdb
Merge pull request #3577 from martin-frbg/azure_win2022
Update Windows jobs in Azure CI to use Windows2022
2022-03-23 07:18:45 +01:00
Martin Kroeker a0e86adf93
Update Windows jobs in Azure CI to use Windows2022 2022-03-22 21:51:09 +01:00
Martin Kroeker 2408315d10
Skip tests if Windows powershell added a BOM 2022-03-22 21:37:55 +01:00
Martin Kroeker 694f6c5c8d
Merge pull request #3574 from AdamNiederer/fix-dynamic-list-compilation
Fix broken elif in dynamic.c
2022-03-19 09:21:56 +01:00
Adam Niederer 69f2ac4ea2 Fix broken elif in dynamic.c
This fixes compilation in the following case:

$(MAKE) USE_OPENMP=1 USE_THREAD=1 NO_LAPACK=0 DYNAMIC_ARCH=1 \
DYNAMIC_LIST="HASWELL SKYLAKEX ATOM COOPERLAKE SAPPHIRERAPIDS ZEN"
2022-03-17 20:04:37 -04:00
Martin Kroeker 501bf31e3e
Merge pull request #3567 from cenewcombe/develop
Fix unsafe read of Y in zsymv_L_sse2.S
2022-03-12 13:40:17 +01:00
Caroline Newcombe 5cc1111383 fix unsafe read of Y in assembly kernel 2022-03-11 11:56:33 -06:00
Martin Kroeker 8d5a9c2f98
Merge pull request #3565 from jonaszhou1/develop
Support Zhaoxin/Centaur kh40000 as ZEN
2022-03-11 14:29:30 +01:00
Martin Kroeker 9dcd8aeb7a
Merge pull request #3566 from martin-frbg/configtls
Report USE_TLS in get_config output if set
2022-03-11 14:27:27 +01:00
Martin Kroeker bf4642eb7e
Report USE_TLS if set 2022-03-10 16:19:29 +01:00
JonasZhou 2d0ad89b0d Support Zhaoxin/Centaur kh40000 as ZEN
Signed-off-by: JonasZhou <JonasZhou@zhaoxin.com>
2022-03-10 15:08:38 +08:00
AlessioZanga ed2871cb71
Change `BUILD_WITHOUT_LAPACK` to `OFF` by default 2022-03-05 23:35:29 +01:00
Alessio Zanga ed8c028f7f
Remove MSVC limitation 2022-03-05 14:06:21 +01:00
Martin Kroeker 1ef97c470c
Merge pull request #3550 from guowangy/smatrix-mask-fix
Small Matrix: use proper inline asm input constraint for AVX512 mask
2022-02-28 08:28:02 +01:00
Xianyi Zhang 45786b05da Merge branch 'develop' into risc-v 2022-02-28 11:48:02 +08:00
Wangyang Guo 225683218c Small Matrix: use proper inline asm input constraint for AVX512 mask 2022-02-28 03:22:31 +00:00
Martin Kroeker 10b0428b2c
Merge pull request #3549 from martin-frbg/issue3543
Annotate LAPACKE_lsame with attribute const for GCC(+compatible)
2022-02-26 21:49:05 +01:00
Martin Kroeker 9c3e0bf319
Merge pull request #3548 from martin-frbg/rela-gemmt
Enable the ?GEMMT functions in ReLAPACK
2022-02-26 21:48:39 +01:00
Martin Kroeker 1c1ffb0591
Annotate LAPACKE_lsame with the const attribute for GCC and compatible compilers 2022-02-26 19:27:34 +01:00
Martin Kroeker 4058f32492
Fix xGEMMT argument lists 2022-02-26 19:24:27 +01:00
Martin Kroeker 35d5105922
Enable xGEMMT functions 2022-02-26 19:23:40 +01:00
Martin Kroeker ab304cca69
Merge pull request #3547 from martin-frbg/issue3540-2
More build fixes for CooperLake with BFLOAT16 and DYNAMIC_ARCH
2022-02-25 21:54:11 +01:00
Martin Kroeker 9c626e466e
really fix definition of SHUFFLE_MAGIC_NO 2022-02-25 15:36:02 +01:00
Martin Kroeker 0698212c8c
Remove stray $ 2022-02-25 15:33:02 +01:00
Martin Kroeker 9d7429406f
Declare SHUFFLE_MAGIC_NO as const to placate clang 2022-02-25 10:05:36 +01:00
Martin Kroeker d9894f45d3
Define sbgemm_r to fix DYNAMIC_ARCH builds 2022-02-25 10:04:00 +01:00
Martin Kroeker 522f809825
Merge pull request #3542 from martin-frbg/issue3540
Fix compilation for CooperLake on Windows/clang
2022-02-24 00:00:00 +01:00
Martin Kroeker d50287fa5b
Merge pull request #3544 from giordano/mg/gcc6
Fix compilation of Skylake AVX512 kernels with GCC 6
2022-02-23 23:57:57 +01:00
Mosè Giordano abbc947edb Fix compilation of Skylake AVX512 kernels with GCC 6 2022-02-23 22:51:59 +00:00
Martin Kroeker f2f0e1287b
Merge pull request #3541 from martin-frbg/issue3530
Fix compilation for SkylakeX with gcc 6.x
2022-02-23 23:13:53 +01:00
Martin Kroeker c62f8e2c01
Prevent compiler attempts to use k0 as mask register 2022-02-23 20:12:20 +01:00
Martin Kroeker 80eb581c83
Fix non-portable u_int64_t 2022-02-23 20:10:59 +01:00
Martin Kroeker 73ffabe6ba
Guard uses of _mm512_reduce_add_p? 2022-02-23 20:06:14 +01:00
Martin Kroeker 5ad66f0e96
Merge pull request #3537 from xianyi/release-0.3.0
Merge back from 0.3.20 release to copy tag
2022-02-21 06:57:27 +01:00