7452 Commits

Author SHA1 Message Date
Wangyang Guo 225683218c Small Matrix: use proper inline asm input constraint for AVX512 mask 2022-02-28 03:22:31 +00:00
Martin Kroeker 10b0428b2c Merge pull request #3549 from martin-frbg/issue3543: Annotate LAPACKE_lsame with attribute const for GCC(+compatible) 2022-02-26 21:49:05 +01:00
Martin Kroeker 9c3e0bf319 Merge pull request #3548 from martin-frbg/rela-gemmt: Enable the ?GEMMT functions in ReLAPACK 2022-02-26 21:48:39 +01:00
Martin Kroeker 1c1ffb0591 Annotate LAPACKE_lsame with the const attribute for GCC and compatible compilers 2022-02-26 19:27:34 +01:00
Martin Kroeker 4058f32492 Fix xGEMMT argument lists 2022-02-26 19:24:27 +01:00
Martin Kroeker 35d5105922 Enable xGEMMT functions 2022-02-26 19:23:40 +01:00
Martin Kroeker ab304cca69 Merge pull request #3547 from martin-frbg/issue3540-2: More build fixes for CooperLake with BFLOAT16 and DYNAMIC_ARCH 2022-02-25 21:54:11 +01:00
Martin Kroeker 9c626e466e really fix definition of SHUFFLE_MAGIC_NO 2022-02-25 15:36:02 +01:00
Martin Kroeker 0698212c8c Remove stray $ 2022-02-25 15:33:02 +01:00
Martin Kroeker 9d7429406f Declare SHUFFLE_MAGIC_NO as const to placate clang 2022-02-25 10:05:36 +01:00
Martin Kroeker d9894f45d3 Define sbgemm_r to fix DYNAMIC_ARCH builds 2022-02-25 10:04:00 +01:00
Martin Kroeker 522f809825 Merge pull request #3542 from martin-frbg/issue3540: Fix compilation for CooperLake on Windows/clang 2022-02-24 00:00:00 +01:00
Martin Kroeker d50287fa5b Merge pull request #3544 from giordano/mg/gcc6: Fix compilation of Skylake AVX512 kernels with GCC 6 2022-02-23 23:57:57 +01:00
Mosè Giordano abbc947edb Fix compilation of Skylake AVX512 kernels with GCC 6 2022-02-23 22:51:59 +00:00
Martin Kroeker f2f0e1287b Merge pull request #3541 from martin-frbg/issue3530: Fix compilation for SkylakeX with gcc 6.x 2022-02-23 23:13:53 +01:00
Martin Kroeker c62f8e2c01 Prevent compiler attempts to use k0 as mask register 2022-02-23 20:12:20 +01:00
Martin Kroeker 80eb581c83 Fix non-portable u_int64_t 2022-02-23 20:10:59 +01:00
Martin Kroeker 73ffabe6ba Guard uses of _mm512_reduce_add_p? 2022-02-23 20:06:14 +01:00
Martin Kroeker 5ad66f0e96 Merge pull request #3537 from xianyi/release-0.3.0: Merge back from 0.3.20 release to copy tag 2022-02-21 06:57:27 +01:00
Martin Kroeker 0b678b19dc Update version to 0.3.20 2022-02-20 22:35:05 +01:00
Martin Kroeker 15ff556862 Merge pull request #3536 from xianyi/develop: Update from develop for release 0.3.20 2022-02-20 22:33:59 +01:00
Martin Kroeker 1564b632ad Merge branch 'release-0.3.0' into develop 2022-02-20 22:33:45 +01:00
Martin Kroeker dec53e0ca2 Update version to 0.3.20 2022-02-20 22:30:50 +01:00
Martin Kroeker c3f8de7923 Merge pull request #3535 from martin-frbg/0320changes: Update with 0.3.20 changes 2022-02-20 22:21:02 +01:00
Martin Kroeker c352ac0ae3 Update with 0.3.20 changes 2022-02-20 22:16:04 +01:00
Martin Kroeker 77433af83e Merge pull request #3532 from martin-frbg/issue3528-2: Fix building a shared library on Mac with flang-classic 2022-02-11 11:44:32 +01:00
Martin Kroeker db7a03dd4c keep flang-classic on MacOS from trying to create an executable instead of a library 2022-02-10 23:04:45 +01:00
Martin Kroeker 0e04710099 filter out libflangmain as well 2022-02-10 23:03:05 +01:00
Martin Kroeker dc80925c92 Merge pull request #3531 from martin-frbg/issue2973: Add .NOTPARALLEL: to MATGEN Makefile as a workaround for builds on DFS 2022-02-10 14:16:08 +01:00
Martin Kroeker e2bf3f31a6 Add .NOTPARALLEL: as a workaround for builds on DFS 2022-02-09 22:09:25 +01:00
Martin Kroeker 92d243fee3 Merge pull request #3527 from martin-frbg/issue3490: Treat AVX512-enabled Alder Lake like Cooper Lake/Sapphire Rapids 2022-02-07 08:14:11 +01:00
Martin Kroeker fa3e9f25e6 Support AVX512-enabled Alder Lake 2022-02-07 00:00:56 +01:00
Martin Kroeker f7e8f9ec57 Support AVX512-enabled AlderLake 2022-02-07 00:00:15 +01:00
Martin Kroeker 7656aba00e Merge pull request #3493 from martin-frbg/casts+cleanup: WIP casts and cleanups 2022-02-06 23:55:06 +01:00
Martin Kroeker aec32e5bd4 Update azure-pipelines.yml 2022-02-05 22:39:03 +01:00
Martin Kroeker 3007ca6371 Merge pull request #3524 from martin-frbg/lapack646: Fix input argument check in ?GEQRT2 (from Reference-LAPACK PR 646) 2022-02-03 22:31:23 +01:00
Martin Kroeker a3eea3e127 Fix input argument check (LAPACK PR 646) 2022-02-03 11:43:17 +01:00
Martin Kroeker b212577e50 Merge pull request #3521 from martin-frbg/issue3520: Add proper defaults for Sparc IMIN/IMAX 2022-01-28 13:39:36 +01:00
Martin Kroeker 63483ba0ff Merge pull request #3522 from martin-frbg/issue3517: Disable building C/Z SPMV,SPR,SYMV,SYR when NO_LAPACK=1 2022-01-28 10:36:57 +01:00
Martin Kroeker d2b5fbf80f Exclude some complex (LAPACK) functions when NO_LAPACK is set 2022-01-27 22:02:08 +01:00
Martin Kroeker 7f0b11fbc1 Exclude some complex drivers when NO_LAPACK is set 2022-01-27 22:00:39 +01:00
Martin Kroeker addc2a7aaa Add proper defaults for IMIN/IMAX 2022-01-27 19:56:32 +01:00
Martin Kroeker 204e021515 Merge pull request #3518 from martin-frbg/elbrus: Add basic support for the (mostly x86_64 compatible) Elbrus E2000 architecture 2022-01-25 20:57:59 +01:00
Martin Kroeker b0d39349f9 Merge pull request #3516 from mmuetzel/no-fortran: cmake: Check if Fortran compiler is usable before enabling it. 2022-01-25 20:57:38 +01:00
Martin Kroeker 5d24f3d210 Update CONTRIBUTORS.md 2022-01-22 19:09:00 +01:00
Martin Kroeker 66a15e15a8 Update CONTRIBUTORS.md 2022-01-22 19:02:57 +01:00
Martin Kroeker 299d4d70a3 Add default KERNEL file for Elbrus E2K arch 2022-01-22 18:59:36 +01:00
Martin Kroeker 3492bea602 Create Makefile 2022-01-22 18:57:28 +01:00
Martin Kroeker 898cf5faf3 Add Elbrus e2k architecture support 2022-01-22 18:55:10 +01:00
Martin Kroeker bc93f468ef Add Elbrus E2000 architecture as generic x86_64 compatible 2022-01-22 18:53:38 +01:00