Martin Kroeker
880bc1d1db
Acknowledge past CZI EOSS 1/EOSS 3 funding
2022-07-30 12:34:09 +02:00
Martin Kroeker
d0ba257de0
Merge pull request #3704 from XiWeiGu/loongarch64_dynamic_arch
...
LoongArch64: Add DYNAMIC_ARCH support
2022-07-28 20:31:20 +02:00
Martin Kroeker
78da6a750a
Merge pull request #3705 from RajalakshmiSR/bf16ppc
...
POWER: Enable bfloat16 kernels by default
2022-07-28 18:38:14 +02:00
Rajalakshmi Srinivasaraghavan
1d97405c02
POWER: Enable bfloat16 kernels by default
...
This patch enables bfloat16 kernels by default for POWER processors.
Tested on Linux POWER8, POWER9, POWER10 and AIX POWER10 systems.
2022-07-28 07:43:53 -05:00
gxw
fbfe1daf6e
LoongArch64: Add DYNAMIC_ARCH support
2022-07-28 14:28:45 +08:00
Martin Kroeker
047a279f09
Merge pull request #3702 from martin-frbg/issue3687
...
Add openblas_getaffinity() extension (Linux-only)
2022-07-27 20:57:50 +02:00
Martin Kroeker
30473b6a9d
add openblas_getaffinity()
2022-07-27 19:15:18 +02:00
Martin Kroeker
8668571040
add openblas_getaffinity()
2022-07-27 19:14:36 +02:00
Martin Kroeker
daca01622b
fix detection of Neoverse V1 and user-enforced selection of N2 in ARM64 DYNAMIC_ARCH (#3700)
...
* fix detection of Neoverse V1 and user-enforced selection of N2
2022-07-27 09:17:43 +02:00
Martin Kroeker
c322aab685
Merge pull request #3684 from imzhuhl/neoversen2_dynamic_arch
...
Neoverse N2: DYNAMIC_ARCH
2022-07-26 20:06:26 +02:00
Martin Kroeker
cf796aee8c
Merge pull request #3699 from martin-frbg/issue3692
...
Add c_check recognition of Fujitsu fcc for Fugaku A64FX
2022-07-26 16:36:43 +02:00
Martin Kroeker
28d40ba60b
Merge pull request #3696 from XiWeiGu/loongson2k1000
...
LoongArch64: Add core LOONGSON2K1000 and LOONGSONGENERIC
2022-07-26 13:55:41 +02:00
Martin Kroeker
692848d20c
typo fix
2022-07-25 21:59:03 +02:00
Martin Kroeker
76ea7739dd
Merge pull request #3698 from martin-frbg/issue3697
...
utest needs to be linked against libm on QNX as well
2022-07-25 20:25:23 +02:00
Martin Kroeker
f8c5bdfbab
Treat Fujitsu fcc on Fugaku like clang
2022-07-25 19:48:59 +02:00
Martin Kroeker
70001e1e9e
Add Fujitsu compiler
2022-07-25 19:42:59 +02:00
Martin Kroeker
cf37182260
Add Fujitsu compiler (fcc)
2022-07-25 19:39:17 +02:00
Martin Kroeker
68d86ea150
Add Fujitsu compiler
2022-07-25 19:34:16 +02:00
Martin Kroeker
7aaa0ce0e8
utest needs to be linked against libm on QNX as well
2022-07-25 17:02:16 +02:00
Martin Kroeker
cd8e57040c
Merge pull request #3691 from martin-frbg/issue3679-sparc
...
SPARC: fix DNRM2 returning INF instead of zero due to intermediate overflow
2022-07-25 15:41:15 +02:00
gxw
3573306a69
LoongArch64: Add core LOONGSON2K1000 and LOONGSONGENERIC
2022-07-25 16:04:56 +08:00
Martin Kroeker
a4303ae378
Merge pull request #3695 from martin-frbg/ppc6nrm2
...
PPC6: Fix DNRM2 returning INF instead of zero due to intermediate overflow
2022-07-25 06:14:30 +02:00
Martin Kroeker
31377d04f0
Merge pull request #3694 from martin-frbg/traviswait
...
Add back travis_wait to keep ppc jobs from getting cancelled
2022-07-24 22:13:08 +02:00
Martin Kroeker
6c118b7977
Fix DNRM2 returning INF instead of zero due to intermediate overflow
2022-07-24 17:42:31 +02:00
Martin Kroeker
b60415a347
Add back travis_wait to keep ppc jobs from getting cancelled
2022-07-24 16:44:16 +02:00
Martin Kroeker
c43ec53bdd
Merge pull request #3690 from RajalakshmiSR/cdotp10
...
POWER: Fix complex dot function failures
2022-07-19 13:59:16 +02:00
Martin Kroeker
b7c65d08cb
Merge pull request #3689 from RajalakshmiSR/dgemvgcc10
...
POWER10: dgemv builtin rename
2022-07-19 10:25:01 +02:00
Martin Kroeker
fcbbd8c25c
Merge pull request #3682 from XiWeiGu/develop
...
Fix dnrm2_tiny testcase failure
2022-07-19 10:24:28 +02:00
Martin Kroeker
06ef015234
fix DNRM2 returning INF instead of zero due to intermediate overflow
2022-07-19 10:19:27 +02:00
Rajalakshmi Srinivasaraghavan
a612e78a97
POWER: Fix complex dot function failures
...
There are some test failures in complex dot functions when compiling with gcc12.
The machine constraints currently used do not update all four elements of the
expected result array. This is fixed by reducing the optimization level.
This does not change any performance numbers; the code will be converted to C in the future.
2022-07-18 14:48:43 -05:00
Rajalakshmi Srinivasaraghavan
432fd99445
POWER10: dgemv builtin rename
...
Add a check to use the correct builtin name for older versions
of the gcc10 compiler.
2022-07-18 09:48:01 -05:00
gxw
4dd05e526b
LoongArch64: Fix dnrm2_tiny testcase failure
2022-07-15 11:18:59 +08:00
Martin Kroeker
7da799dc66
Merge pull request #3686 from martin-frbg/issue3685
...
Fix Fortran-less CTEST build option
2022-07-13 08:24:15 +02:00
Martin Kroeker
6e018b84c4
Fix function prototypes and INTERFACE64 support
2022-07-12 19:37:30 +02:00
Martin Kroeker
ccd87cc472
Fix switching between Fortran and C build
2022-07-12 19:35:31 +02:00
Honglin Zhu
d5ca477f42
Neoverse N2: DYNAMIC_ARCH
2022-07-12 00:50:45 +08:00
gxw
cce4b1d956
MIPS64: Fix dnrm2_tiny testcase failure
2022-07-11 19:18:38 +08:00
Martin Kroeker
7918ba11c2
Merge pull request #3680 from martin-frbg/issue3636-2
...
Guard against sysconf(_SC_NPROCESSORS_CONF) returning zero at runtime
2022-07-07 11:38:24 +02:00
Martin Kroeker
69148ae795
Guard against sysconf returning zero processors
2022-07-06 17:22:18 +02:00
Martin Kroeker
e9260f5451
Guard against system call returning zero processors
2022-07-06 17:21:10 +02:00
Martin Kroeker
4cfd6f110a
Merge pull request #3678 from martin-frbg/issue3677
...
Eliminate uses of CREAL on left-hand side of assignments
2022-07-05 10:40:32 +02:00
Martin Kroeker
e12d474780
Eliminate uses of CREAL on left-hand side of assignments
2022-07-05 00:01:09 +02:00
Martin Kroeker
686e6d7c10
Merge pull request #3676 from martin-frbg/dnrm2-utest
...
Add DNRM2 regression test for issues 2998 and 3654
2022-07-04 08:37:18 +02:00
Martin Kroeker
c5041ae270
properly embed test_dnrm2
2022-07-03 23:48:30 +02:00
Martin Kroeker
8e6f719ad3
use HUGE_VAL not HUGE_VALF for portability
2022-07-03 20:19:24 +02:00
Martin Kroeker
af88494f87
old systems may not have inf in math.h
2022-07-03 18:23:51 +02:00
Martin Kroeker
ee41b6eb24
Add DNRM2 regression test for issues 2998 and 3654
2022-07-03 17:56:49 +02:00
Martin Kroeker
bf8998a9f4
Merge pull request #3675 from martin-frbg/issue3654
...
workaround ThunderX2 DNRM2 fault with ssq=inf,scale=0
2022-07-03 08:45:45 +02:00
Martin Kroeker
9e29598575
workaround fault with ssq=inf,scale=0
2022-07-02 23:47:17 +02:00
Martin Kroeker
3df3d622eb
Merge pull request #3672 from imzhuhl/neoversen2_bf16
...
sbgemm support for ARM Neoverse N2
2022-07-01 12:13:42 +02:00