Martin Kroeker
e671d0386b
Update version to 0.3.21.dev
2022-08-07 23:03:20 +02:00
Martin Kroeker
1dd979959d
set version to 0.3.21.dev
2022-08-07 23:02:36 +02:00
Martin Kroeker
b89fb708ca
Update version to 0.3.21
2022-08-07 22:36:26 +02:00
Martin Kroeker
9a34217cc6
Merge pull request #3717 from xianyi/develop
...
Update from develop for 0.3.21 release
2022-08-07 22:35:20 +02:00
Martin Kroeker
79f54f266d
Update version to 0.3.21
2022-08-07 22:32:11 +02:00
Martin Kroeker
94cba8e3c5
Merge pull request #3716 from martin-frbg/0321changes
...
Update Changelog for 0.3.21
2022-08-07 22:30:58 +02:00
Martin Kroeker
25ce2e2a63
Update with 0.3.21 changes
2022-08-07 22:21:23 +02:00
Jiaxun Yang
b633eb79f2
Use $at as temporary register for mips/loongson CPUCFG read
...
Some compilers (namely LLVM) are not happy with clobbering
registers in inline assembly.
Use $at as temporary register and explicitly use noat
hint.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
2022-08-07 13:22:32 +01:00
Martin Kroeker
9f89b62b25
Merge pull request #3715 from martin-frbg/issue3648
...
Increase thresholds for STFSM and CTFSM in the LAPACK testsuite
2022-08-07 08:45:06 +02:00
Martin Kroeker
9c4e91a77d
Increase threshold
2022-08-07 00:03:50 +02:00
Martin Kroeker
1fe06caf49
Increase threshold
2022-08-07 00:03:20 +02:00
Martin Kroeker
ff58e9a7f1
Merge pull request #3609 from martin-frbg/lapack3101
...
Update LAPACK/LAPACKE to Reference-LAPACK 3.10.1
2022-08-06 14:31:56 +02:00
Martin Kroeker
f6a1854ce9
resync gensymbol with develop
2022-08-06 09:29:09 +02:00
Martin Kroeker
2bee490287
Merge pull request #3714 from martin-frbg/crosscmake
...
Add more x86_64 target definitions for CMAKE cross-compiling
2022-08-04 23:58:21 +02:00
Martin Kroeker
85fd3c4279
Support compilation with the Cray C and Fortran compilers ( #3712 )
...
* Add support for the Cray Fortran compiler
2022-08-04 20:42:18 +02:00
Martin Kroeker
3784b3d45c
Add more x86_64 target definitions for cross-compiling
2022-08-04 19:18:32 +02:00
Martin Kroeker
096ae6f2bd
Merge pull request #3709 from nursik/develop
...
Add TCORE Generic
2022-08-03 15:43:27 +02:00
Martin Kroeker
19fefd100e
Merge pull request #3703 from martin-frbg/omp_adaptive
...
Add env variable OMP_ADAPTIVE to control OMP threadpool behaviour
2022-08-03 15:38:39 +02:00
Martin Kroeker
2e51a61914
Merge pull request #3693 from Mayank-Raj3/Mayank-Raj3-patch-1
...
Corrected indentation of for and if statements in dgemv_thread_safety.cpp
2022-08-03 15:38:14 +02:00
Nursultan Zarlyk
a7ac252fd9
Add TCORE Generic in prebuild.cmake
...
During cross-compilation for ARMv8 on an x64 host with MSVC, the
build fails because there are no define directives for the Generic core.
2022-08-02 10:50:58 +02:00
Jiaxun Yang
19d4f90c44
Use auxv to detect CPUCFG on mips/loongson
...
It's safer and easier than SIGILL.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
2022-07-31 19:41:59 +01:00
Martin Kroeker
648a69a67e
Merge pull request #3707 from martin-frbg/getarch_risc
...
Fix crash in RISCV autodetection when pmodel is not present in /proc/cpuinfo
2022-07-31 10:13:38 +02:00
Martin Kroeker
ef9c976a94
Really fix compilation; fix crash when pmodel is not present in cpuinfo
2022-07-31 00:41:04 +02:00
Martin Kroeker
f727235be4
Merge pull request #3706 from martin-frbg/czifunding
...
Acknowledge past CZI EOSS 1/EOSS 3 funding
2022-07-30 14:11:45 +02:00
Martin Kroeker
880bc1d1db
Acknowledge past CZI EOSS 1/EOSS 3 funding
2022-07-30 12:34:09 +02:00
Martin Kroeker
d0ba257de0
Merge pull request #3704 from XiWeiGu/loongarch64_dynamic_arch
...
LoongArch64: Add DYNAMIC_ARCH support
2022-07-28 20:31:20 +02:00
Martin Kroeker
78da6a750a
Merge pull request #3705 from RajalakshmiSR/bf16ppc
...
POWER: Enable bfloat16 kernels by default
2022-07-28 18:38:14 +02:00
Rajalakshmi Srinivasaraghavan
1d97405c02
POWER: Enable bfloat16 kernels by default
...
This patch enables bfloat16 kernels by default for POWER processors.
Tested on Linux POWER8, POWER9, POWER10 and AIX POWER10 systems.
2022-07-28 07:43:53 -05:00
gxw
fbfe1daf6e
LoongArch64: Add DYNAMIC_ARCH support
2022-07-28 14:28:45 +08:00
Martin Kroeker
80cdfed7b2
Use OMP_ADAPTIVE setting to choose between static and dynamic OMP threadpool size
2022-07-27 23:43:20 +02:00
Martin Kroeker
08e3754b39
Add environment variable OMP_ADAPTIVE
2022-07-27 23:41:47 +02:00
Martin Kroeker
047a279f09
Merge pull request #3702 from martin-frbg/issue3687
...
Add openblas_getaffinity() extension (Linux-only)
2022-07-27 20:57:50 +02:00
Martin Kroeker
30473b6a9d
add openblas_getaffinity()
2022-07-27 19:15:18 +02:00
Martin Kroeker
8668571040
add openblas_getaffinity()
2022-07-27 19:14:36 +02:00
Martin Kroeker
daca01622b
fix detection of Neoverse V1 and user-enforced selection of N2 in ARM64 DYNAMIC_ARCH ( #3700 )
...
* fix detection of Neoverse V1 and user-enforced selection of N2
2022-07-27 09:17:43 +02:00
Martin Kroeker
c322aab685
Merge pull request #3684 from imzhuhl/neoversen2_dynamic_arch
...
Neoverse N2: DYNAMIC_ARCH
2022-07-26 20:06:26 +02:00
Martin Kroeker
cf796aee8c
Merge pull request #3699 from martin-frbg/issue3692
...
Add c_check recognition of Fujitsu fcc for Fugaku A64FX
2022-07-26 16:36:43 +02:00
Martin Kroeker
28d40ba60b
Merge pull request #3696 from XiWeiGu/loongson2k1000
...
LoongArch64: Add core LOONGSON2K1000 and LOONGSONGENERIC
2022-07-26 13:55:41 +02:00
Martin Kroeker
692848d20c
typo fix
2022-07-25 21:59:03 +02:00
Martin Kroeker
76ea7739dd
Merge pull request #3698 from martin-frbg/issue3697
...
utest needs to be linked against libm on QNX as well
2022-07-25 20:25:23 +02:00
Martin Kroeker
f8c5bdfbab
Treat Fujitsu fcc on Fugaku like clang
2022-07-25 19:48:59 +02:00
Martin Kroeker
70001e1e9e
Add Fujitsu compiler
2022-07-25 19:42:59 +02:00
Martin Kroeker
cf37182260
Add Fujitsu compiler (fcc)
2022-07-25 19:39:17 +02:00
Martin Kroeker
68d86ea150
Add Fujitsu compiler
2022-07-25 19:34:16 +02:00
Martin Kroeker
7aaa0ce0e8
utest needs to be linked against libm on QNX as well
2022-07-25 17:02:16 +02:00
Martin Kroeker
cd8e57040c
Merge pull request #3691 from martin-frbg/issue3679-sparc
...
SPARC: fix DNRM2 returning INF instead of zero due to intermediate overflow
2022-07-25 15:41:15 +02:00
gxw
3573306a69
LoongArch64: Add core LOONGSON2K1000 and LOONGSONGENERIC
2022-07-25 16:04:56 +08:00
Martin Kroeker
a4303ae378
Merge pull request #3695 from martin-frbg/ppc6nrm2
...
PPC6: Fix DNRM2 returning INF instead of zero due to intermediate overflow
2022-07-25 06:14:30 +02:00
Martin Kroeker
31377d04f0
Merge pull request #3694 from martin-frbg/traviswait
...
Add back travis_wait to keep ppc jobs from getting cancelled
2022-07-24 22:13:08 +02:00
Martin Kroeker
6c118b7977
Fix DNRM2 returning INF instead of zero due to intermediate overflow
2022-07-24 17:42:31 +02:00