Commit Graph

6983 Commits

Author SHA1 Message Date
Martin Kroeker a1eecccda2
Update f_check 2020-12-03 23:43:17 +01:00
Rajalakshmi Srinivasaraghavan 41fe6e864e POWER10: Update param.h
Increasing the values of DGEMM_DEFAULT_P and DGEMM_DEFAULT_Q improves
DGEMM performance by ~10%.
2020-12-03 14:40:11 -06:00
Martin Kroeker 74b5850581
Add libomp to the LAPACK(-test) dependencies in clang/gfortran builds 2020-12-03 21:28:10 +01:00
Martin Kroeker da0c94c76f
Avoid linking both GNU libgomp and LLVM libomp in clang/gfortran builds 2020-12-03 21:25:57 +01:00
Martin Kroeker a6692dc129
use gfortran-10 with xcode 12 2020-12-03 14:32:21 +01:00
Martin Kroeker 72a553f5bc
Update .travis.yml 2020-12-03 09:17:27 +01:00
Martin Kroeker dcbb3b5ef1
fix misplaced lines 2020-12-02 23:13:13 +01:00
Martin Kroeker 57456c248b
fix gfortran requirement in osx interface64 test 2020-12-02 15:56:21 +01:00
Martin Kroeker c361313564
Disable deprecated 32bit xcode 2020-12-02 07:49:43 +01:00
Gengxin Xie 0cb7a403b2 Fix incorrect declaration of function blas_level1_thread_with_return_value 2020-12-02 09:51:52 +08:00
Martin Kroeker 77a538d4ba
Update an overlooked instance of xcode 10.0 as well 2020-12-01 22:05:35 +01:00
Martin Kroeker 9621062eba
Update OSX xcode version to 11.5 2020-12-01 12:23:30 +01:00
Gengxin Xie b766c1e9bb Improve the performance of zasum and casum with AVX512 intrinsic 2020-12-01 16:49:26 +08:00
Martin Kroeker 22574b474e
Suppress -mfma as well for gcc 4.6 2020-11-30 21:41:51 +01:00
Martin Kroeker f662022994
Move the version check to avoid overwriting unprocessed compiler data 2020-11-30 17:24:27 +01:00
Martin Kroeker 5e81e81478
Merge pull request #3014 from RajalakshmiSR/dgemvnp10
POWER10: Optimize dgemv_n
2020-11-30 08:18:24 +01:00
Rajalakshmi Srinivasaraghavan 7d46e31de1 POWER10: Optimize dgemv_n
Handling as 4x8 with vector pairs gives better performance than the
existing code on POWER10.
2020-11-29 15:28:28 -06:00
Martin Kroeker 62a2eb884f
Add SSE flags for x86 2020-11-29 15:33:07 +01:00
Martin Kroeker 2e99e2699b
Add workaround for gcc 4.6 miscompiling assembly kernels with -mavx 2020-11-29 15:32:17 +01:00
Martin Kroeker 006b13299f
Merge pull request #3012 from martin-frbg/restore-getarch
Restore RISCV entries accidentally trashed by my PR 3005
2020-11-29 13:27:47 +01:00
Martin Kroeker ca17d3dc3d
Restore RISCV entries accidentally trashed by my PR 3005 2020-11-29 13:19:51 +01:00
Martin Kroeker 52ed2741c5
Merge pull request #3010 from ggouaillardet/topic/fj_compilers
add Fujitsu compilers
2020-11-29 11:36:43 +01:00
cyy 3b4c016110 link math lib on FreeBSD 2020-11-29 17:17:35 +08:00
Gilles Gouaillardet 358100ec15 add Fujitsu compilers
Co-authored-by: Tomoki Karatsu <karatsu.spack@gmail.com>
2020-11-29 14:35:42 +09:00
Martin Kroeker 3788b6d156
Merge pull request #3005 from martin-frbg/ssefix
Add -msse for x86 and silence build warning in getarch
2020-11-23 08:35:32 +01:00
Martin Kroeker bc5b1ddf0d
Merge pull request #3004 from martin-frbg/bsd_getauxval
ARM64 DYNAMIC_ARCH build fix for BSD/OSX
2020-11-23 08:35:12 +01:00
Martin Kroeker 2f42d23104
Merge pull request #3002 from martin-frbg/issue3000
Ensure that all targets in a DYNAMIC_ARCH build on POWER use the same buffer size
2020-11-22 22:51:26 +01:00
Martin Kroeker b72dd007dc
Merge pull request #3001 from martin-frbg/issue2996
Fix ambiguous ifdefs in tests for user-defined options in Makefiles
2020-11-22 22:50:41 +01:00
Martin Kroeker 11ebe5fa25
Avoid redefinition warning 2020-11-22 21:16:07 +01:00
Martin Kroeker 01f01dae98
Add -msse if supported 2020-11-22 21:15:08 +01:00
Martin Kroeker e7bf8ced6c
Build fix for systems that do not support getauxval 2020-11-22 20:20:28 +01:00
Martin Kroeker 0256294921
Fix syntax mixup 2020-11-22 17:41:44 +01:00
Martin Kroeker 2b114c3f30
Restore proper Makefile 2020-11-22 17:16:22 +01:00
Martin Kroeker 60e1fddca7
Ensure that the same (large) BUFFERSIZE is used for all cpus in DYNAMIC_ARCH builds 2020-11-22 16:48:22 +01:00
Martin Kroeker ebb8788696
Use ifneq instead of ifdef for CROSS option 2020-11-22 16:33:34 +01:00
Martin Kroeker 857afcc41d
Use ifeq instead of ifdef for user-definable build options 2020-11-22 16:31:44 +01:00
Martin Kroeker 5fa305172a
Use ifeq instead of ifdef for user-definable options 2020-11-22 16:29:56 +01:00
Martin Kroeker d3ff1f889f
Convert ifndefs to ifneq 2020-11-22 16:27:17 +01:00
Martin Kroeker 65eb7afaf4
Change ifndef CROSS to ifneq 2020-11-22 16:25:36 +01:00
Martin Kroeker 8a6b17f97d
Change ifndefs to ifneq 2020-11-22 16:19:31 +01:00
Martin Kroeker 0f863f96e4
Merge pull request #112 from xianyi/develop
rebase
2020-11-22 16:17:19 +01:00
Martin Kroeker 437702e0e1
Merge pull request #2965 from epsilon-0/develop
allow setting soname without suffix or prefix
2020-11-22 12:25:33 +01:00
Martin Kroeker f1bf040b25
Merge pull request #2988 from xiegengxin/smp-asum
Improve the performance of dasum and sasum when SMP is defined
2020-11-22 12:24:13 +01:00
Martin Kroeker 613e3b2baf
Merge pull request #2997 from Flamefire/reproduce_crash
Add reproducer test for crash after fork
2020-11-22 12:22:57 +01:00
Xianyi Zhang 05a0ea2340 Merge branch 'risc-v' into develop 2020-11-22 16:05:32 +08:00
Xianyi Zhang 7037849498 Merge branch 'develop' into risc-v 2020-11-22 16:04:50 +08:00
Xianyi Zhang c6c9c24d1b Update doc for C910. 2020-11-22 16:02:19 +08:00
Martin Kroeker 6dd71af0c3
Merge pull request #2995 from Flamefire/fix_thread_buffer_init
Don't overwrite blas_thread_buffer if already set
2020-11-20 09:42:10 +01:00
Alexander Grund a05dc6e62b
Add reproducer test for crash after fork
See #2993 for an analysis
2020-11-19 15:46:37 +01:00
Alexander Grund 60005eb47b
Don't overwrite blas_thread_buffer if already set
After a fork it is possible that blas_thread_buffer has already
allocated memory buffers: goto_set_num_threads does allocate those
already and it may be called by num_cpu_avail in case the OpenBLAS
NUM_THREADS differs from the OMP num threads.
This leads to a memory leak which can cause subsequent execution of BLAS
kernels to fail.

Fixes #2993
2020-11-19 14:51:51 +01:00
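
The guard described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the OpenBLAS code: thread_buffer stands in for blas_thread_buffer, and init_thread_buffers, MAX_THREADS and BUFFER_BYTES are hypothetical names chosen for the example. It assumes the fix amounts to allocating a slot only when it is still NULL, so that buffers allocated earlier (for example via goto_set_num_threads) are not overwritten and leaked.

```c
#include <stdlib.h>

#define MAX_THREADS 4       /* hypothetical limit for this sketch */
#define BUFFER_BYTES 1024   /* hypothetical per-thread buffer size */

/* Stand-in for blas_thread_buffer; static storage starts out as all NULL. */
static void *thread_buffer[MAX_THREADS];

/* Allocate per-thread buffers, skipping slots that are already set.
   Returns the number of buffers newly allocated by this call. */
static int init_thread_buffers(int nthreads)
{
    int allocated = 0;
    for (int i = 0; i < nthreads && i < MAX_THREADS; i++) {
        if (thread_buffer[i] == NULL) {   /* guard: keep existing buffers */
            thread_buffer[i] = malloc(BUFFER_BYTES);
            if (thread_buffer[i] != NULL)
                allocated++;
        }
    }
    return allocated;
}
```

Calling init_thread_buffers a second time with a larger thread count only fills the slots that are still empty; without the NULL check, the earlier pointers would be replaced and their memory lost.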