Gordon Fossum
|
213c0e7abb
|
Added special unrolled vectorized versions of "Solve" for specific sizes,
in DTRSM and STRSM, to improve performance in Power9 and Power10.
|
2020-12-04 17:07:06 -06:00 |
Martin Kroeker
|
f21618684b
|
Merge pull request #3018 from martin-frbg/issue3015
Avoid concurrent inclusion of libgomp and libomp in clang+gfortran builds
|
2020-12-04 22:08:17 +01:00 |
Martin Kroeker
|
441c08c9ff
|
Merge pull request #3016 from xiegengxin/complex-asum
Improve the performance of zasum and casum with AVX512 intrinsic
|
2020-12-04 22:07:16 +01:00 |
Martin Kroeker
|
66302b3c06
|
Merge pull request #3013 from martin-frbg/gcc46
Fix 32bit x86 builds and add workaround for x86_64 miscompilations by gcc 4.6 (including our Travis setup)
|
2020-12-04 08:54:11 +01:00 |
Martin Kroeker
|
07e9a12349
|
Merge pull request #3011 from cyyever/fix_link
link math lib on FreeBSD
|
2020-12-04 08:50:59 +01:00 |
Martin Kroeker
|
dd1adbdec4
|
Merge pull request #3019 from RajalakshmiSR/dgemm_param
POWER10: Update param.h
|
2020-12-04 08:49:28 +01:00 |
Martin Kroeker
|
a1eecccda2
|
Update f_check
|
2020-12-03 23:43:17 +01:00 |
Rajalakshmi Srinivasaraghavan
|
41fe6e864e
|
POWER10: Update param.h
Increasing the values of DGEMM_DEFAULT_P and DGEMM_DEFAULT_Q improves
DGEMM performance by about 10%.
|
2020-12-03 14:40:11 -06:00 |
Martin Kroeker
|
74b5850581
|
Add libomp to the LAPACK(-test) dependencies in clang/gfortran builds
|
2020-12-03 21:28:10 +01:00 |
Martin Kroeker
|
da0c94c76f
|
Avoid linking both GNU libgomp and LLVM libomp in clang/gfortran builds
|
2020-12-03 21:25:57 +01:00 |
Martin Kroeker
|
a6692dc129
|
use gfortran-10 with xcode 12
|
2020-12-03 14:32:21 +01:00 |
Martin Kroeker
|
72a553f5bc
|
Update .travis.yml
|
2020-12-03 09:17:27 +01:00 |
Martin Kroeker
|
dcbb3b5ef1
|
fix misplaced lines
|
2020-12-02 23:13:13 +01:00 |
Martin Kroeker
|
57456c248b
|
fix gfortran requirement in osx interface64 test
|
2020-12-02 15:56:21 +01:00 |
Martin Kroeker
|
c361313564
|
Disable deprecated 32bit xcode
|
2020-12-02 07:49:43 +01:00 |
Gengxin Xie
|
0cb7a403b2
|
Fix erroneous declaration of function blas_level1_thread_with_return_value
|
2020-12-02 09:51:52 +08:00 |
Martin Kroeker
|
77a538d4ba
|
Update an overlooked instance of xcode 10.0 as well
|
2020-12-01 22:05:35 +01:00 |
Martin Kroeker
|
9621062eba
|
Update OSX xcode version to 11.5
|
2020-12-01 12:23:30 +01:00 |
Gengxin Xie
|
b766c1e9bb
|
Improve the performance of zasum and casum with AVX512 intrinsic
|
2020-12-01 16:49:26 +08:00 |
Martin Kroeker
|
22574b474e
|
Suppress -mfma as well for gcc 4.6
|
2020-11-30 21:41:51 +01:00 |
Martin Kroeker
|
f662022994
|
Move the version check to avoid overwriting unprocessed compiler data
|
2020-11-30 17:24:27 +01:00 |
Martin Kroeker
|
5e81e81478
|
Merge pull request #3014 from RajalakshmiSR/dgemvnp10
POWER10: Optimize dgemv_n
|
2020-11-30 08:18:24 +01:00 |
Rajalakshmi Srinivasaraghavan
|
7d46e31de1
|
POWER10: Optimize dgemv_n
Handling as 4x8 with vector pairs gives better performance than the
existing code on POWER10.
|
2020-11-29 15:28:28 -06:00 |
Martin Kroeker
|
62a2eb884f
|
Add SSE flags for x86
|
2020-11-29 15:33:07 +01:00 |
Martin Kroeker
|
2e99e2699b
|
Add workaround for gcc 4.6 miscompiling assembly kernels with -mavx
|
2020-11-29 15:32:17 +01:00 |
Martin Kroeker
|
006b13299f
|
Merge pull request #3012 from martin-frbg/restore-getarch
Restore RISCV entries accidentally trashed by my PR 3005
|
2020-11-29 13:27:47 +01:00 |
Martin Kroeker
|
ca17d3dc3d
|
Restore RISCV entries accidentally trashed by my PR 3005
|
2020-11-29 13:19:51 +01:00 |
Martin Kroeker
|
52ed2741c5
|
Merge pull request #3010 from ggouaillardet/topic/fj_compilers
add Fujitsu compilers
|
2020-11-29 11:36:43 +01:00 |
cyy
|
3b4c016110
|
link math lib on FreeBSD
|
2020-11-29 17:17:35 +08:00 |
Gilles Gouaillardet
|
358100ec15
|
add Fujitsu compilers
Co-authored-by: Tomoki Karatsu <karatsu.spack@gmail.com>
|
2020-11-29 14:35:42 +09:00 |
Martin Kroeker
|
3788b6d156
|
Merge pull request #3005 from martin-frbg/ssefix
Add -msse for x86 and silence build warning in getarch
|
2020-11-23 08:35:32 +01:00 |
Martin Kroeker
|
bc5b1ddf0d
|
Merge pull request #3004 from martin-frbg/bsd_getauxval
ARM64 DYNAMIC_ARCH build fix for BSD/OSX
|
2020-11-23 08:35:12 +01:00 |
Martin Kroeker
|
2f42d23104
|
Merge pull request #3002 from martin-frbg/issue3000
Ensure that all targets in a DYNAMIC_ARCH build on POWER use the same buffer size
|
2020-11-22 22:51:26 +01:00 |
Martin Kroeker
|
b72dd007dc
|
Merge pull request #3001 from martin-frbg/issue2996
Fix ambiguous ifdefs in tests for user-defined options in Makefiles
|
2020-11-22 22:50:41 +01:00 |
Martin Kroeker
|
11ebe5fa25
|
Avoid redefinition warning
|
2020-11-22 21:16:07 +01:00 |
Martin Kroeker
|
01f01dae98
|
Add -msse if supported
|
2020-11-22 21:15:08 +01:00 |
Martin Kroeker
|
e7bf8ced6c
|
Build fix for systems that do not support getauxval
|
2020-11-22 20:20:28 +01:00 |
Martin Kroeker
|
0256294921
|
Fix syntax mixup
|
2020-11-22 17:41:44 +01:00 |
Martin Kroeker
|
2b114c3f30
|
Restore proper Makefile
|
2020-11-22 17:16:22 +01:00 |
Martin Kroeker
|
60e1fddca7
|
Ensure that the same (large) BUFFERSIZE is used for all cpus in DYNAMIC_ARCH builds
|
2020-11-22 16:48:22 +01:00 |
Martin Kroeker
|
ebb8788696
|
Use ifneq instead of ifdef for CROSS option
|
2020-11-22 16:33:34 +01:00 |
Martin Kroeker
|
857afcc41d
|
Use ifeq instead of ifdef for user-definable build options
|
2020-11-22 16:31:44 +01:00 |
Martin Kroeker
|
5fa305172a
|
Use ifeq instead of ifdef for user-definable options
|
2020-11-22 16:29:56 +01:00 |
Martin Kroeker
|
d3ff1f889f
|
Convert ifndefs to ifneq
|
2020-11-22 16:27:17 +01:00 |
Martin Kroeker
|
65eb7afaf4
|
Change ifndef CROSS to ifneq
|
2020-11-22 16:25:36 +01:00 |
Martin Kroeker
|
8a6b17f97d
|
Change ifndefs to ifneq
|
2020-11-22 16:19:31 +01:00 |
Martin Kroeker
|
0f863f96e4
|
Merge pull request #112 from xianyi/develop
rebase
|
2020-11-22 16:17:19 +01:00 |
Martin Kroeker
|
437702e0e1
|
Merge pull request #2965 from epsilon-0/develop
allow setting soname without suffix or prefix
|
2020-11-22 12:25:33 +01:00 |
Martin Kroeker
|
f1bf040b25
|
Merge pull request #2988 from xiegengxin/smp-asum
Improve the performance of dasum and sasum when SMP is defined
|
2020-11-22 12:24:13 +01:00 |
Martin Kroeker
|
613e3b2baf
|
Merge pull request #2997 from Flamefire/reproduce_crash
Add reproducer test for crash after fork
|
2020-11-22 12:22:57 +01:00 |