Martin Kroeker
|
3fd6ccdf76
|
Include just the definition of BLASLONG rather than all of common.h
|
2021-03-18 07:50:19 +01:00 |
Martin Kroeker
|
fa9a30b491
|
Merge pull request #19 from xianyi/develop
rebase
|
2021-03-18 07:47:03 +01:00 |
Martin Kroeker
|
d90ca75a6c
|
Update version to 0.3.14.dev
|
2021-03-17 21:14:42 +01:00 |
Martin Kroeker
|
e107454454
|
Update version to 0.3.14.dev
|
2021-03-17 21:14:05 +01:00 |
Martin Kroeker
|
d43962d013
|
Merge pull request #3151 from xianyi/release-0.3.0
Merge 0.3.14 release branch back into develop to acquire tag
|
2021-03-17 21:13:25 +01:00 |
Martin Kroeker
|
2f6d35c3d4
|
Merge pull request #3150 from xianyi/develop
Update branch from develop for 0.3.14 release
|
2021-03-17 20:21:42 +01:00 |
Martin Kroeker
|
86de5f768b
|
Update version to 0.3.14 for release
|
2021-03-17 20:20:34 +01:00 |
Martin Kroeker
|
2663e44724
|
Update version to 0.3.14 for release
|
2021-03-17 20:20:00 +01:00 |
Martin Kroeker
|
6f2900c164
|
Merge pull request #3149 from martin-frbg/changelog14
Update Changelog for 0.3.14
|
2021-03-17 20:14:50 +01:00 |
Martin Kroeker
|
7888b5127c
|
Update Changelog for 0.3.14
|
2021-03-17 16:17:55 +01:00 |
Martin Kroeker
|
8808c291b9
|
Merge pull request #3148 from martin-frbg/issue3145
Add workaround for older gcc on big-endian ppc64 not supporting casts in defines
|
2021-03-17 09:05:43 +01:00 |
Martin Kroeker
|
8cdf0825de
|
Add workaround for older gcc on ppc64be not supporting casts in defines
|
2021-03-16 21:20:05 +01:00 |
Martin Kroeker
|
9e0dbe8e59
|
Merge pull request #18 from xianyi/develop
rebase
|
2021-03-16 21:09:45 +01:00 |
Martin Kroeker
|
52f99d3944
|
Merge pull request #3147 from martin-frbg/issue3146
Fix DYNAMIC_ARCH builds with CLANG on ppc64
|
2021-03-16 20:25:42 +01:00 |
Martin Kroeker
|
186368ddc3
|
Fix compilation with CLANG
|
2021-03-16 16:52:57 +01:00 |
Martin Kroeker
|
c0b94ae1df
|
Merge pull request #3143 from martin-frbg/fix3088
Resolve circular dependency between common.h and param.h
|
2021-03-14 23:12:55 +01:00 |
Martin Kroeker
|
ddd86309a1
|
Merge pull request #3144 from xoviat/fix-test
disable openmp
|
2021-03-14 23:12:33 +01:00 |
xoviat
|
e9d453b623
|
disable openmp
|
2021-03-14 16:34:02 -05:00 |
Martin Kroeker
|
ecb4babcf4
|
remove inclusion of common.h again to avoid circular dependency
|
2021-03-14 17:36:51 +01:00 |
Martin Kroeker
|
34753eaebb
|
Include common.h (and indirectly param.h) rather than just param.h to have BLASLONG available w/o circular dependencies
|
2021-03-14 17:28:43 +01:00 |
Martin Kroeker
|
efa72a631b
|
Merge pull request #17 from xianyi/develop
rebase
|
2021-03-14 17:20:49 +01:00 |
Martin Kroeker
|
30d835168a
|
Merge pull request #3088 from xoviat/msvc
add misc fixes.
|
2021-03-14 17:14:28 +01:00 |
Martin Kroeker
|
8f6a744807
|
Merge pull request #3141 from martin-frbg/nagfor-2
Leave out ARM64 march/mtune options when compiling with nagfor
|
2021-03-13 23:04:53 +01:00 |
Martin Kroeker
|
6726771645
|
Support compilation with NAG fortran
|
2021-03-13 20:16:18 +01:00 |
Martin Kroeker
|
a51cae6b2e
|
Merge pull request #3140 from martin-frbg/issue3139
Fix compilation on older x86_64 targets with old compilers that lack intrinsics support
|
2021-03-12 15:35:58 +01:00 |
Martin Kroeker
|
d30b943251
|
Merge pull request #3138 from martin-frbg/nagfor
Add support for compilation with the NAG Fortran compiler
|
2021-03-12 12:46:19 +01:00 |
Martin Kroeker
|
0934568d9c
|
Move includes under the ifdef for compilers w/o intrinsics support
|
2021-03-12 12:42:05 +01:00 |
Martin Kroeker
|
697e64bbb6
|
Fix syntax
|
2021-03-11 23:03:58 +01:00 |
Martin Kroeker
|
bffb9b0e95
|
Merge pull request #3136 from austinpagan/Gemm.PQ
Modifying a couple parameters in the "POWER10"-specific section of pa…
|
2021-03-11 15:17:48 +01:00 |
Martin Kroeker
|
6ae7af78a3
|
Support compilation with nagfor
|
2021-03-11 11:53:51 +01:00 |
Martin Kroeker
|
041a26fd79
|
Support compilation with nagfor
|
2021-03-11 11:52:29 +01:00 |
Martin Kroeker
|
3c356b1a1f
|
Support compilation with the NAG Fortran compiler
|
2021-03-11 11:51:09 +01:00 |
Martin Kroeker
|
b1215f2f8c
|
Merge pull request #16 from xianyi/develop
rebase
|
2021-03-11 11:48:37 +01:00 |
Martin Kroeker
|
0b73041b16
|
Merge pull request #3137 from RajalakshmiSR/zscal_p10
Optimize zscal function for POWER10
|
2021-03-11 07:18:05 +01:00 |
austinpagan
|
9579bd47e5
|
Modifying a couple parameters in the "POWER10"-specific section of param.h, for performance enhancements for SGEMM and DGEMM.
|
2021-03-10 18:19:12 -05:00 |
Rajalakshmi Srinivasaraghavan
|
09d47af2c0
|
Optimize zscal function for POWER10
This patch makes use of new POWER10 vector pair instructions for
loads and stores.
|
2021-03-10 17:15:33 -06:00 |
Martin Kroeker
|
ef0238ba2b
|
Merge pull request #3130 from martin-frbg/issue3128
Replace spurious AVX512 requirement in the Haswell srot microkernel with an AVX2/FMA3 guard
|
2021-03-06 19:15:53 +01:00 |
Martin Kroeker
|
a9f6f7ad39
|
Remove spurious AVX512 requirement and add AVX2/FMA3 guard
|
2021-03-06 14:35:49 +01:00 |
Martin Kroeker
|
1d254d321b
|
Merge pull request #3129 from RajalakshmiSR/asum_p10
Optimize s/dasum function for POWER10
|
2021-03-06 09:13:59 +01:00 |
Rajalakshmi Srinivasaraghavan
|
41646ed006
|
Optimize s/dasum function for POWER10
This patch makes use of new POWER10 vector pair instructions for
loads and stores.
|
2021-03-05 16:22:36 -06:00 |
Martin Kroeker
|
3679781872
|
Merge pull request #3126 from martin-frbg/m1bench
Support timing Apple M1 in the benchmarks
|
2021-03-02 21:27:21 +01:00 |
Martin Kroeker
|
38dcf3454b
|
Support timing Apple M1
|
2021-03-02 17:50:55 +01:00 |
Martin Kroeker
|
e34d57ca90
|
Merge pull request #3125 from martin-frbg/issue3123
Fix AMD AOCC compiler detection
|
2021-03-02 09:58:40 +01:00 |
Martin Kroeker
|
20f492c298
|
Fix AMD AOCC compiler detection
|
2021-03-01 21:00:10 +01:00 |
Martin Kroeker
|
c7c82be1c3
|
Merge pull request #3122 from martin-frbg/xeigtstz
Fix unusual stack size requirements of the LAPACK EIG tests (from Reference-LAPACK PR 335)
|
2021-02-28 22:13:09 +01:00 |
Martin Kroeker
|
9564f688c4
|
Adjust build rules for ?chkee.F
|
2021-02-28 18:57:05 +01:00 |
Martin Kroeker
|
90c1776c86
|
Adjust build rules for ?chkee.F
|
2021-02-28 18:53:20 +01:00 |
Martin Kroeker
|
9cf861e8fa
|
Add rewritten cchkee.F from Reference-LAPACK PR335
|
2021-02-28 18:51:03 +01:00 |
Martin Kroeker
|
9b7b1da133
|
Add rewritten dchkee.F from Reference-LAPACK PR335
|
2021-02-28 18:50:26 +01:00 |
Martin Kroeker
|
a5ab891292
|
Add rewritten schkee.F from Reference-LAPACK PR335
|
2021-02-28 18:49:50 +01:00 |