Martin Kroeker
0544cbc806
Fix syntax of endianness conditional
2020-02-12 20:00:29 +01:00
Martin Kroeker
120d20731f
Fix syntax of endianness conditional
2020-02-12 19:58:42 +01:00
Martin Kroeker
dc345d84df
Fix syntax of endianness conditional and add gcc version check for workaround
2020-02-12 19:56:52 +01:00
Martin Kroeker
616921fd91
Merge pull request #27 from xianyi/develop
...
rebase
2020-02-12 19:16:14 +01:00
Martin Kroeker
8a9e9a82a1
Merge pull request #2410 from bartoldeman/fix-dscal-inline-asm
...
Fix inline asm in dscal: mark x, x1 as clobbered. Fixes #2408
2020-02-12 15:38:37 +01:00
Bart Oldeman
7ea5e07d1c
Fix inline asm in dscal: mark x, x1 as clobbered. Fixes #2408
...
The leaq instructions in dscal_kernel_inc_8 modify x and x1 so they
must be declared as input/output constraints, otherwise the compiler
may assume the corresponding registers are not modified.
2020-02-12 14:11:44 +00:00
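The constraint issue described in this commit can be illustrated with a minimal sketch. This is not the OpenBLAS kernel: `scale_in_place` is a hypothetical helper, and the sketch assumes GCC or Clang extended asm on x86-64, falling back to plain C elsewhere. Because `leaq` advances the pointer inside the asm, the pointer is declared with `"+r"` (read and written) rather than as an input-only `"r"` operand.

```c
#include <assert.h>

/* Hypothetical helper for illustration only (not dscal_kernel_inc_8):
   multiplies n doubles in place by alpha. */
static void scale_in_place(double *x, long n, double alpha)
{
    assert(n >= 0);
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    while (n > 0) {
        __asm__ __volatile__(
            "movsd (%[x]), %%xmm0    \n\t"  /* load *x                      */
            "mulsd %[alpha], %%xmm0  \n\t"  /* xmm0 = xmm0 * alpha          */
            "movsd %%xmm0, (%[x])    \n\t"  /* store the result back        */
            "leaq  8(%[x]), %[x]     \n\t"  /* advance x: the asm WRITES x  */
            : [x] "+r"(x)                   /* "+r": x is input AND output  */
            : [alpha] "x"(alpha)            /* alpha in an SSE register     */
            : "xmm0", "memory");            /* scratch register, memory r/w */
        n--;
    }
#else
    /* Portable fallback so the sketch builds on other targets. */
    for (long i = 0; i < n; i++)
        x[i] *= alpha;
#endif
}
```

If `x` were listed only as an input (`"r"`), the compiler would be entitled to assume its register still holds the original address after the asm statement, and the loop above could keep operating on the first element.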
Martin Kroeker
cb6ef49857
Merge pull request #2407 from susilehtola/patch-2
...
Patch out instances of Z15 in dynamic_zarch.c
2020-02-11 13:04:44 +01:00
Martin Kroeker
63994e1cdb
Merge pull request #2405 from susilehtola/patch-1
...
Fix typo in dynamic_zarch.c
2020-02-11 13:03:35 +01:00
Martin Kroeker
496e3019bc
Merge pull request #2404 from martin-frbg/issue2395
...
Fix spurious application of USE_TRMM in cmake builds
2020-02-11 13:00:36 +01:00
Martin Kroeker
169be3f097
Merge pull request #2403 from martin-frbg/issue2400
...
Fix coretype identification of Intel Cannon Lake, Ice Lake and Goldmont
2020-02-11 13:00:16 +01:00
Martin Kroeker
6ccbb089c2
Merge pull request #2402 from gxw-loongson/develop
...
Avoid printing the following information on mips and mips64 when checking for MSA
2020-02-11 12:59:53 +01:00
Martin Kroeker
59ebe3636a
Merge pull request #2399 from martin-frbg/buffersize
...
Make BUFFER_SIZE configurable at build time
2020-02-11 12:56:56 +01:00
Susi Lehtola
5a6bba3061
Patch out instances of Z15 in dynamic_zarch.c
...
There does not appear to be a Z15 kernel yet, causing link errors from the code. This patch fixes the issue.
2020-02-11 15:07:33 +13:00
Susi Lehtola
dff173e50e
Fix typo in dynamic_zarch.c
2020-02-11 14:46:30 +13:00
Martin Kroeker
7e5cbb6f35
Fix bad conditional syntax that caused spurious application of USE_TRMM
2020-02-10 21:17:39 +01:00
Martin Kroeker
303bdb673b
Fix coretype detection for Intel extended models 6 and 7
...
affecting Goldmont, Cannon Lake, Ice Lake autodetection
2020-02-10 19:17:32 +01:00
gxw
754433f420
Avoid printing the following information on mips and mips64 when checking for MSA:
...
"unrecognized command line option ‘-mmsa’"
2020-02-10 19:11:45 +08:00
Martin Kroeker
7f0d523b42
Make BUFFER_SIZE configurable
2020-02-09 23:32:57 +01:00
Martin Kroeker
c353d8b106
Make BUFFER_SIZE configurable
2020-02-09 23:30:22 +01:00
Martin Kroeker
579be3aa9d
Add configuration option for BUFFER_SIZE
2020-02-09 23:28:04 +01:00
Martin Kroeker
449e8ea443
Merge pull request #26 from xianyi/develop
...
rebase
2020-02-09 23:23:55 +01:00
Martin Kroeker
3bec250cf9
Increment version to 0.3.9.dev
2020-02-09 23:18:44 +01:00
Martin Kroeker
f03dd23e90
Increment version to 0.3.9.dev
2020-02-09 23:18:07 +01:00
Martin Kroeker
fa93d63365
Merge branch 'release-0.3.0' into develop
2020-02-09 23:16:06 +01:00
Martin Kroeker
90e6c66a57
Merge pull request #2397 from martin-frbg/038changes
...
Update Changelog with changes from 0.3.8
2020-02-09 23:01:52 +01:00
Martin Kroeker
32d97330b3
Update with changes from 0.3.8
2020-02-09 23:00:36 +01:00
Martin Kroeker
29eaf4b6d7
Merge pull request #25 from xianyi/develop
...
rebase
2020-02-09 22:48:15 +01:00
Martin Kroeker
47c1bf7f4d
typo fixes
2020-02-09 01:06:40 +01:00
Martin Kroeker
2b55f0ad30
Merge pull request #2393 from martin-frbg/issue2388
...
Provide more documentation in README.md
2020-02-09 01:00:33 +01:00
Martin Kroeker
a5b32ab06c
Merge pull request #2390 from martin-frbg/pgi
...
Small corrections for compilation with PGI compilers
2020-02-09 00:13:40 +01:00
Martin Kroeker
50545b19d0
Update CPU and OS support and document DYNAMIC_ARCH option in README.md
...
prompted by #2388
2020-02-09 00:06:07 +01:00
Martin Kroeker
b3cbd60d7a
Remove PGI from the list of avx512-supporting compilers again, as it is actually still not capable
2020-02-08 10:20:13 +01:00
Martin Kroeker
70199d1905
Merge pull request #2389 from Zeyiii/develop
...
Fix bugs in benchmark of gemv
2020-02-07 16:05:46 +01:00
Martin Kroeker
cfe63d8cc2
Remove OpenMP libraries from link list
2020-02-07 16:03:51 +01:00
Martin Kroeker
d55b10830f
Remove OpenMP libraries from link list
2020-02-07 16:02:17 +01:00
Martin Kroeker
c1c10cbb21
Merge pull request #2384 from wjc404/develop
...
Optimize AVX512 DGEMM (& DTRMM)
2020-02-07 13:47:12 +01:00
Martin Kroeker
5989841524
Add PGI to avx512-supporting compilers
2020-02-07 13:01:31 +01:00
Martin Kroeker
68a43db358
Fix utest compilation with PGI
2020-02-07 10:15:18 +01:00
Martin Kroeker
9694037b23
Set SUFFIX in tempfile commands, fix bad architecture option for PGI compiler in avx512 test
2020-02-07 10:09:25 +01:00
Martin Kroeker
71faa1c1a7
Merge pull request #24 from xianyi/develop
...
rebase
2020-02-07 10:03:02 +01:00
wjc404
3447d04eaf
Update dgemm_kernel_16x2_skylakex.c
2020-02-06 02:14:10 +00:00
wjc404
8b5cdcc64c
Update sgemm_kernel_8x4_haswell.c
2020-02-06 01:47:46 +00:00
wjc404
4e00d96a78
Update dgemm_kernel_16x2_skylakex.c
2020-02-06 01:46:36 +00:00
w00421467
ce9ea8f826
Fix another branch
2020-02-05 15:07:18 +08:00
w00421467
0b909203cb
Fix bugs in benchmark of gemv
2020-02-05 14:53:37 +08:00
wjc404
096da2f51a
Update dgemm_kernel_16x2_skylakex.c
2020-02-05 13:36:57 +08:00
wjc404
2f96a2c55b
Update trmm_R.c
2020-02-05 10:15:02 +08:00
wjc404
833bd0f8ff
Update trmm_L.c
2020-02-05 10:09:41 +08:00
wjc404
77b8f49556
Update level3_thread.c
2020-02-04 20:33:08 +08:00
wjc404
1c3e20ce48
Update level3.c
2020-02-04 20:30:23 +08:00