Martin Kroeker
|
63994e1cdb
|
Merge pull request #2405 from susilehtola/patch-1
Fix typo in dynamic_zarch.c
|
2020-02-11 13:03:35 +01:00 |
Martin Kroeker
|
496e3019bc
|
Merge pull request #2404 from martin-frbg/issue2395
Fix spurious application of USE_TRMM in cmake builds
|
2020-02-11 13:00:36 +01:00 |
Martin Kroeker
|
169be3f097
|
Merge pull request #2403 from martin-frbg/issue2400
Fix coretype identification of Intel Cannon Lake, Ice Lake and Goldmont
|
2020-02-11 13:00:16 +01:00 |
Martin Kroeker
|
6ccbb089c2
|
Merge pull request #2402 from gxw-loongson/develop
Avoid printing "unrecognized command line option ‘-mmsa’" on mips and mips64 when checking for MSA support
|
2020-02-11 12:59:53 +01:00 |
Martin Kroeker
|
59ebe3636a
|
Merge pull request #2399 from martin-frbg/buffersize
Make BUFFER_SIZE configurable at build time
|
2020-02-11 12:56:56 +01:00 |
Susi Lehtola
|
5a6bba3061
|
Patch out instances of Z15 in dynamic_zarch.c
There does not appear to be a Z15 kernel yet, causing link errors from the code. This patch fixes the issue.
|
2020-02-11 15:07:33 +13:00 |
Susi Lehtola
|
dff173e50e
|
Fix typo in dynamic_zarch.c
|
2020-02-11 14:46:30 +13:00 |
Martin Kroeker
|
7e5cbb6f35
|
Fix bad conditional syntax that caused spurious application of USE_TRMM
|
2020-02-10 21:17:39 +01:00 |
Martin Kroeker
|
303bdb673b
|
Fix coretype detection for Intel extended models 6 and 7
affecting Goldmont, Cannon Lake, Ice Lake autodetection
|
2020-02-10 19:17:32 +01:00 |
gxw
|
754433f420
|
Avoid printing the following message on mips and mips64 when checking for MSA support:
"unrecognized command line option ‘-mmsa’"
|
2020-02-10 19:11:45 +08:00 |
Martin Kroeker
|
7f0d523b42
|
Make BUFFER_SIZE configurable
|
2020-02-09 23:32:57 +01:00 |
Martin Kroeker
|
c353d8b106
|
Make BUFFER_SIZE configurable
|
2020-02-09 23:30:22 +01:00 |
Martin Kroeker
|
579be3aa9d
|
Add configuration option for BUFFER_SIZE
|
2020-02-09 23:28:04 +01:00 |
Martin Kroeker
|
449e8ea443
|
Merge pull request #26 from xianyi/develop
rebase
|
2020-02-09 23:23:55 +01:00 |
Martin Kroeker
|
3bec250cf9
|
Increment version to 0.3.9.dev
|
2020-02-09 23:18:44 +01:00 |
Martin Kroeker
|
f03dd23e90
|
Increment version to 0.3.9.dev
|
2020-02-09 23:18:07 +01:00 |
Martin Kroeker
|
fa93d63365
|
Merge branch 'release-0.3.0' into develop
|
2020-02-09 23:16:06 +01:00 |
Martin Kroeker
|
90e6c66a57
|
Merge pull request #2397 from martin-frbg/038changes
Update Changelog with changes from 0.3.8
|
2020-02-09 23:01:52 +01:00 |
Martin Kroeker
|
32d97330b3
|
Update with changes from 0.3.8
|
2020-02-09 23:00:36 +01:00 |
Martin Kroeker
|
29eaf4b6d7
|
Merge pull request #25 from xianyi/develop
rebase
|
2020-02-09 22:48:15 +01:00 |
Martin Kroeker
|
47c1bf7f4d
|
typo fixes
|
2020-02-09 01:06:40 +01:00 |
Martin Kroeker
|
2b55f0ad30
|
Merge pull request #2393 from martin-frbg/issue2388
Provide more documentation in README.md
|
2020-02-09 01:00:33 +01:00 |
Martin Kroeker
|
a5b32ab06c
|
Merge pull request #2390 from martin-frbg/pgi
Small corrections for compilation with PGI compilers
|
2020-02-09 00:13:40 +01:00 |
Martin Kroeker
|
50545b19d0
|
Update CPU and OS support and document DYNAMIC_ARCH option in README.md
prompted by #2388
|
2020-02-09 00:06:07 +01:00 |
Martin Kroeker
|
b3cbd60d7a
|
Remove PGI from the list of avx512-supporting compilers again, as it is actually still not capable
|
2020-02-08 10:20:13 +01:00 |
Martin Kroeker
|
70199d1905
|
Merge pull request #2389 from Zeyiii/develop
Fix bugs in benchmark of gemv
|
2020-02-07 16:05:46 +01:00 |
Martin Kroeker
|
cfe63d8cc2
|
Remove OpenMP libraries from link list
|
2020-02-07 16:03:51 +01:00 |
Martin Kroeker
|
d55b10830f
|
Remove OpenMP libraries from link list
|
2020-02-07 16:02:17 +01:00 |
Martin Kroeker
|
c1c10cbb21
|
Merge pull request #2384 from wjc404/develop
Optimize AVX512 DGEMM (& DTRMM)
|
2020-02-07 13:47:12 +01:00 |
Martin Kroeker
|
5989841524
|
Add PGI to avx512-supporting compilers
|
2020-02-07 13:01:31 +01:00 |
Martin Kroeker
|
68a43db358
|
Fix utest compilation with PGI
|
2020-02-07 10:15:18 +01:00 |
Martin Kroeker
|
9694037b23
|
Set SUFFIX in tempfile commands, fix bad architecture option for PGI compiler in avx512 test
|
2020-02-07 10:09:25 +01:00 |
Martin Kroeker
|
71faa1c1a7
|
Merge pull request #24 from xianyi/develop
rebase
|
2020-02-07 10:03:02 +01:00 |
wjc404
|
3447d04eaf
|
Update dgemm_kernel_16x2_skylakex.c
|
2020-02-06 02:14:10 +00:00 |
wjc404
|
8b5cdcc64c
|
Update sgemm_kernel_8x4_haswell.c
|
2020-02-06 01:47:46 +00:00 |
wjc404
|
4e00d96a78
|
Update dgemm_kernel_16x2_skylakex.c
|
2020-02-06 01:46:36 +00:00 |
w00421467
|
ce9ea8f826
|
Fix another branch
|
2020-02-05 15:07:18 +08:00 |
w00421467
|
0b909203cb
|
Fix bugs in benchmark of gemv
|
2020-02-05 14:53:37 +08:00 |
wjc404
|
096da2f51a
|
Update dgemm_kernel_16x2_skylakex.c
|
2020-02-05 13:36:57 +08:00 |
wjc404
|
2f96a2c55b
|
Update trmm_R.c
|
2020-02-05 10:15:02 +08:00 |
wjc404
|
833bd0f8ff
|
Update trmm_L.c
|
2020-02-05 10:09:41 +08:00 |
wjc404
|
77b8f49556
|
Update level3_thread.c
|
2020-02-04 20:33:08 +08:00 |
wjc404
|
1c3e20ce48
|
Update level3.c
|
2020-02-04 20:30:23 +08:00 |
wjc404
|
83b6be7976
|
Update param.h
|
2020-02-04 19:55:26 +08:00 |
wjc404
|
081b188529
|
Update KERNEL.SKYLAKEX
|
2020-02-03 21:38:08 +08:00 |
wjc404
|
f3f969f681
|
Update param.h
|
2020-02-03 21:34:12 +08:00 |
wjc404
|
8019e70211
|
AVX512 16x2 DGEMM kernel
|
2020-02-03 21:32:56 +08:00 |
Martin Kroeker
|
8d2a796f49
|
Merge pull request #2378 from martin-frbg/issue2377
Add -march option for AVX512 in cmake as well
|
2020-01-30 17:07:19 +01:00 |
Martin Kroeker
|
8dc9fd4dfe
|
Add -march option for AVX512
|
2020-01-30 12:41:18 +01:00 |
Martin Kroeker
|
abc67bdd74
|
Merge pull request #2375 from ewanglong/master
Fix a few performance drops for some matrix sizes per data type
|
2020-01-30 10:27:29 +01:00 |