Martin Koehler
|
76c6e33e54
|
Enable EXCAVATOR kernels for A12-9800
|
2017-02-07 21:38:28 +01:00 |
Martin Kroeker
|
a9594e8072
|
Merge pull request #1085 from vladimir-ch/lapacke_laswp_work
LAPACKE: fix incorrect value of lda_t in lapacke_?laswp_work
|
2017-02-07 11:40:41 +01:00 |
Ashwin Sekhar T K
|
8e89668f62
|
THUNDERX2T99: Fix bug in SNRM2
|
2017-02-07 02:14:33 -08:00 |
Ashwin Sekhar T K
|
f63deae9de
|
THUNDERX2T99: Add Optimized S/D IAMAX Implementation
|
2017-02-07 01:35:55 -08:00 |
Vladimir Chalupecky
|
4c2b713ce5
|
LAPACKE: fix incorrect value of lda_t in lapacke_?laswp_work
Fixed in Reference LAPACK in commit:
07e1fbd897
|
2017-02-07 09:21:46 +01:00 |
Isuru Fernando
|
cdc954675c
|
Install pkg-config files
|
2017-02-06 12:15:58 +05:30 |
Martin Kroeker
|
60eea75409
|
Merge pull request #1076 from ashwinyes/develop_20170130_thunderx2t99
More optimized implementations for ThunderX2T99
|
2017-02-04 17:25:43 +01:00 |
Ashwin Sekhar T K
|
071a830e8b
|
THUNDERX2T99: Add optimized S/D/C/Z SWAP Implementations
|
2017-02-03 03:55:06 -08:00 |
Ashwin Sekhar T K
|
d09f88192c
|
THUNDERX2T99: Add optimized S/D/C/Z COPY Implementations
|
2017-02-02 15:26:38 +05:30 |
Ashwin Sekhar T K
|
e58233460a
|
THUNDERX2T99: Add optimized D/C/Z ASUM Implementations
|
2017-02-02 15:26:22 +05:30 |
Ashwin Sekhar T K
|
3918d17025
|
LAPACK: Fix lapack-test errors in ARM64 threaded version
|
2017-01-31 23:36:23 +05:30 |
Ashwin Sekhar T K
|
99bd2892bf
|
THUNDERX2T99: Add optimized CASUM Implementation
|
2017-01-30 17:44:32 +05:30 |
Ashwin Sekhar T K
|
ff6f572f2e
|
THUNDERX2T99: Rename labels for DDOT and SNRM2
|
2017-01-30 17:44:32 +05:30 |
Ashwin Sekhar T K
|
e0dc5f58c5
|
THUNDERX2T99: Remove Duplicate Code
|
2017-01-30 17:44:32 +05:30 |
Ashwin Sekhar T K
|
2757b49767
|
THUNDERX2T99: Add Optimized CGEMM Implementation
|
2017-01-30 17:44:26 +05:30 |
Zhang Xianyi
|
ff41e13385
|
Merge pull request #1074 from ashwinyes/develop_20170116_thunderx2t99_sgemm
Add more THUNDERX2T99 Optimized APIs
|
2017-01-25 22:17:05 +08:00 |
Ashwin Sekhar T K
|
1de6fa0f50
|
Update .gitignore
|
2017-01-24 23:14:09 -08:00 |
Ashwin Sekhar T K
|
efda640723
|
Benchmark: Add MFlops print in iamax benchmark
|
2017-01-24 23:13:47 -08:00 |
Ashwin Sekhar T K
|
1530e78cfe
|
Benchmarks: Avoid building lapack benchmarks when NO_LAPACK=1
|
2017-01-24 20:50:23 -08:00 |
Ashwin Sekhar T K
|
907e286eb6
|
THUNDERX2T99: Add threaded SNRM2 Implementation
|
2017-01-24 21:39:29 +05:30 |
Ashwin Sekhar T K
|
cde3aee08b
|
ARM64: Rename kernel files to have consistent naming
|
2017-01-24 14:53:34 +05:30 |
Ashwin Sekhar T K
|
ee6ea7e988
|
THUNDERX2T99: Add Optimized CNRM2 Implementation
|
2017-01-24 10:23:32 +05:30 |
Ashwin Sekhar T K
|
ca0b36b012
|
THUNDERX2T99: Add Optimized SNRM2 Implementation
|
2017-01-24 10:23:21 +05:30 |
Ashwin Sekhar T K
|
01e1d85339
|
Update .gitignore
|
2017-01-19 11:58:59 +05:30 |
Ashwin Sekhar T K
|
d0a79ca6e0
|
THUNDERX2T99: Add threaded DDOT Implementation
|
2017-01-19 11:11:42 +05:30 |
Ashwin Sekhar T K
|
0c07003ccf
|
THUNDERX2T99: Add Optimized DDOT Implementation
|
2017-01-19 11:11:07 +05:30 |
Ashwin Sekhar T K
|
f33fcedb30
|
THUNDERX2T99: Improve SGEMM
|
2017-01-19 11:11:07 +05:30 |
Ashwin Sekhar T K
|
0f1d6e8b39
|
THUNDERX2T99: Improve DGEMM
|
2017-01-19 11:11:07 +05:30 |
Ashwin Sekhar T K
|
981064acc6
|
THUNDERX2T99: Add Optimized DAXPY Implementation
|
2017-01-19 11:10:57 +05:30 |
Zhang Xianyi
|
ab2033f2db
|
Merge pull request #1068 from sva-img/develop
Added MSA optimised rot functions.
|
2017-01-17 22:02:21 +08:00 |
Shivraj Patil
|
a4d97d980f
|
Added rot functions.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2017-01-17 12:15:07 +05:30 |
Ashwin Sekhar T K
|
f279ff4789
|
THUNDERX2T99: Add Optimized SGEMM Implementation
|
2017-01-16 21:44:33 +05:30 |
Ashwin Sekhar T K
|
759f37feba
|
ARM64: Let target VULCAN inherit THUNDERX2T99 properties
|
2017-01-16 21:44:19 +05:30 |
Martin Kroeker
|
e8d0e66982
|
Merge pull request #1067 from martin-frbg/msysinst
Fix DESTDIR support for cygwin/msys2 install
|
2017-01-16 16:03:53 +01:00 |
Martin Kroeker
|
331fd51260
|
Fix DESTDIR support for cygwin/msys2 install
fixes #1066
|
2017-01-16 15:15:46 +01:00 |
Zhang Xianyi
|
0863a0d4b4
|
Merge pull request #1061 from ashwinyes/develop_aarch64_vulcan_thunderx_patch
Add new targets for ARM64
|
2017-01-16 13:20:10 +08:00 |
Martin Kroeker
|
2e5f906f41
|
Update Makefile.install (#1064)
* Update Makefile.install to reflect name change of lapacke_mangling.h source
|
2017-01-11 17:40:06 +01:00 |
Werner Saar
|
d1a97bad39
|
Merge pull request #1063 from wernsaar/develop
prepared kernel/setparam-ref.c for UNROLL values, that are not a power of two
|
2017-01-11 12:37:45 +01:00 |
Werner Saar
|
28e2fab33e
|
prepared kernel/setparam-ref.c for UNROLL values, that are not a power of two
|
2017-01-11 11:56:50 +01:00 |
Werner Saar
|
752fdc6f82
|
Merge pull request #1062 from wernsaar/develop
prepared parameter.c for UNROLL values, that are not a power of two
|
2017-01-11 10:30:46 +01:00 |
Werner Saar
|
c1c5a63d3c
|
prepared parameter.c for UNROLL values, that are not a power of two
|
2017-01-11 09:50:28 +01:00 |
Werner Saar
|
209b63197e
|
prepared lapack/lauum for UNROLL values, that are not a power of two
|
2017-01-11 07:29:17 +01:00 |
Ashwin Sekhar T K
|
4b55fae337
|
ARM64: Add Cavium THUNDERX2T99 Target
|
2017-01-11 11:18:40 +05:30 |
Ashwin Sekhar T K
|
738d622feb
|
ARM64: Fix auto detect of ARM64 cpus
|
2017-01-11 11:18:40 +05:30 |
Andrew Pinski
|
95649dee28
|
THUNDERX: Add optimized version of daxpy
This is better for single core but does not change anything for multiple cores
|
2017-01-11 11:18:36 +05:30 |
Martin Kroeker
|
3a8c5180b9
|
Merge pull request #1060 from martin-frbg/lapacke-mingw
Split LAPACKE 3.7.0 obj list (take 2, missed splitting the actual ar command invocation)
|
2017-01-10 19:09:49 +01:00 |
Martin Kroeker
|
7611a41f40
|
Split LAPACKE 3.7.0 obj list (take 2)
Missed the splitting of the actual ar call
|
2017-01-10 17:11:35 +01:00 |
Werner Saar
|
1a39b92b1d
|
Merge pull request #1059 from wernsaar/develop
updated some level1 functions, that are not thread safe
|
2017-01-10 16:00:28 +01:00 |
Werner Saar
|
dd6212e684
|
updated some level1 functions, that are not thread safe
|
2017-01-10 14:05:07 +01:00 |
Werner Saar
|
9bcf50872b
|
Merge pull request #1058 from wernsaar/develop
prepared lapack/potrf functions for UNROLL values, that are not a pow…
|
2017-01-10 11:30:08 +01:00 |