Shivraj Patil
|
a4d97d980f
|
Added rot functions.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2017-01-17 12:15:07 +05:30 |
Zhang Xianyi
|
0863a0d4b4
|
Merge pull request #1061 from ashwinyes/develop_aarch64_vulcan_thunderx_patch
Add new targets for ARM64
|
2017-01-16 13:20:10 +08:00 |
Werner Saar
|
28e2fab33e
|
prepared kernel/setparam-ref.c for UNROLL values, that are not a power of two
|
2017-01-11 11:56:50 +01:00 |
Ashwin Sekhar T K
|
4b55fae337
|
ARM64: Add Cavium THUNDERX2T99 Target
|
2017-01-11 11:18:40 +05:30 |
Andrew Pinski
|
95649dee28
|
THUNDERX: Add optimized version of daxpy
This is better for single core but does not change anything for multiple cores
|
2017-01-11 11:18:36 +05:30 |
Andrew Pinski
|
8fdb0655e9
|
THUNDERX: Add an optimized version of ddot
|
2017-01-10 15:01:37 +05:30 |
Andrew Pinski
|
fb200c7245
|
ARM64: Add Cavium THUNDERX Target
|
2017-01-10 15:01:37 +05:30 |
Ashwin Sekhar T K
|
0b8e876d89
|
VULCAN: Add optimized DGEMM implementation
|
2017-01-10 15:01:37 +05:30 |
Ashwin Sekhar T K
|
4713e7c47f
|
ARM64: Add the VULCAN Target
|
2017-01-10 15:01:17 +05:30 |
Ashwin Sekhar T K
|
6085386b10
|
CORTEXA57: Add assembly kernels for copy routines
|
2017-01-10 15:01:05 +05:30 |
kaustubh
|
1480f3df71
|
Add msa optimization for AXPY, COPY, SCALE, SWAP
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
|
2017-01-09 18:27:23 +05:30 |
kaustubh
|
88afb3bc94
|
Add msa optimization for AXPY, COPY, SCALE, SWAP
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
|
2017-01-09 18:22:09 +05:30 |
Zhang Xianyi
|
b678471d65
|
Merge branch 'z13' into develop
Conflicts:
CONTRIBUTORS.md
|
2017-01-09 05:52:42 -05:00 |
Zhang Xianyi
|
864e202afd
|
Add USE_TRMM=1 for IBM z13 in kernel/Makefile.L3
|
2017-01-09 05:48:09 -05:00 |
Abdurrauf
|
6418667818
|
dtrmm and dgemm for z13
|
2017-01-04 19:32:33 +04:00 |
Shivraj Patil
|
a9bf8a781a
|
Added prefetch to CGEMV and ZGEMV.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-12-27 11:33:51 +05:30 |
kaustubh
|
5f93aa5f87
|
Updated data prefetch in TRSM, ASUM, DOT functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
|
2016-12-14 14:05:11 +05:30 |
kaustubh
|
9db451acd0
|
Updated data prefetch in TRSM, ASUM, DOT functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
|
2016-12-13 14:02:14 +05:30 |
kaustubh
|
3eaff85191
|
Updated data prefetch in TRSM, ASUM, DOT functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
|
2016-12-13 11:41:17 +05:30 |
kaustubh
|
00abce3b93
|
Add data prefetch in DOT and ASUM functions
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
|
2016-11-22 11:21:03 +05:30 |
Andrew
|
becf8bc7a0
|
remove dead code
|
2016-10-31 12:46:56 +01:00 |
kaustubh
|
f3419e634c
|
SGEMM, DGEMM, CGEMM, ZGEMM functions data prefetch
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
|
2016-10-17 18:29:38 +05:30 |
Zhang Xianyi
|
7472c79ea6
|
Merge pull request #984 from ksraste/develop
STRSM, DTRSM functions data prefetch
|
2016-10-17 11:33:16 +08:00 |
kaustubh
|
90e2321ac3
|
STRSM, DTRSM functions data prefetch
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
|
2016-10-14 16:41:28 +05:30 |
Martin Kroeker
|
4998e19869
|
Change file comments to work around clang 3.9 assembler bug
|
2016-10-13 16:51:08 +02:00 |
Martin Kroeker
|
91610f3835
|
Update zdot_msa.c
|
2016-10-05 18:59:09 +02:00 |
Martin Kroeker
|
6e22ecf102
|
Update zdot.c
|
2016-10-05 18:58:03 +02:00 |
Martin Kroeker
|
6221d6df5f
|
Update zdot.c
|
2016-10-05 18:57:14 +02:00 |
Martin Kroeker
|
16446d1d23
|
Remove explicit include of complex.h
|
2016-09-29 23:45:56 +02:00 |
Martin Kroeker
|
a6e9e0b94b
|
Remove explicit include of complex.h
|
2016-09-29 23:43:28 +02:00 |
Martin Kroeker
|
3178e4fea0
|
Remove explicit include of complex.h
|
2016-09-29 23:41:43 +02:00 |
Martin Kroeker
|
95c245ddb0
|
Remove explicit include of complex.h
|
2016-09-29 23:40:36 +02:00 |
Martin Kroeker
|
4b1b27347f
|
Remove explicit include of complex.h
|
2016-09-29 23:39:35 +02:00 |
Shivraj Patil
|
54747fe24a
|
DGEMM function split and data prefetch
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-09-22 17:25:46 +05:30 |
Zhang Xianyi
|
515bc56ea9
|
Refs #946. Use nrm2 reference implementation for Power8.
|
2016-08-18 18:59:43 -07:00 |
Zhang Xianyi
|
ae70b916f4
|
Refs #929. Deal with zero and NaNs for scale.
|
2016-08-18 10:24:42 -07:00 |
Shivraj Patil
|
9687437928
|
MIPS n32 ABI and build time mips simd support check
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-08-10 17:44:22 +05:30 |
Shivraj Patil
|
d1c6469283
|
MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-08-08 11:58:01 +05:30 |
Ashwin Sekhar T K
|
c54a29bb48
|
Cortex A57: Improvements to DGEMM 8x4 kernel
|
2016-07-26 10:58:21 +05:30 |
Shivraj Patil
|
beb1d076a4
|
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-07-15 18:38:25 +05:30 |
Zhang Xianyi
|
8a592ee386
|
Merge pull request #924 from ashwinyes/develop_aarch64_improvements_20160714
Improvements to Aarch64 kernels
|
2016-07-14 15:47:55 -04:00 |
Ashwin Sekhar T K
|
0a5ff9f9f9
|
Improvements to TRMM and GEMM kernels
|
2016-07-14 13:56:04 +05:30 |
Ashwin Sekhar T K
|
8a40f1355e
|
Improvements to GEMV kernels
|
2016-07-14 13:50:38 +05:30 |
Ashwin Sekhar T K
|
78782485b6
|
Improvements to COPY and IAMAX kernels
|
2016-07-14 13:49:34 +05:30 |
Shivraj Patil
|
57df7956ee
|
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM. Updated macros in SGEMM, DGEMM, STRMM.
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
|
2016-06-28 17:51:10 +05:30 |
Zhang Xianyi
|
4a30a2584a
|
Merge pull request #897 from ksraste/develop
STRSM optimized for MSA
|
2016-06-27 10:04:18 -04:00 |
Werner Saar
|
f04af36ad0
|
Merge pull request #898 from wernsaar/develop
added experimental support for optimized lapack fortran functions
|
2016-05-31 14:13:52 +02:00 |
Kaustubh Raste
|
011431b9d7
|
STRSM optimized for MSA
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
|
2016-05-31 10:17:23 +05:30 |
Kaustubh Raste
|
c8a7860eb3
|
STRSM optimized
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
|
2016-05-30 21:17:00 +05:30 |
Zhang Xianyi
|
2daad2bcb5
|
Merge pull request #893 from biddisco/develop
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PRO…
|
2016-05-30 14:52:58 +08:00 |