Zhang Xianyi
bcfc298c38
Merge pull request #987 from Sbte/master
...
Fix HASWELL capitalization in kernel cmake file
2016-10-18 12:38:33 +08:00
Sven Baars
ce7c6c6b2d
Fix HASWELL capitalization in kernel cmake file
...
Refs #951
2016-10-17 16:34:13 +02:00
kaustubh
f3419e634c
SGEMM, DGEMM, CGEMM, ZGEMM functions data prefetch
...
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
2016-10-17 18:29:38 +05:30
Zhang Xianyi
7472c79ea6
Merge pull request #984 from ksraste/develop
...
STRSM, DTRSM functions data prefetch
2016-10-17 11:33:16 +08:00
Zhang Xianyi
66c9a9b33d
Merge pull request #981 from howard0su/develop
...
USE NPROCESSOR_CONF instead of NPROCESSOR_ONLN
2016-10-17 11:32:57 +08:00
Zhang Xianyi
3705f5675a
Merge pull request #982 from martin-frbg/develop
...
Change file comments to work around clang 3.9 assembler bug; add support for Bay Trail atom
2016-10-17 11:32:20 +08:00
Martin Kroeker
bce2b34f7a
Merge pull request #1 from martin-frbg/martin-frbg-patch-1
...
Add Intel "Bay Trail" atom cpu
2016-10-16 22:51:42 +02:00
Martin Kroeker
da83ec94d1
Merge pull request #2 from martin-frbg/martin-frbg-patch-1-1
...
Update cpuid_x86.c
2016-10-16 22:48:58 +02:00
Martin Kroeker
3409bccb21
Update cpuid_x86.c
...
Add Bay Trail "Pentium N3520" atom cpu
2016-10-16 22:45:44 +02:00
Martin Kroeker
8a8f3932eb
Update dynamic.c
...
Add Bay Trail "Pentium N3520" atom
2016-10-16 22:40:00 +02:00
kaustubh
90e2321ac3
STRSM, DTRSM functions data prefetch
...
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
2016-10-14 16:41:28 +05:30
Martin Kroeker
4998e19869
Change file comments to work around clang 3.9 assembler bug
2016-10-13 16:51:08 +02:00
Howard Su
ff1da01476
USE NPROCESSOR_CONF instead of NPROCESSOR_ONLN
...
to determine the number of CPUs. On ARM platforms,
the number of online CPUs increases as the workload grows,
while the number of configured CPUs is the maximum number of CPUs.
2016-10-13 12:37:50 +00:00
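A minimal sketch of the distinction this commit describes, assuming a Linux or BSD style libc where `sysconf` accepts `_SC_NPROCESSORS_CONF` and `_SC_NPROCESSORS_ONLN` (both are common extensions rather than POSIX requirements). The helper names `configured_cpu_count` and `online_cpu_count` are hypothetical and are not taken from the OpenBLAS sources.

```c
#include <unistd.h>

/* Hypothetical helper: number of CPUs configured in the system.
 * On platforms that hot-plug cores under load, this is the upper
 * bound and does not change as cores come online or go offline. */
long configured_cpu_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_CONF);
    return n > 0 ? n : 1; /* fall back to 1 if the query fails */
}

/* Hypothetical helper: number of CPUs online at the time of the call.
 * This value can be lower than the configured count when cores are
 * powered down, so sizing a thread pool from it once at startup may
 * undercount. */
long online_cpu_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1; /* fall back to 1 if the query fails */
}
```

A library that sizes its thread pool once at load time would, under these assumptions, call `configured_cpu_count()` so that cores brought online later are already accounted for.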
Zhang Xianyi
ef52a9266b
Fixed #979 . Patch for NetBSD.
2016-10-13 10:17:07 +08:00
Zhang Xianyi
4f38ae3199
Merge pull request #970 from martin-frbg/develop
...
Remove implicit inclusions of complex.h in various zdot implementations
2016-10-13 10:13:56 +08:00
Zhang Xianyi
4baf0c7cfc
Merge pull request #980 from kiwifb/utest_ldflags
...
make utest/Makefile respect LDFLAGS
2016-10-13 10:13:12 +08:00
Zhang Xianyi
595a0224e4
Merge pull request #973 from vladimir-ch/fix-lapacke-xlarfb
...
LAPACKE: fix wrong direction check in LAPACKE_?larfb_work
2016-10-13 10:12:35 +08:00
François Bissey
f124ffab47
make utest/Makefile respect LDFLAGS
2016-10-13 09:32:25 +13:00
Martin Kroeker
91610f3835
Update zdot_msa.c
2016-10-05 18:59:09 +02:00
Martin Kroeker
6e22ecf102
Update zdot.c
2016-10-05 18:58:03 +02:00
Martin Kroeker
6221d6df5f
Update zdot.c
2016-10-05 18:57:14 +02:00
Vladimir Chalupecky
117d3371d4
LAPACKE: fix wrong direction check in LAPACKE_?larfb_work
...
Closes #971
2016-10-01 05:31:30 +09:00
Martin Kroeker
16446d1d23
Remove explicit include of complex.h
2016-09-29 23:45:56 +02:00
Martin Kroeker
a6e9e0b94b
Remove explicit include of complex.h
2016-09-29 23:43:28 +02:00
Martin Kroeker
3178e4fea0
Remove explicit include of complex.h
2016-09-29 23:41:43 +02:00
Martin Kroeker
95c245ddb0
Remove explicit include of complex.h
2016-09-29 23:40:36 +02:00
Martin Kroeker
4b1b27347f
Remove explicit include of complex.h
2016-09-29 23:39:35 +02:00
Zhang Xianyi
161c927071
Merge pull request #968 from buffer51/develop
...
Updated CROSS_SUFFIX regex to work with CC containing arguments
2016-09-22 11:34:57 -04:00
Zhang Xianyi
662f89f059
Merge pull request #969 from sva-img/develop
...
DGEMM function split and data prefetch
2016-09-22 11:33:51 -04:00
Shivraj Patil
54747fe24a
DGEMM function split and data prefetch
...
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-09-22 17:25:46 +05:30
Paul MUSTIÈRE
157ee498ac
Updated CROSS_SUFFIX regex to work with CC containing arguments
2016-09-14 11:42:22 -07:00
Zhang Xianyi
b09cc3b9bb
Merge pull request #958 from intelfx/remove-stabs
...
common_arm.h, common_mips.h: get rid of .func directives
2016-09-13 16:15:37 -04:00
Ivan Shapovalov
6c0862a94f
common_arm.h, common_mips.h: get rid of .func directives
...
.func/.endfunc are gcc/gas-specific directives for generating stabs
debug information (and nothing more). They are near-useless now because
DWARF is commonly used, and they are not implemented in Clang. As a result,
building OpenBLAS with Clang fails, and there is no sane way to detect GCC
vs. anything else with preprocessor definitions.
Hence, just remove these directives.
2016-09-09 03:37:11 +03:00
Zhang Xianyi
842d842751
Update develop for 0.2.20.dev.
2016-09-01 00:01:23 -04:00
Zhang Xianyi
85636ff1a0
Merge branch 'develop'
2016-08-31 23:58:42 -04:00
Zhang Xianyi
821affb9a0
Update doc for 0.2.19.
2016-08-31 23:58:29 -04:00
Zhang Xianyi
515bc56ea9
Refs #946 . Use nrm2 reference implementation for Power8.
2016-08-18 18:59:43 -07:00
Zhang Xianyi
ae70b916f4
Refs #929 . Deal with zero and NaNs for scale.
2016-08-18 10:24:42 -07:00
Zhang Xianyi
9ea0144482
Merge pull request #941 from sva-img/develop
...
MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
2016-08-18 09:31:31 -04:00
Zhang Xianyi
1f217a6175
Merge pull request #943 from ibmsoe/IBMMASS_Support
...
Added support of IBM's MASS library that optimizes performance on Pow…
2016-08-12 17:20:59 -04:00
nishidha@us.ibm.com
78348a2853
Added support of IBM's MASS library that optimizes performance on Power architectures
2016-08-11 14:43:26 +05:30
Shivraj Patil
9687437928
MIPS n32 ABI and build time mips simd support check
...
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-08-10 17:44:22 +05:30
Shivraj Patil
d1c6469283
MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
...
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-08-08 11:58:01 +05:30
Zhang Xianyi
b544be914d
Merge pull request #933 from ashwinyes/develop_aarch64_20160726_Dgemm_8x4_Opts
...
Cortex A57: Improvements to DGEMM 8x4 kernel
2016-07-26 09:54:31 -04:00
Ashwin Sekhar T K
c54a29bb48
Cortex A57: Improvements to DGEMM 8x4 kernel
2016-07-26 10:58:21 +05:30
Zhang Xianyi
ff4c5deafa
Merge pull request #930 from sva-img/develop
...
P6600/I6400 Build fix.
2016-07-22 11:42:30 -04:00
Shivraj Patil
22b9c2747d
P6600/I6400 Build fix. Reverted the changes that were made to support the MIPS n32 ABI
...
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-07-22 18:45:06 +05:30
Zhang Xianyi
27b5211ccd
Merge pull request #927 from sva-img/develop
...
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
2016-07-15 11:17:30 -04:00
Shivraj Patil
beb1d076a4
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
...
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-07-15 18:38:25 +05:30
Zhang Xianyi
9e44f3ddd0
Refs #917 Avoid detecting gfortran bug on IBM POWER + Ubuntu
2016-07-14 13:09:36 -07:00