Commit Graph

7452 Commits

Author SHA1 Message Date
Zhang Xianyi bcfc298c38 Merge pull request #987 from Sbte/master
Fix HASWELL capitalization in kernel cmake file
2016-10-18 12:38:33 +08:00
Sven Baars ce7c6c6b2d Fix HASWELL capitalization in kernel cmake file
Refs #951
2016-10-17 16:34:13 +02:00
kaustubh f3419e634c SGEMM, DGEMM, CGEMM, ZGEMM functions data prefetch
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
2016-10-17 18:29:38 +05:30
Zhang Xianyi 7472c79ea6 Merge pull request #984 from ksraste/develop
STRSM, DTRSM functions data prefetch
2016-10-17 11:33:16 +08:00
Zhang Xianyi 66c9a9b33d Merge pull request #981 from howard0su/develop
USE NPROCESSOR_CONF instead of NPROCESSOR_ONLN
2016-10-17 11:32:57 +08:00
Zhang Xianyi 3705f5675a Merge pull request #982 from martin-frbg/develop
Change file comments to work around clang 3.9 assembler bug; add support for Bay Trail atom
2016-10-17 11:32:20 +08:00
Martin Kroeker bce2b34f7a Merge pull request #1 from martin-frbg/martin-frbg-patch-1
Add Intel "Bay Trail" atom cpu
2016-10-16 22:51:42 +02:00
Martin Kroeker da83ec94d1 Merge pull request #2 from martin-frbg/martin-frbg-patch-1-1
Update cpuid_x86.c
2016-10-16 22:48:58 +02:00
Martin Kroeker 3409bccb21 Update cpuid_x86.c
Add Bay Trail "Pentium N3520" atom cpu
2016-10-16 22:45:44 +02:00
Martin Kroeker 8a8f3932eb Update dynamic.c
Add Bay Trail "Pentium N3520" atom
2016-10-16 22:40:00 +02:00
kaustubh 90e2321ac3 STRSM, DTRSM functions data prefetch
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
2016-10-14 16:41:28 +05:30
Martin Kroeker 4998e19869 Change file comments to work around clang 3.9 assembler bug 2016-10-13 16:51:08 +02:00
Howard Su ff1da01476 USE NPROCESSOR_CONF instead of NPROCESSOR_ONLN
to determine the number of CPUs. On ARM platforms, the number of
online CPUs increases as the workload grows,
while the number of configured CPUs is the maximum number of CPUs.
2016-10-13 12:37:50 +00:00
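As a rough illustration of the distinction this commit describes, the two counts can be queried through sysconf. This is a minimal sketch, not the project's actual code: the helper names configured_cpu_count and online_cpu_count are hypothetical, and it assumes a libc that provides the widely available _SC_NPROCESSORS_CONF and _SC_NPROCESSORS_ONLN extensions (glibc and the BSDs do).

```c
#include <unistd.h>

/* Hypothetical helper: number of processors configured on the system.
 * This count stays fixed even while cores are powered down or hot-unplugged. */
long configured_cpu_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_CONF);
    return n > 0 ? n : 1;   /* sysconf returns -1 on error; fall back to 1 */
}

/* Hypothetical helper: number of processors currently online.
 * On platforms that bring cores up and down with load, this can change
 * between calls, so sizing a thread pool from it at startup may undercount. */
long online_cpu_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;   /* same fallback on error */
}
```

A library that sizes its thread pool once at startup would, under these assumptions, call configured_cpu_count() rather than online_cpu_count(), so that cores which happen to be offline at that moment are still counted.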
Zhang Xianyi ef52a9266b Fixed #979. Patch for NetBSD. 2016-10-13 10:17:07 +08:00
Zhang Xianyi 4f38ae3199 Merge pull request #970 from martin-frbg/develop
Remove implicit inclusions of complex.h in various zdot implementations
2016-10-13 10:13:56 +08:00
Zhang Xianyi 4baf0c7cfc Merge pull request #980 from kiwifb/utest_ldflags
make utest/Makefile respect LDFLAGS
2016-10-13 10:13:12 +08:00
Zhang Xianyi 595a0224e4 Merge pull request #973 from vladimir-ch/fix-lapacke-xlarfb
LAPACKE: fix wrong direction check in LAPACKE_?larfb_work
2016-10-13 10:12:35 +08:00
François Bissey f124ffab47 make utest/Makefile respect LDFLAGS 2016-10-13 09:32:25 +13:00
Martin Kroeker 91610f3835 Update zdot_msa.c 2016-10-05 18:59:09 +02:00
Martin Kroeker 6e22ecf102 Update zdot.c 2016-10-05 18:58:03 +02:00
Martin Kroeker 6221d6df5f Update zdot.c 2016-10-05 18:57:14 +02:00
Vladimir Chalupecky 117d3371d4 LAPACKE: fix wrong direction check in LAPACKE_?larfb_work
Closes #971
2016-10-01 05:31:30 +09:00
Martin Kroeker 16446d1d23 Remove explicit include of complex.h 2016-09-29 23:45:56 +02:00
Martin Kroeker a6e9e0b94b Remove explicit include of complex.h 2016-09-29 23:43:28 +02:00
Martin Kroeker 3178e4fea0 Remove explicit include of complex.h 2016-09-29 23:41:43 +02:00
Martin Kroeker 95c245ddb0 Remove explicit include of complex.h 2016-09-29 23:40:36 +02:00
Martin Kroeker 4b1b27347f Remove explicit include of complex.h 2016-09-29 23:39:35 +02:00
Zhang Xianyi 161c927071 Merge pull request #968 from buffer51/develop
Updated CROSS_SUFFIX regex to work with CC containing arguments
2016-09-22 11:34:57 -04:00
Zhang Xianyi 662f89f059 Merge pull request #969 from sva-img/develop
DGEMM function split and data prefetch
2016-09-22 11:33:51 -04:00
Shivraj Patil 54747fe24a DGEMM function split and data prefetch
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-09-22 17:25:46 +05:30
Paul MUSTIÈRE 157ee498ac Updated CROSS_SUFFIX regex to work with CC containing arguments 2016-09-14 11:42:22 -07:00
Zhang Xianyi b09cc3b9bb Merge pull request #958 from intelfx/remove-stabs
common_arm.h, common_mips.h: get rid of .func directives
2016-09-13 16:15:37 -04:00
Ivan Shapovalov 6c0862a94f common_arm.h, common_mips.h: get rid of .func directives
.func/.endfunc are gcc/gas-specific directives for generating stabs
debug information (and nothing more). They are near-useless now because
DWARF is commonly used, and they are not implemented in Clang. Hence building
OpenBLAS with Clang fails, and there is no sane way to detect GCC vs.
anything else with preprocessor definitions.

Hence, just remove these directives.
2016-09-09 03:37:11 +03:00
Zhang Xianyi 842d842751 Update develop for 0.2.20.dev. 2016-09-01 00:01:23 -04:00
Zhang Xianyi 85636ff1a0 Merge branch 'develop' 2016-08-31 23:58:42 -04:00
Zhang Xianyi 821affb9a0 Update doc for 0.2.19. 2016-08-31 23:58:29 -04:00
Zhang Xianyi 515bc56ea9 Refs #946. Use nrm2 reference implementation for Power8. 2016-08-18 18:59:43 -07:00
Zhang Xianyi ae70b916f4 Refs #929. Deal with zero and NaNs for scale. 2016-08-18 10:24:42 -07:00
Zhang Xianyi 9ea0144482 Merge pull request #941 from sva-img/develop
MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
2016-08-18 09:31:31 -04:00
Zhang Xianyi 1f217a6175 Merge pull request #943 from ibmsoe/IBMMASS_Support
Added support of IBM's MASS library that optimizes performance on Pow…
2016-08-12 17:20:59 -04:00
nishidha@us.ibm.com 78348a2853 Added support of IBM's MASS library that optimizes performance on Power architectures 2016-08-11 14:43:26 +05:30
Shivraj Patil 9687437928 MIPS n32 ABI and build time mips simd support check
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-08-10 17:44:22 +05:30
Shivraj Patil d1c6469283 MIPS n32 ABI support, MSA support detection and rename ARCH, ARCHFLAGS
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-08-08 11:58:01 +05:30
Zhang Xianyi b544be914d Merge pull request #933 from ashwinyes/develop_aarch64_20160726_Dgemm_8x4_Opts
Cortex A57: Improvements to DGEMM 8x4 kernel
2016-07-26 09:54:31 -04:00
Ashwin Sekhar T K c54a29bb48 Cortex A57: Improvements to DGEMM 8x4 kernel 2016-07-26 10:58:21 +05:30
Zhang Xianyi ff4c5deafa Merge pull request #930 from sva-img/develop
P6600/I6400 Build fix.
2016-07-22 11:42:30 -04:00
Shivraj Patil 22b9c2747d P6600/I6400 Build fix. Reverted the changes that were made to support the MIPS n32 ABI
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-07-22 18:45:06 +05:30
Zhang Xianyi 27b5211ccd Merge pull request #927 from sva-img/develop
Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
2016-07-15 11:17:30 -04:00
Shivraj Patil beb1d076a4 Added MSA optimization for GEMV_N, GEMV_T, ASUM, DOT functions
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-07-15 18:38:25 +05:30
Zhang Xianyi 9e44f3ddd0 Refs #917 Avoid detecting gfortran bug on IBM POWER + Ubuntu 2016-07-14 13:09:36 -07:00