2262 Commits

Author SHA1 Message Date
Ashwin Sekhar T K 759f37feba ARM64: Let target VULCAN inherit THUNDERX2T99 properties 2017-01-16 21:44:19 +05:30
Martin Kroeker e8d0e66982 Merge pull request #1067 from martin-frbg/msysinst
Fix DESTDIR support for cygwin/msys2 install
2017-01-16 16:03:53 +01:00
Martin Kroeker 331fd51260 Fix DESTDIR support for cygwin/msys2 install
fixes #1066
2017-01-16 15:15:46 +01:00
Zhang Xianyi 0863a0d4b4 Merge pull request #1061 from ashwinyes/develop_aarch64_vulcan_thunderx_patch
Add new targets for ARM64
2017-01-16 13:20:10 +08:00
Martin Kroeker 2e5f906f41 Update Makefile.install (#1064)
* Update Makefile.install to reflect name change of lapacke_mangling.h source
2017-01-11 17:40:06 +01:00
Werner Saar d1a97bad39 Merge pull request #1063 from wernsaar/develop
prepared kernel/setparam-ref.c for UNROLL values, that are not a power of two
2017-01-11 12:37:45 +01:00
Werner Saar 28e2fab33e prepared kernel/setparam-ref.c for UNROLL values, that are not a power of two 2017-01-11 11:56:50 +01:00
Werner Saar 752fdc6f82 Merge pull request #1062 from wernsaar/develop
prepared parameter.c for UNROLL values, that are not a power of two
2017-01-11 10:30:46 +01:00
Werner Saar c1c5a63d3c prepared parameter.c for UNROLL values, that are not a power of two 2017-01-11 09:50:28 +01:00
Werner Saar 209b63197e prepared lapack/lauum for UNROLL values, that are not a power of two 2017-01-11 07:29:17 +01:00
Ashwin Sekhar T K 4b55fae337 ARM64: Add Cavium THUNDERX2T99 Target 2017-01-11 11:18:40 +05:30
Ashwin Sekhar T K 738d622feb ARM64: Fix auto detect of ARM64 cpus 2017-01-11 11:18:40 +05:30
Andrew Pinski 95649dee28 THUNDERX: Add optimized version of daxpy
This is better for single core but does not change anything for multiple cores
2017-01-11 11:18:36 +05:30
Martin Kroeker 3a8c5180b9 Merge pull request #1060 from martin-frbg/lapacke-mingw
Split LAPACKE 3.7.0 obj list (take 2, missed splitting the actual ar command invocation)
2017-01-10 19:09:49 +01:00
Martin Kroeker 7611a41f40 Split LAPACKE 3.7.0 obj list (take 2)
Missed the splitting of the actual ar call
2017-01-10 17:11:35 +01:00
Werner Saar 1a39b92b1d Merge pull request #1059 from wernsaar/develop
updated some level1 funcions, that are not thread save
2017-01-10 16:00:28 +01:00
Werner Saar dd6212e684 updated some level1 funcions, that are not thread save 2017-01-10 14:05:07 +01:00
Werner Saar 9bcf50872b Merge pull request #1058 from wernsaar/develop
prepared lapack/potrf functions for UNROLL values, that are not a pow…
2017-01-10 11:30:08 +01:00
Werner Saar c81dc6322f prepared lapack/potrf functions for UNROLL values, that are not a power of two 2017-01-10 10:50:28 +01:00
Andrew Pinski 8fdb0655e9 THUNDERX: Add an optimized version of ddot 2017-01-10 15:01:37 +05:30
Andrew Pinski fb200c7245 ARM64: Add Cavium THUNDERX Target 2017-01-10 15:01:37 +05:30
Ashwin Sekhar T K 0b8e876d89 VULCAN: Add optimized DGEMM implementation 2017-01-10 15:01:37 +05:30
Ashwin Sekhar T K 4713e7c47f ARM64: Add the VULCAN Target 2017-01-10 15:01:17 +05:30
Ashwin Sekhar T K 6085386b10 CORTEXA57: Add assembly kernels for copy routines 2017-01-10 15:01:05 +05:30
Zhang Xianyi 002b41f024 Merge pull request #1055 from ksraste/develop
Add msa optimization for AXPY, COPY, SCALE, SWAP
2017-01-10 13:58:26 +08:00
jiahaipeng 84b8170bfb Adding multi-threading for copy, dot, rot, and asum funcitons 2017-01-10 11:48:58 +08:00
jiahaipeng 1aa1e6cb54 modify the blas_l1_thread.c for support multi-threded for L1 fuction with return value 2017-01-10 11:47:06 +08:00
Martin Kroeker cbd2bf1f6e Merge pull request #1057 from martin-frbg/lapacke-mingw
Split the obj list of LAPACKE 3.7.0
2017-01-09 20:45:26 +01:00
Martin Kroeker 9f5cfd43dc Split the obj list of LAPACKE 3.7.0
Split obj list to allow building with mingw (argument list too long for the msys ar)
2017-01-09 18:29:53 +01:00
kaustubh 1480f3df71 Add msa optimization for AXPY, COPY, SCALE, SWAP
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
2017-01-09 18:27:23 +05:30
kaustubh 88afb3bc94 Add msa optimization for AXPY, COPY, SCALE, SWAP
Signed-off-by: kaustubh <kaustubh.raste@imgtec.com>
2017-01-09 18:22:09 +05:30
Werner Saar 2ffbbb54f6 Merge pull request #1054 from wernsaar/develop
prepared lapack/getrf functions for UNROLL values, that are not a pow…
2017-01-09 13:38:56 +01:00
Werner Saar 3e1bbd6b5f prepared lapack/getrf functions for UNROLL values, that are not a power of two 2017-01-09 12:57:26 +01:00
Zhang Xianyi b678471d65 Merge branch 'z13' into develop
Conflicts:
	CONTRIBUTORS.md
2017-01-09 05:52:42 -05:00
Zhang Xianyi 864e202afd Add USE_TRMM=1 for IBM z13 in kernel/Makefile.L3 2017-01-09 05:48:09 -05:00
Werner Saar b9bb009236 Merge pull request #1053 from wernsaar/develop
prepared driver/level3 functions for UNROLL values, that are not a po…
2017-01-09 11:17:38 +01:00
Werner Saar a2672d5589 prepared driver/level3 functions for UNROLL values, that are not a power of two 2017-01-09 10:38:15 +01:00
Zhang Xianyi c2496d8f48 Merge pull request #1050 from martin-frbg/fflags
Apply COMMON_OPT to default FFLAGS
2017-01-09 16:23:22 +08:00
Zhang Xianyi fb0afdaf99 Merge pull request #1052 from martin-frbg/locking
Fix thread data races detected by helgrind 3.12
2017-01-09 16:22:58 +08:00
Martin Kroeker 51aa157e64 Relocate declaration of alloc_lock outside ifdef block 2017-01-09 01:10:43 +01:00
Martin Kroeker 87c7d10b34 Fix thread data races detected by helgrind 3.12
Ref. #995, may possibly help solve issues seen in 660,883
2017-01-08 23:33:51 +01:00
Martin Kroeker d0035b857d Apply COMMON_OPT to default FFLAGS to avoid building non-optimized LAPACK by mistake 2017-01-08 21:17:22 +01:00
Werner Saar c61a7cd293 Merge pull request #1049 from wernsaar/develop
removed blas_thread_shutdown from gensymbol
2017-01-08 09:30:19 +01:00
Werner Saar a8bb5003de removed blas_thread_shutdown from gensymbol 2017-01-08 08:51:30 +01:00
Zhang Xianyi 9a48adff3f Merge pull request #1047 from brada4/erre
Improve R benchmark timing
2017-01-08 11:19:06 +08:00
Zhang Xianyi 823a40a110 Merge pull request #1040 from martin-frbg/develop
Use appropriate int32/int64 format for error number in message string
2017-01-08 11:18:38 +08:00
Zhang Xianyi 0bd706ac8d Merge pull request #1036 from sva-img/develop
Added prefetch to CGEMV and ZGEMV.
2017-01-08 11:18:05 +08:00
Andrew 8379550076 anti GC and reflow 2017-01-07 19:01:42 +01:00
Andrew fc148b7e4d init 2017-01-07 19:01:21 +01:00
Werner Saar 5bb2b91a03 Merge pull request #1046 from wernsaar/develop
updated lapack to version 3.7.0 with latest patches from git
2017-01-07 15:09:56 +01:00