Zhang Xianyi
eece9fd889
Merge pull request #926 from vriera/develop
...
Complete support for MIPS n32 ABI
2016-07-14 15:49:33 -04:00
Zhang Xianyi
5dfa0712c3
Merge pull request #925 from martin-frbg/develop
...
Update zgetrf2.f, cpuid_x86.c, dynamic.c
2016-07-14 15:48:58 -04:00
Zhang Xianyi
8a592ee386
Merge pull request #924 from ashwinyes/develop_aarch64_improvements_20160714
...
Improvements to Aarch64 kernels
2016-07-14 15:47:55 -04:00
Zhang Xianyi
7f2409a8e1
Merge pull request #918 from sva-img/develop
...
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM.
2016-07-14 15:45:39 -04:00
Vicente Olivert Riera
7f28cd1f88
Complete support for MIPS n32 ABI
...
Signed-off-by: Vicente Olivert Riera <Vincent.Riera@imgtec.com>
2016-07-14 17:51:04 +01:00
Martin Kroeker
154729908e
Update cpuid_x86.c
2016-07-14 17:29:34 +02:00
Martin Kroeker
97bd1e42c8
Update cpuid_x86.c
2016-07-14 12:25:17 +02:00
Martin Kroeker
7de829f713
Update dynamic.c
...
Add Braswell (extended model 4, model 12) N3150 as Nehalem
2016-07-14 12:22:55 +02:00
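For context on the "extended model 4, model 12" wording above: on x86, the displayed model number is assembled from two fields of the CPUID leaf 1 EAX signature. The following is a minimal sketch of that decoding, not the code from dynamic.c or cpuid_x86.c; the names `cpu_signature` and `decode_signature` are hypothetical, and the sample EAX value in the note below is only illustrative.

```c
#include <stdint.h>

/* Hypothetical helper type: family/model as most tools display them. */
typedef struct {
    unsigned family;
    unsigned model;
} cpu_signature;

/* Decode the EAX value returned by CPUID leaf 1 (illustrative sketch). */
static cpu_signature decode_signature(uint32_t eax)
{
    unsigned base_model  = (eax >> 4)  & 0xF;   /* bits 7:4   */
    unsigned base_family = (eax >> 8)  & 0xF;   /* bits 11:8  */
    unsigned ext_model   = (eax >> 16) & 0xF;   /* bits 19:16 */
    unsigned ext_family  = (eax >> 20) & 0xFF;  /* bits 27:20 */
    cpu_signature sig;

    sig.family = base_family;
    sig.model  = base_model;

    /* The extended family is added only when the base family is 0xF. */
    if (base_family == 0xF)
        sig.family += ext_family;

    /* The extended model supplies the high nibble for families 6 and 0xF. */
    if (base_family == 0x6 || base_family == 0xF)
        sig.model |= ext_model << 4;

    return sig;
}
```

With an illustrative signature of 0x000406C3, this yields family 6 and model 0x4C, i.e. extended model 4 combined with model 12.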
Martin Kroeker
9b69d8a8e5
Update zgetrf2.f
...
Trivial typo correction (ZERBLA => XERBLA) to fix #910
2016-07-14 11:41:57 +02:00
Ashwin Sekhar T K
0a5ff9f9f9
Improvements to TRMM and GEMM kernels
2016-07-14 13:56:04 +05:30
Ashwin Sekhar T K
8a40f1355e
Improvements to GEMV kernels
2016-07-14 13:50:38 +05:30
Ashwin Sekhar T K
78782485b6
Improvements to COPY and IAMAX kernels
2016-07-14 13:49:34 +05:30
Ashwin Sekhar T K
8d86d14d3f
Add time prints in benchmark output
2016-07-14 13:48:13 +05:30
Ashwin Sekhar T K
925d4e1dc6
Add IAMAX and NRM2 benchmarks
2016-07-14 13:46:01 +05:30
Shivraj Patil
57df7956ee
Added CGEMM, ZGEMM, STRMM, DTRMM, CTRMM, ZTRMM. Updated macros in SGEMM, DGEMM, STRMM.
...
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-06-28 17:51:10 +05:30
Zhang Xianyi
437c7d64f2
Merge pull request #913 from dpfoose/develop
...
Small change to allow compiling with USE_OPENMP on MSVC
2016-06-27 10:05:30 -04:00
Zhang Xianyi
ca5c25c870
Merge pull request #907 from jeromerobert/bug786
...
Fix z/ctrmv stack allocation on AMD bulldozer and barcelona target
2016-06-27 10:04:54 -04:00
Zhang Xianyi
4a30a2584a
Merge pull request #897 from ksraste/develop
...
STRSM optimized for MSA
2016-06-27 10:04:18 -04:00
mdong
098d8ec5d6
remove input from clobbered list
2016-06-24 16:37:58 -04:00
Daniel Patrick Foose
a94f2b7848
Change to allow compiling with USE_OPENMP on MSVC
...
MSVC treats the declaration of omp_in_parallel and omp_get_num_procs without the modifiers __declspec(dllimport) and __cdecl as a redefinition.
2016-06-14 14:37:28 -04:00
Jerome Robert
d346c533b1
Fix z/ctrmv stack allocation on AMD bulldozer and barcelona target
...
* Hopefully, because this was found by trial and error (dark magic)
* Ref #786
2016-06-07 16:11:09 +02:00
Werner Saar
f04af36ad0
Merge pull request #898 from wernsaar/develop
...
added experimental support for optimized lapack fortran functions
2016-05-31 14:13:52 +02:00
Werner Saar
41000c8443
added directory for optimized lapack fortran codes and added dlaqr5.f
2016-05-31 12:53:07 +02:00
Kaustubh Raste
011431b9d7
STRSM optimized for MSA
...
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
2016-05-31 10:17:23 +05:30
Kaustubh Raste
c8a7860eb3
STRSM optimized
...
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
2016-05-30 21:17:00 +05:30
Zhang Xianyi
2daad2bcb5
Merge pull request #893 from biddisco/develop
...
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PRO…
2016-05-30 14:52:58 +08:00
Zhang Xianyi
bac478d17e
Merge pull request #891 from rndfax/develop
...
mips64/axpy: fix error when INCY == 0
2016-05-30 14:52:40 +08:00
John Biddiscombe
053044ae4d
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
...
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
2016-05-25 09:13:28 +02:00
Aleksey Kuleshov
fca66262c4
mips64/axpy: fix error when INCY == 0
2016-05-23 13:30:27 +03:00
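To illustrate why INCY == 0 is a corner case for AXPY: in a plain strided loop, a zero increment means every update lands on the same element of y. The sketch below is a generic reference-style loop, assuming the optimized kernel is expected to match it; it is not the mips64 kernel, and the name `axpy_ref` is hypothetical.

```c
/* Reference-style y := alpha * x + y with arbitrary strides (sketch). */
static void axpy_ref(int n, double alpha,
                     const double *x, int incx,
                     double *y, int incy)
{
    if (n <= 0 || alpha == 0.0)
        return;

    /* For negative strides, start from the far end, as reference BLAS does. */
    int ix = incx < 0 ? (1 - n) * incx : 0;
    int iy = incy < 0 ? (1 - n) * incy : 0;

    for (int i = 0; i < n; i++) {
        y[iy] += alpha * x[ix];  /* with incy == 0, iy never moves */
        ix += incx;
        iy += incy;
    }
}
```

With x = {1, 2, 3}, y = {10}, alpha = 2 and incy = 0, the loop accumulates all three products into y[0], giving 22. A kernel that assumes distinct y elements (for example by loading several y values before storing) can produce a different result in this case.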
Werner Saar
412bcd187a
optimized dtrsm_logic_LT_16x4_power8.S and dtrsm_macros_LT_16x4_power8.S
2016-05-23 11:20:41 +02:00
Werner Saar
bd06b246cc
Merge pull request #890 from wernsaar/develop
...
optimized dtrsm_kernel_LT for POWER8
2016-05-22 16:01:35 +02:00
Werner Saar
8b140220c8
optimized dtrsm_kernel_LT for POWER8
2016-05-22 15:20:04 +02:00
Werner Saar
318cad9c37
added trsm benchmarks for POWER8 to benchmark/Makefile
2016-05-22 13:51:47 +02:00
Werner Saar
8fb5a1aaff
added optimized dtrsm_LT kernel for POWER8
2016-05-22 13:09:05 +02:00
Zhang Xianyi
7d0358475d
Merge the patch for musl libc.
2016-05-22 01:08:44 +08:00
Zhang Xianyi
b46f680f01
Merge pull request #887 from ksraste/develop
...
STRSM optimization for MIPS P5600 and I6400 using MSA
2016-05-21 07:17:21 +08:00
Kaustubh Raste
ad9f317870
STRSM optimization for MIPS P5600 and I6400 using MSA
...
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
2016-05-20 10:59:03 +05:30
Zhang Xianyi
a8fcd89d6d
Merge pull request #886 from vriera/develop
...
Makefile.system: P5600 and I6400 cores need -mmsa
2016-05-19 19:59:09 +08:00
Zhang Xianyi
232335fd49
Merge pull request #885 from sva-img/develop
...
SGEMM optimization for MIPS P5600 and I6400 using MSA.
2016-05-19 19:58:32 +08:00
Vicente Olivert Riera
e12cff87b8
Makefile.system: P5600 and I6400 cores need -mmsa
...
Signed-off-by: Vicente Olivert Riera <Vincent.Riera@imgtec.com>
2016-05-19 10:56:53 +01:00
Shivraj Patil
c4ba40e308
SGEMM optimization for MIPS P5600 and I6400 using MSA. Unrolled k loop in DGEMM kernel function
...
Signed-off-by: Shivraj Patil <shivraj.patil@imgtec.com>
2016-05-19 11:04:42 +05:30
Zhang Xianyi
7a19065369
Merge pull request #878 from ksraste/develop
...
DTRSM bug fix for MIPS P5600 and I6400
2016-05-19 11:16:43 +08:00
Werner Saar
8a149e6294
Merge pull request #879 from wernsaar/develop
...
optimized dgemm and dgetrf for POWER8
2016-05-17 17:10:36 +02:00
Werner Saar
956be69e1d
optimized getrf_single.c for POWER8
2016-05-17 16:19:53 +02:00
Werner Saar
6a2bde7a2d
optimized dgemm and dgetrf for POWER8
2016-05-17 14:45:27 +02:00
Kaustubh Raste
d7cbc7ac13
DTRSM bug fix for MIPS P5600 and I6400
...
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
2016-05-17 15:48:02 +05:30
Zhang Xianyi
8bf71e9e06
Merge pull request #877 from jeromerobert/bug873
...
Disable multi-threading in swap
2016-05-16 23:21:56 +08:00
Jerome Robert
40af513669
Disable multi-threading in swap
...
* Close #873
2016-05-16 13:07:55 +00:00
Werner Saar
88011f625d
Merge pull request #876 from wernsaar/develop
...
optimized dgemm on power8 for 20 threads
2016-05-16 14:52:40 +02:00
Werner Saar
8310d4d3f7
optimized dgemm for 20 threads
2016-05-16 14:14:25 +02:00