Commit Graph

231 Commits

Author SHA1 Message Date
Ashwin Sekhar T K 2757b49767 THUNDERX2T99: Add Optimized CGEMM Implementation 2017-01-30 17:44:26 +05:30
Ashwin Sekhar T K f279ff4789 THUNDERX2T99: Add Optimized SGEMM Implementation 2017-01-16 21:44:33 +05:30
Zhang Xianyi 0863a0d4b4 Merge pull request #1061 from ashwinyes/develop_aarch64_vulcan_thunderx_patch
Add new targets for ARM64
2017-01-16 13:20:10 +08:00
Werner Saar c1c5a63d3c prepared parameter.c for UNROLL values that are not a power of two 2017-01-11 09:50:28 +01:00
Ashwin Sekhar T K 4b55fae337 ARM64: Add Cavium THUNDERX2T99 Target 2017-01-11 11:18:40 +05:30
Ashwin Sekhar T K 0b8e876d89 VULCAN: Add optimized DGEMM implementation 2017-01-10 15:01:37 +05:30
Ashwin Sekhar T K 4713e7c47f ARM64: Add the VULCAN Target 2017-01-10 15:01:17 +05:30
jiahaipeng 1aa1e6cb54 Modify blas_l1_thread.c to support multi-threading for L1 functions with a return value 2017-01-10 11:47:06 +08:00
Werner Saar b9bb009236 Merge pull request #1053 from wernsaar/develop
prepared driver/level3 functions for UNROLL values, that are not a po…
2017-01-09 11:17:38 +01:00
Werner Saar a2672d5589 prepared driver/level3 functions for UNROLL values that are not a power of two 2017-01-09 10:38:15 +01:00
Martin Kroeker 51aa157e64 Relocate declaration of alloc_lock outside ifdef block 2017-01-09 01:10:43 +01:00
Martin Kroeker 87c7d10b34 Fix thread data races detected by helgrind 3.12
Ref. #995, may help solve issues seen in #660 and #883
2017-01-08 23:33:51 +01:00
Martin Kroeker 0ef7841473 Update xerbla.c 2017-01-04 23:16:48 +01:00
Martin Kroeker 104ad066af Use appropriate int32/int64 format for error number in message string 2016-12-30 00:45:59 +01:00
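The int32/int64 format issue named in the commit above can be illustrated with a minimal sketch. This is not the code from the commit: the `USE_64BIT_INT` switch, the `err_int` type, the `ERR_FMT` macro, the `format_error` helper, and the message text are all hypothetical names chosen for illustration. The only point shown is that the `printf` conversion has to match the width of the integer being printed.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical build-time switch standing in for a 32-bit vs 64-bit
 * integer interface; the names below are illustrative assumptions. */
#ifdef USE_64BIT_INT
typedef int64_t err_int;
#define ERR_FMT "%" PRId64
#else
typedef int32_t err_int;
#define ERR_FMT "%" PRId32
#endif

/* Format an error message whose numeric field uses the conversion
 * that matches err_int. Returns the snprintf result. */
int format_error(char *buf, size_t len, const char *routine, err_int info)
{
    return snprintf(buf, len,
                    "On entry to %s parameter number " ERR_FMT
                    " had an illegal value",
                    routine, info);
}
```

Passing a 64-bit value to a plain `%d` conversion is undefined behaviour, which is why the format string is selected together with the type.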
Alex Arslan a16ace68f5 Include system headers on FreeBSD 2016-11-16 21:58:20 -08:00
Martin Kroeker 596ead0f8d Add files via upload 2016-11-06 23:26:39 +01:00
Zhang Xianyi 66c9a9b33d Merge pull request #981 from howard0su/develop
Use NPROCESSOR_CONF instead of NPROCESSOR_ONLN
2016-10-17 11:32:57 +08:00
Martin Kroeker 8a8f3932eb Update dynamic.c
Add Bay Trail "Pentium N3520" atom
2016-10-16 22:40:00 +02:00
Howard Su ff1da01476 Use NPROCESSOR_CONF instead of NPROCESSOR_ONLN
to determine the number of CPUs. On ARM platforms,
the number of online CPUs increases when there is more workload,
while the configured CPU count is the maximum number of CPUs.
2016-10-13 12:37:50 +00:00
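The distinction drawn in the commit above can be sketched with `sysconf`. This assumes the commit's names correspond to the `_SC_NPROCESSORS_CONF` and `_SC_NPROCESSORS_ONLN` selectors, which are widely available (glibc, the BSDs, macOS) but are not required by POSIX. The helper names are hypothetical and this is not the code from the commit.

```c
#include <unistd.h>

/* Processors configured on the system; may include CPUs that are
 * currently offline (for example, powered down under light load). */
long cpus_configured(void)
{
    return sysconf(_SC_NPROCESSORS_CONF);
}

/* Processors currently online; this can change at run time. */
long cpus_online(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Hypothetical helper: prefer the configured count, fall back to the
 * online count, and never report fewer than one CPU. */
int pick_cpu_count(void)
{
    long n = cpus_configured();
    if (n < 1)
        n = cpus_online();
    if (n < 1)
        n = 1;
    return (int)n;
}
```

Sizing a thread pool from the configured count avoids under-provisioning when some cores happen to be offline at start-up.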
Zhang Xianyi ef52a9266b Fixed #979. Patch for NetBSD. 2016-10-13 10:17:07 +08:00
Martin Kroeker 7de829f713 Update dynamic.c
Add Braswell (extended model 4, model 12) N3150 as Nehalem
2016-07-14 12:22:55 +02:00
John Biddiscombe 053044ae4d Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
2016-05-25 09:13:28 +02:00
Ashwin Sekhar T K 0fb380c966 Update NUMA CPU binding
When all processes can be
accommodated within the current node,
then use cores from the current node only.
2016-04-29 11:58:15 +05:30
Werner Saar 78b05f6476 bugfix for EXCAVATOR and DYNAMIC_ARCH 2016-04-25 10:13:30 +02:00
Werner Saar 2b967590a0 bugfix in dynamic.c 2016-04-25 09:08:38 +02:00
Theoractice aa744dfa59 Update memory.c 2016-03-22 20:02:37 +08:00
theoractice 61cf8f74d9 Fix access violation on Windows while static linking 2016-03-22 19:14:54 +08:00
Zhang Xianyi 68eb4fa329 Add missing openblas_env makefile. 2016-03-09 14:52:47 -05:00
Zhang Xianyi 05196a8497 Refs #716. Only call getenv at init function. 2016-03-09 12:50:07 -05:00
Jerome Robert 53ba1a77c8 ztrmv_L.c: no longer need a 4kB buffer
Fix #786
2016-03-05 19:07:03 +01:00
Zhang Xianyi 1edf30b790 Change Opteron(SSE3) to Opteron_SSE3 in the dynamic core name. 2016-03-01 20:13:08 +08:00
Zhang Xianyi 6b85dbb6dc Refs #696. Turn off stack limit setting on Linux.
I cannot reproduce SEGFAULT of lapack-test with default stack size
on ARM Linux.
2016-02-24 14:21:42 -05:00
Zhang Xianyi d06b92906a Add gemm3m building for CMake. 2016-02-12 05:02:51 +08:00
Jerome Robert 78dcf5c3d5 Improve performance of ztrmv on small matrices
* Use stack allocation
* Disable multi-threading
* Ref #727
2016-02-08 11:25:02 +01:00
Martin Kroeker 935356c34f Update dynamic.c and cpuid_x86.c for Intel Avoton.
Second part of "support Intel Avoton via Nehalem kernel"
2016-02-02 13:42:55 -05:00
Zhang Xianyi f5df444ceb Merge pull request #762 from jeromerobert/bug760
Let openblas_get_num_threads return the number of active threads
2016-01-26 08:45:16 -06:00
Zhang Xianyi aaa8551c57 Merge pull request #749 from lotheac/illumos_fixes
illumos fixes
2016-01-26 08:42:20 -06:00
Jerome Robert 0d87c1ffb6 Let openblas_get_num_threads return the number of active threads
... not the number of allocated threads.

Close #760
2016-01-26 13:04:16 +01:00
Lauri Tirkkonen e737e32fd1 RLIMIT_NPROC doesn't exist on illumos 2016-01-22 18:55:51 +02:00
Lauri Tirkkonen 97cd4b8aee illumos fixes to memory.c 2016-01-22 18:55:43 +02:00
Werner Saar b07d733a71 added updates for syrk and syr2k 2016-01-21 13:16:44 +01:00
Zhang Xianyi 055b481386 Fixed CMake bug for single core. 2016-01-15 06:42:54 +08:00
Werner Saar 0d22551a6b increase the stack size limit in the constructor 2015-11-20 09:23:01 +01:00
Ralph Campbell fbc21266e6 Minor C code fixes in driver/ 2015-11-09 14:15:49 +05:30
Zhang Xianyi 839395fc25 Detect AMD Trinity and Richland. 2015-10-29 02:53:29 +08:00
j-bo 6040858b22 Fix #673
Add missing header declarations when compiling for Android ARM7
2015-10-27 13:55:24 +01:00
Zhang Xianyi 70642fe4ed Refs #668. Raise the signal when pthread_create fails.
Thanks to James K. Lowden for the patch.
2015-10-26 19:02:51 -05:00
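The behaviour described in the commit above (raising a signal when `pthread_create` fails) might look roughly like the following sketch. The helper names and the choice of `SIGINT` are assumptions made for illustration; the source does not say which signal the patch raises.

```c
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Trivial thread body used only to exercise the helper below. */
void *noop_worker(void *arg)
{
    return arg;
}

/* Hypothetical helper: start a thread and, if creation fails, report
 * the error and raise a signal so the failure is not silently ignored.
 * SIGINT is an assumed choice, not taken from the commit. */
int spawn_or_raise(pthread_t *thread, void *(*fn)(void *), void *arg)
{
    int rc = pthread_create(thread, NULL, fn, arg);
    if (rc != 0) {
        /* pthread_create returns the error number directly. */
        fprintf(stderr, "pthread_create failed: %s\n", strerror(rc));
        raise(SIGINT);
    }
    return rc;
}
```

On older toolchains this needs to be linked with `-pthread`.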
Zhang Xianyi 2feef49fa8 Merge branch 'develop' into cmake
Conflicts:
	driver/others/memory.c
2015-10-26 14:54:34 -05:00
Zhang Xianyi 1ce054fcb3 Refs #669. Fixed the build bug with gcc on Mac OS X. 2015-10-22 11:07:35 -05:00
Zhang Xianyi d8392c1245 Fix CMake config bugs. 2015-10-20 04:30:55 +08:00