Commit Graph

84 Commits

Author SHA1 Message Date
wernsaar 793175be3a added experimental support for big numa machines 2014-08-02 13:40:16 +02:00
Zhang Xianyi 134fa320e6 Refs #415. Fixed the x86/i386 compiling bug with DYNAMIC_ARCH=1. 2014-07-17 15:02:01 +08:00
Zhang Xianyi c94762bb56 Refs #401. Added NO_AVX2 flag for old binutils (e.g. RHEL6) 2014-07-16 08:38:25 +08:00
Timothy Gu 6c2ead30f0 Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
wernsaar 88b6bf251a force fallback for x86 32bit 2014-06-22 17:27:11 +02:00
wernsaar 4a2ab7460b Ref #391: force fallback for x86 32bit 2014-06-22 13:51:17 +02:00
wernsaar 316df0e821 fixed bug for INTERFACE64 2014-06-22 09:49:20 +02:00
wernsaar 438002204d Ref #393: fix for INTERFACE64=0 and ARCH_X86 in divtable 2014-06-21 12:29:23 +02:00
wernsaar 409b52255c changed default optimization flag from O3 to O2 for ARM 2014-05-16 14:36:24 +02:00
wernsaar a35a1a9ae7 changed makefiles for lapack development 2014-05-07 11:33:02 +02:00
Zhang Xianyi 75acf96d94 Refs #329 #287. Only disable -fopenmp for LAPACK Fortran codes on Windows. 2014-01-24 15:39:46 +08:00
wernsaar 2594728eb7 Merge remote branch 'origin/develop' into haswell 2013-12-01 16:53:39 +01:00
wernsaar 65ebab0688 modified Makefile.system 2013-12-01 16:46:32 +01:00
wernsaar 0b6e13b689 Merge remote branch 'origin/develop' into haswell 2013-12-01 13:38:11 +01:00
wernsaar 5c648a8984 Merge remote branch 'origin/develop' into haswell 2013-12-01 11:25:33 +01:00
Zhang Xianyi 5048a80032 Refs #283. Fixed the incorrect usage of long data type for Windows 64. 2013-11-14 13:46:42 +08:00
Zhang Xianyi dfd1064d7b refs #287. Don't enable OpenMP for netlib LAPACK sequential Fortran codes. 2013-11-02 15:09:33 +08:00
Zhang Xianyi c937090121 Added gfortran dependency for LSB/lsbcc. 2013-10-22 13:24:47 +08:00
Zhang Xianyi c92ae012a6 Refs #279. Provide ONLY_CBLAS flag. If you only need CBLAS without
a fortran compiler, please try make ONLY_CBLAS=1.

This mode only compiler CBLAS without BLAS fortran interface and LAPACK.
2013-08-21 00:03:25 +08:00
Zhang Xianyi 2638370844 Init code base for Intel Haswell. 2013-08-13 00:54:59 +08:00
Zhang Xianyi 673e453b3f Enable bulldozer kernels. 2013-08-05 16:07:54 +08:00
Zhang Xianyi a07cc39571 Refs #266. Fixed the compiling bug with Open64 5.0. 2013-07-31 14:41:39 +08:00
Zhang Xianyi 5b504d6c23 Refs #263. Rollback bulldozer and piledriver kernels to barcelona kernels. 2013-07-28 17:39:24 +08:00
Zhang Xianyi 77b572fa0b Merge branch 'loongson3a' into develop
Conflicts:
	Makefile.system
2013-07-20 22:33:17 +08:00
Zhang Xianyi b67252c2e4 Ensure the correct stack alignment on Win32. 2013-07-17 15:19:07 +08:00
Zhang Xianyi e80e285928 Update build matrix for Travis CI. 2013-07-11 23:49:29 +08:00
Zhang Xianyi 6df39ad9e7 Refs #248. Support LAPACK and LAPACKE with lsbcc.
For LAPACKE, use LAPACK_COMPLEX_STRUCTURE.
The reson is lsbcc didn't define complex I in complex.h.
2013-07-10 16:02:27 +08:00
Zhang Xianyi 3eb5af1955 Refs #247. Included lapack source codes. Avoid downloading tar.gz from netlib.org
Based on 3.4.2 version, apply patch.for_lapack-3.4.2.
2013-07-09 18:13:48 +08:00
Zhang Xianyi f54f5bac9e Refs #248. Fixed the LSB compatiable issue for BLAS only.
For example, make CC=lsbcc NO_LAPACK=1.
2013-07-09 15:38:03 +08:00
Zhang Xianyi 886cbaf4e4 Support AMD Piledriver by bulldozer kernels. 2013-07-06 12:06:43 -03:00
Zhang Xianyi cc522aa21d Use quiet make for Travis CI. 2013-07-05 14:52:57 +08:00
Zhang Xianyi cd1d473ba0 Merge pull request #230 from wernsaar/develop
Refs #230. New dgemm and sgemm Kernel for BULLDOZER
2013-06-13 07:29:27 -07:00
Zhang Xianyi 56f160134d Refs #231. Change the default C compiler to clang on Mac OSX. 2013-06-13 22:15:19 +08:00
wernsaar d854b30ae6 Added UNROLL values for 3M to getarch_2nd.c, Makefile.system and Makefile.L3 2013-06-09 17:26:42 +02:00
Zhang Xianyi 960b0c88a7 Refs #227. Detected LLVM/Clang compiler. 2013-06-06 23:43:40 +08:00
Zhang Xianyi f2fb8c7035 Change LIBSUFFIX from .lib to .a on windows. 2013-06-04 16:05:28 +08:00
Zhang Xianyi 357078b93e Refs #216. Revert the default value of GEMM_MULTITHREAD_THRESHOLD to 4. 2013-05-03 09:08:54 +08:00
Zhang Xianyi 48bdc1ad3b Added NO_PARALLEL_MAKE flag to disable parallel make. 2013-04-15 21:37:30 +08:00
Zhang Xianyi 990efcab6e Merge branch 'loongson3b' into loongson3a 2013-04-11 16:11:03 +00:00
Zhang Xianyi 75a5dc3975 Added the configure for the host loongcc compiling on Loongson3. 2013-04-11 16:10:47 +00:00
Xianyi Zhang 6958c1a1aa Fixed the SEGFAULT bug with Loongcc and Loongson3. 2013-04-11 15:33:43 +08:00
Xianyi Zhang 1a57717b1a Added the configuration of Loongcc compiler for Loongson 3 CPU. 2013-04-07 15:42:07 +08:00
Zhang Xianyi 5c8bf6ae0e Merge branch 'bulldozer' into develop 2013-02-10 01:19:42 +08:00
Zaheer Chothia 4db6660de4 Refs #185. Add missing 'const' to declarations in <cblas.h>. Thanks to Dan Povey!
The 'const' modifications were done automatically using this scripts:
https://kaldi.svn.sourceforge.net/svnroot/kaldi/sandbox/dan/tools/for_openblas
2013-01-20 22:52:51 +01:00
Zhang Xianyi b7c0fa6bd2 Init AMD Bulldozer codebase. 2012-12-06 07:29:54 -05:00
Alexander Nasonov e85549ee11 Fix NetBSD build. 2012-11-10 23:20:44 +00:00
Zhang Xianyi 08c177ca36 Refs #145. Update LAPACK to 3.4.2 version. 2012-09-29 23:14:39 +08:00
Zhang Xianyi 2573311308 refs #140. Fixed zdot incompatibility ABI issue with GCC 4.7 on Win 32.
GCC 4.7 uses MSVC ABI on Win 32. This means the caller pops the hidden pointer for returning
aggregate structures larger than 8 bytes.
2012-09-24 20:34:33 +08:00
Zhang Xianyi 758e34efbb Fixed the detection bug on Loongson 3A server. 2012-09-21 10:14:07 +00:00
Zhang Xianyi f76a384841 Refs #139. Added NO_AVX flag to use old Nehalem kernels on Sandy Bridge.
For example, make NO_AVX=1 or make DYNAMIC_ARCH=1 NO_AVX=1
2012-09-17 23:25:46 +08:00