Commit Graph

501 Commits

Author SHA1 Message Date
Hank Anderson 0d8e227ea7 Changed strategy for setting preprocessor definitions.
Instead of generating separate object files for each permutation of
defines for a source file, GenerateNamedObjects now writes an entirely
new source file and inserts the defines as #define c statements.

This solves a problem I ran into with ar.exe where it was refusing to
link objects that had the same filename despite having different paths.
2015-02-24 12:26:33 -06:00
Hank Anderson 12d1fb2e40 Fixed incorrect object name in kernel CMakeLists.txt 2015-02-24 10:30:16 -06:00
Hank Anderson 1b7f427401 Added conj gemv objects for complex build. 2015-02-23 10:24:31 -06:00
Hank Anderson b2284647a3 More complex objects. 2015-02-23 07:51:05 -06:00
Hank Anderson a6116e5859 Added some more complex-only objects. 2015-02-22 17:49:28 -06:00
Hank Anderson 714638c187 Added some TRMM objects for complex types. 2015-02-19 16:11:51 -06:00
Hank Anderson e27c372e53 Fixed reuse of float_char from parent loop.
Fixed in/it/on/otcopy names.
2015-02-19 13:53:29 -06:00
Hank Anderson f3f2b3d768 Added complex and single netlib-lapack fortran sources to lapack.cmake. 2015-02-19 12:26:11 -06:00
Hank Anderson 9492298048 Added other float types to Makefile.L3. 2015-02-18 13:01:05 -06:00
Hank Anderson 14fd3d35de Added checks for missing defines in kernel. 2015-02-18 10:25:01 -06:00
Hank Anderson cebc07cebd ParseMakefileVars now recursively parses included makefiles. 2015-02-17 22:09:41 -06:00
Hank Anderson 33c5e8db7f Added a helper function for setting the L1 kernel defaults.
Added loop to build objects with different KERNEL defines.
2015-02-17 21:36:23 -06:00
Hank Anderson 4662a0b13a Changed generate functions to iterate through a list of float types.
This will generate obj files for SINGLE/DOUBLE/COMPLEX/DOUBLE COMPLEX.
2015-02-15 17:44:37 -06:00
Hank Anderson 162791e30e Added common objects from kernel Makefile. 2015-02-10 12:42:05 -06:00
Hank Anderson c0624a26be Fixed some dgemm_copy function names. 2015-02-09 14:34:29 -06:00
Hank Anderson 4bfaf1ce66 Removed some list appends I missed. 2015-02-09 12:56:55 -06:00
Hank Anderson e8c39138c6 Removed return value from GenerateNamedObjects.
It sets DBLAS_OBJS directly to save a bunch of list appending in the
CMakeLists.txt files.
2015-02-09 12:28:09 -06:00
Hank Anderson f992799226 Added the rest of Makefile.L3. 2015-02-09 10:47:35 -06:00
Hank Anderson 4c65afcce1 Changed kernel filenames to vars. These will need to be read from KERNEL.
Added some kernel/L3 objects.
2015-02-09 09:52:14 -06:00
Hank Anderson 7fa5c4e2fd Fixed some case issues with ARCH.
Added some kernel and driver/others objects.
2015-02-08 15:29:18 -06:00
Hank Anderson fa0e6a6c93 Added the rest of the L1 kernel makefile. 2015-02-07 21:37:46 -06:00
Hank Anderson 38681fb1c6 Added more kernel files. 2015-02-07 12:54:30 -06:00
Hank Anderson 189fadfde0 Started implementing kernel/Makefile in cmake. 2015-02-05 21:05:11 -06:00
Werner Saar ddf983d643 added optimizations for steamroller 2014-12-30 20:14:45 +08:00
Werner Saar 4319769b79 added target processor STEAMROLLER 2014-12-28 20:16:46 +08:00
Werner Saar 587e16fba3 Ref #458: Backport, sandybrigde uses nehalem zgemm kernel 2014-12-22 17:01:18 +01:00
Werner Saar 6261342de3 small optimization on dgemm_kernel for N=1 2014-12-18 20:35:51 +01:00
Werner Saar bc5fff7085 changed inline assembler labels to short form 2014-12-07 12:38:54 +01:00
Zhang Xianyi 0cf29ba6d2 Fixed a bug of sgemm sandy bridge kernel.
Reported by Julia project. JuliaLang/julia#9084
2014-12-03 17:38:41 +08:00
Zhang Xianyi 2fb02626da Update organization info. 2014-11-25 15:28:58 +08:00
Zhang Xianyi a85c2785ae Refs #467. Added generic kernel file for x86_64. 2014-11-24 15:34:48 +08:00
Benedikt Huber 58c90d5937 # The first commit's message is:
Optimizations for APM's xgene-1 (aarch64).

1) general system updates to support armv8 better.  Make all did not work, one needed to supply TARGET=ARMV8.
2) sgem 4x4 kernel in assembler using SIMD, and configuration changes to use it.
3) strmm 4x4 kernel in C.  Since the sgem kernel does 4x4, the trmm kernel must also do 4xN.

Added Dave Nuechterlein to the contributors list.
2014-11-11 22:19:23 +08:00
wernsaar 7aae4a62e7 enabled use of GEMM3M functions 2014-09-20 14:27:10 +02:00
wernsaar b7c9566eea removed obsolete gemv kernel files 2014-09-14 11:00:53 +02:00
wernsaar 6df1b0be81 optimized zgemv_n_microk_sandy-4.c 2014-09-14 10:21:22 +02:00
wernsaar 2ac1e076c1 added optimized zgemv_n kernel for sandybridge 2014-09-14 09:02:05 +02:00
wernsaar 9908b6031c bugfix in KERNEL.PILEDRIVER 2014-09-13 16:26:53 +02:00
wernsaar 8f100a14f2 optimized cgemv_t kernel for haswell 2014-09-13 16:13:27 +02:00
wernsaar 53b5726b04 added optimized cgemv_t kernel for haswell 2014-09-13 15:14:12 +02:00
wernsaar 1a352b24e6 updated KERNEL.HASWELL 2014-09-13 12:23:27 +02:00
wernsaar 5194818d4b updated zgemv_t_4.c 2014-09-13 09:48:34 +02:00
wernsaar 8a39cdb1c1 added optimized zgemv_t kernel for haswell 2014-09-13 09:47:07 +02:00
wernsaar 0a1390f2d8 enabled optimized zgemv_t kernel for bulldozer 2014-09-12 17:43:47 +02:00
wernsaar a8b0812feb optimized zgemv_t for bulldozer 2014-09-12 17:42:25 +02:00
wernsaar a0fb68ab42 added optimized zgemv_t kernel for bulldozer 2014-09-12 17:04:22 +02:00
wernsaar 44c11165d5 bugfix in cgemv_t_4.c 2014-09-12 14:12:24 +02:00
wernsaar 564be4eb72 added optimized cgemv_t kernel 2014-09-12 13:38:01 +02:00
wernsaar 107c3ea7d5 added optimized zgemv_t routine 2014-09-12 12:35:20 +02:00
wernsaar bb8d698335 optimized zgemv_n_microk_haswell-4.c for small size 2014-09-11 13:44:55 +02:00
wernsaar e0192a6914 bugfix in zgemv_n_4.c 2014-09-11 13:18:00 +02:00