Zhiyong Dang
3716267124
Change _STDC_VERSION__ to __STDC_VERSION__
...
Change-Id: Id3fa4e8d9eedd4ef7230df69b611e7f397301a42
2018-05-11 12:15:08 +08:00
Martin Kroeker
6a99fcce94
Use _Atomic instead of volatile for thread safety where C11 is supported
...
Suggested by dodomorandi in #660
2018-03-10 00:03:49 +01:00
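
Note on the two commits above: a minimal sketch of how a shared flag might be qualified with `_Atomic` when the compiler reports C11 support through `__STDC_VERSION__`, falling back to `volatile` otherwise. The macro `SYNC_QUALIFIER`, the struct `job_flag_t`, and the helper functions are hypothetical names for illustration only and are not taken from the OpenBLAS sources. A misspelled `_STDC_VERSION__` is never defined by the compiler, so a guard using it would always take the fallback branch.

```c
/* Sketch only: SYNC_QUALIFIER, job_flag_t and the helpers below are
 * hypothetical names, not identifiers from OpenBLAS. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && \
    !defined(__STDC_NO_ATOMICS__)
#define SYNC_QUALIFIER _Atomic   /* C11: accesses to the flag are atomic */
#define USING_C11_ATOMICS 1
#else
#define SYNC_QUALIFIER volatile  /* pre-C11 fallback: no atomicity guarantee */
#define USING_C11_ATOMICS 0
#endif

typedef struct {
    SYNC_QUALIFIER int working;  /* flag shared between worker threads */
} job_flag_t;

/* Store a new value into the flag. */
void flag_set(job_flag_t *f, int value) {
    f->working = value;
}

/* Read the current value of the flag. */
int flag_get(job_flag_t *f) {
    return f->working;
}

/* Report which branch of the guard was compiled in. */
int using_c11_atomics(void) {
    return USING_C11_ATOMICS;
}
```

`volatile` alone only prevents the compiler from caching or eliding accesses; it gives no atomicity or inter-thread ordering, which is why the `_Atomic` branch is preferred whenever it is available.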
Andrew
11a627c54e
remove surplus parentheses to silence clang5
2018-01-01 20:56:26 +01:00
Andrew
bfc2a88594
remove unused buffer
2017-12-22 00:55:40 +01:00
Andrew
ef95cd471f
eliminate unread variable, after reiteration 3 of them (clang4)
2017-11-25 02:54:37 +01:00
Martin Kroeker
db72ad8f6a
Merge pull request #1320 from timmoon10/develop
...
2D thread distribution for multi-threaded GEMMs
2017-10-08 23:31:33 +02:00
Martin Kroeker
514d237257
Merge pull request #1279 from xsacha/develop
...
CMake improvements
2017-10-06 21:13:45 +02:00
Tim Moon
30486a356c
Reduce number of data partitions in n.
2017-10-04 12:37:49 -07:00
Tim Moon
9de52b489a
Cleaning up and documenting multi-threaded GEMM code.
2017-10-03 16:32:08 -07:00
Tim Moon
860dcfc703
Use 2D thread distribution for small GEMMs.
...
Allows maximum use of available cores if one of M and N is small and the other is large.
2017-10-03 13:43:39 -07:00
Tim Moon
6aaa107865
Reducing threads for multi-threaded GEMMs on small matrices.
2017-09-27 19:25:33 -07:00
Sacha Refshauge
37858d1146
Fix threading usage in CMake: s/SMP/USE_THREAD/
2017-08-19 15:07:42 +10:00
Isuru Fernando
d245caa49a
Support out-of-source build
2017-08-01 15:16:14 +05:30
Martin Kroeker
49e62c0e77
fixed syrk_thread.c taken from wernsaar
...
Stride calculation fix copied from https://github.com/wernsaar/OpenBLAS/commit/88900e1
2017-07-06 17:30:12 +02:00
Werner Saar
a2672d5589
prepared driver/level3 functions for UNROLL values that are not a power of two
2017-01-09 10:38:15 +01:00
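
Note on the commit above: a hedged illustration of why unroll values that are not a power of two need different arithmetic. Rounding a length up to a multiple of the unroll factor with a bit mask is only correct for powers of two, while the divide-and-multiply form works for any positive factor. The function names are hypothetical, and this sketch does not claim to reproduce the actual change in the commit.

```c
/* Sketch only: both function names are hypothetical. */

/* Round x up to a multiple of unroll using a bit mask.
 * Correct only when unroll is a power of two. */
long round_up_pow2(long x, long unroll) {
    return (x + unroll - 1) & ~(unroll - 1);
}

/* Round x up to a multiple of unroll using integer division.
 * Correct for any unroll > 0 and x >= 0. */
long round_up_any(long x, long unroll) {
    return ((x + unroll - 1) / unroll) * unroll;
}
```

With an unroll factor of 8 both forms agree, but with a factor of 6 the mask-based form returns a value that is not a multiple of 6.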
John Biddiscombe
053044ae4d
Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR
...
If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project
then the paths set by CMAKE_XXX_DIR are relative to the parent project
and not the OpenBLAS project.
2016-05-25 09:13:28 +02:00
Zhang Xianyi
d06b92906a
Add gemm3m building for CMake.
2016-02-12 05:02:51 +08:00
Werner Saar
b07d733a71
added updates for syrk and syr2k
2016-01-21 13:16:44 +01:00
Zhang Xianyi
055b481386
Fixed CMake bug for single core.
2016-01-15 06:42:54 +08:00
Ralph Campbell
fbc21266e6
Minor C code fixes in driver/
2015-11-09 14:15:49 +05:30
Zhang Xianyi
d8392c1245
Fixed CMake config bugs.
2015-10-20 04:30:55 +08:00
Zhang Xianyi
f874465bb8
Use cmake to build OpenBLAS GENERIC Target on MSVC x86 64-bit.
...
Disable CBLAS and LAPACK.
2015-08-10 14:10:44 -05:00
Hank Anderson
9eaea02f33
Added additional gemm defines for complex types.
2015-02-25 09:39:11 -06:00
Hank Anderson
0d8e227ea7
Changed strategy for setting preprocessor definitions.
...
Instead of generating separate object files for each permutation of
defines for a source file, GenerateNamedObjects now writes an entirely
new source file and inserts the defines as #define c statements.
This solves a problem I ran into with ar.exe where it was refusing to
link objects that had the same filename despite having different paths.
2015-02-24 12:26:33 -06:00
Hank Anderson
371071d461
Added CONJ defines for trmm/trsm.
2015-02-21 10:59:02 -06:00
Hank Anderson
8a143516e3
Added alternate_name to a couple of the name mangling schemes.
...
Added zherk_k sources to driver/level3.
2015-02-20 17:03:33 -06:00
Hank Anderson
e5897ecb9b
Added zherk_kernel.c objects to driver/level3.
2015-02-19 16:19:56 -06:00
Hank Anderson
4662a0b13a
Changed generate functions to iterate through a list of float types.
...
This will generate obj files for SINGLE/DOUBLE/COMPLEX/DOUBLE COMPLEX.
2015-02-15 17:44:37 -06:00
Hank Anderson
e74462a3f5
Moved declarations to start of functions to satisfy MSVC C89 implementation.
2015-02-11 11:16:57 -06:00
Hank Anderson
056ba26755
Changed a number of inline calls to use __inline.
...
MSVC doesn't implement C99, so it can't use the inline keyword. __inline
appears to work in both MSVC and GCC.
2015-02-11 11:13:17 -06:00
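
Note on the commit above: a small sketch of one way to spell an inline helper so that it compiles with a C89-only MSVC as well as with GCC. The macro `PORTABLE_INLINE` and the function `min_long` are hypothetical names for this example; the commit itself is described as switching calls to `__inline` directly.

```c
/* Sketch only: PORTABLE_INLINE and min_long are hypothetical names. */
#if defined(_MSC_VER) || defined(__GNUC__)
/* MSVC and GCC-compatible compilers accept __inline, even in C89 mode. */
#define PORTABLE_INLINE static __inline
#else
/* Other compilers: assume C99 or later and use the standard keyword. */
#define PORTABLE_INLINE static inline
#endif

/* Return the smaller of two values. */
PORTABLE_INLINE long min_long(long a, long b) {
    return a < b ? a : b;
}
```

Keeping the keyword behind a single macro means only one line needs to change if another compiler requires a different spelling.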
Hank Anderson
e8c39138c6
Removed return value from GenerateNamedObjects.
...
It sets DBLAS_OBJS directly to save a bunch of list appending in the
CMakeLists.txt files.
2015-02-09 12:28:09 -06:00
Hank Anderson
627d5e7401
Added SMP objects to driver/level3.
2015-02-05 12:22:48 -06:00
Hank Anderson
943fa2fb58
Fixed object names in level2.
2015-02-05 10:49:11 -06:00
Hank Anderson
461e691127
Codes when define is absent are now a parameter to AllCombinations.
...
The level3 object names should now be correct.
2015-02-05 09:23:47 -06:00
Hank Anderson
cfaf1c678f
Added option to append define codes with an underscore.
...
Fixed the code array not getting reset on subsequent AllCombinations
calls.
2015-02-05 09:17:18 -06:00
Hank Anderson
0d7bad1f35
Changed GenerateObjects to append combination codes (e.g. dtrmm_TU).
2015-02-05 09:02:54 -06:00
Hank Anderson
d11bde60d0
DOUBLE define for DBLAS objects is now set in main CMakeLists.txt.
...
Since the objects are the same, could generate SINGLE/COMPLEX/etc here
without having to rewrite all the object enumeration code again.
2015-02-02 15:00:44 -06:00
Hank Anderson
5057a4b4df
Added openblas add_library call that uses DBLAS_OBJS objects.
2015-01-30 15:21:21 -06:00
Hank Anderson
d3dcdddf75
Moved functions into util cmake file.
2015-01-30 13:47:40 -06:00
Hank Anderson
e5e7595bf9
Added parameter to GenerateObjects for defines that affect all sources.
2015-01-30 13:31:13 -06:00
Hank Anderson
7693887d61
Added empty set to the combinations generated by AllCombinations.
2015-01-30 13:01:11 -06:00
Hank Anderson
8d9b196e0d
Moved loop over define combos into a function.
...
This function takes a set of sources and a set of preprocessor
definitions. It will iterate over the sources and build an object
file for each combination of preprocessor definitions for each
source file.
2015-01-30 12:14:44 -06:00
Hank Anderson
a6cf8aafc0
Updated level3/CMakeLists with correct defines using all combos.
2015-01-30 11:21:50 -06:00
Hank Anderson
dbdca7bf0c
Added first pass at driver/level3 Makefile conversion.
...
Added a rather convoluted CMake function to find all combinations
of a given list. This will be useful for the object files that are
compiled multiple times with different combinations of preprocessor
definitions.
2015-01-29 22:53:11 -06:00
wernsaar
7aae4a62e7
enabled use of GEMM3M functions
2014-09-20 14:27:10 +02:00
wernsaar
1d33547222
optimized zgemm kernel for haswell
2014-07-27 11:51:42 +02:00
wernsaar
3ea4dadd30
optimizations for trsm
2014-07-25 11:59:17 +02:00
wernsaar
1b10ff129a
optimizations for trmm
2014-07-25 10:00:23 +02:00
wernsaar
125610d23b
allow setting a custom value for ?GEMM_DEFAULT_UNROLL_MN, optimizations for syrk
2014-07-24 18:43:31 +02:00
wernsaar
be94db096c
disabled *3M functions for x86_64 platforms
2014-07-01 16:18:05 +02:00