OpenBLAS

Commit Graph

Author	SHA1	Message	Date
Martin Kroeker	18a11137f1	Update BLAS tests to correspond to Reference-LAPACK 3.9.0. Replaces calculation of machine precision with call to epsilon intrinsic and removes the requirement for previous output files to be removed before rerunning tests	2020-06-14 10:26:25 +02:00
Martin Kroeker	13c28889a2	Update "cosmetic fixes for non-C99 compilers"	2020-06-06 15:22:27 +02:00
Martin Kroeker	28915eed72	Cosmetic fixes for non-C99 compilers	2020-06-05 10:05:34 +02:00
Rajalakshmi Srinivasaraghavan	8efba9b7c0	Improve shgemm test This patch adds another check to test shgemm results.	2020-05-11 17:15:10 -05:00
Rajalakshmi Srinivasaraghavan	564b0d39ef	Add test for shgemm This patch has Makefile changes to add test for shgemm which compares sgemm and shgemm result.	2020-04-29 13:40:34 -05:00
Rajalakshmi Srinivasaraghavan	7eb55504b1	RFC : Add half precision gemm for bfloat16 in OpenBLAS This patch adds support for bfloat16 data type matrix multiplication kernel. For architectures that don't support bfloat16, it is defined as unsigned short (2 bytes). Default unroll sizes can be changed as per architecture as done for SGEMM and for now 8 and 4 are used for M and N. Size of ncopy/tcopy can be changed as per architecture requirement and for now, size 2 is used. Added shgemm in kernel/power/KERNEL.POWER9 and tested in powerpc64le and powerpc64. For reference, added a small test compare_sgemm_shgemm.c to compare sgemm and shgemm output. This patch does not cover OpenBLAS test, benchmark and lapack tests for shgemm. Complex type implementation can be discussed and added once this is approved.	2020-04-14 14:55:08 -05:00
Martin Kroeker	2d8781b0dc	Do not attempt to run test without fortran	2020-03-13 20:11:19 +01:00
luz.paz	daf2fec12d	Misc. typo fixes Found via `codespell -q 3 -w -L ith,als,dum,nd,amin,nto,wis,ba -S ./relapack,./kernel,./lapack-netlib`	2019-04-29 17:03:56 -04:00
Martin Kroeker	6a5ab083b7	Handle special case of gfortran+clang+OpenMP	2018-06-19 20:47:33 +02:00
Martin Kroeker	53026dc63a	Update single and double precision BLAS1 tests from LAPACK 3.8.0, adding tests for SROTMG, SROTM, SDSDOT, DROTMG, DROTM, DSDOT	2018-02-18 12:44:14 +01:00
Sacha Refshauge	4474465438	Remove _static usages for tests	2017-08-20 00:13:46 +10:00
Isuru Fernando	d245caa49a	Support out-of-source build	2017-08-01 15:16:14 +05:30
John Biddiscombe	053044ae4d	Replace CMAKE_SOURCE_DIR/CMAKE_BINARY_DIR with PROJECT_SOURCE_DIR/PROJECT_BINARY_DIR If OpenBLAS is built using add_subdirectory(OpenBlas) as part of another project then the paths set by CMAKE_XXX_DIR are relative to the parent project and not the OpenBLAS project.	2016-05-25 09:13:28 +02:00
Aleksey Kuleshov	3d50ccdc0d	allow building tests when CROSS compiling but don't run them	2016-04-26 12:36:47 +03:00
Zhang Xianyi	aca7d7e953	Detect cmake test result.	2015-10-20 03:35:25 +08:00
Zhang Xianyi	f8eba3d548	Fixed cmake build bugs on Linux.	2015-08-11 16:25:16 -05:00
wernsaar	9d7057366d	bugfix for GEMM3M functions	2014-09-21 11:41:43 +02:00
wernsaar	7f234f8ed1	added GEMM3M tests	2014-09-21 10:55:08 +02:00
wernsaar	d49fd33885	disabled SYMM3M and HEMM3M functions because segment violations	2014-09-20 15:27:40 +02:00
wernsaar	f0f9b25bb6	added test for CGEMM3M function	2014-09-20 14:53:30 +02:00
wernsaar	7a911569b8	added test for GEMM3M functions	2014-09-20 14:21:42 +02:00
Timothy Gu	6c2ead30f0	Remove all trailing whitespace except lapack-netlib Signed-off-by: Timothy Gu <timothygu99@gmail.com>	2014-06-27 12:05:18 -07:00
Sebastien Fabbro	9f0fb6e662	Respect user's LDFLAGS	2013-07-25 14:08:37 -07:00
grisuthedragon	c19a488af2	create openblas_get_parallel to retrieve information which parallelization model is used by OpenBLAS.	2013-07-11 21:39:19 +08:00
Xianyi Zhang	83ecfbb9b3	Merge branch 'loongson3a' into release-0.1.0	2012-03-23 01:26:27 +08:00
Xianyi Zhang	57658a8c14	ref #62 . Added the user friendly message with USE_OPENMP=1. The users should use OMP_NUM_THREADS. When OpenBLAS is compiled with USE_OPENMP=1, it ignores OPENBLAS_NUM_THREADS and GOTO_NUM_THREADS flags. Therefore, you should use OMP_NUM_THREADS. Without setting OMP_NUM_THREADS, a process will use maximal number of threads on a computing node. Thus, if there are 2 processes on the computing node, the thread will contend against other threads on CPU cores. As a result, the application will hang.	2011-10-09 15:14:48 +08:00
traz	64fa709d1f	Fixed #46 . Initialize variables in cblat3.f and zblat3.f.	2011-09-05 16:30:55 +00:00
Xianyi Zhang	066465af5b	Used the environment variable OPENBLAS_NUM_THREADS to set the number of threads in test.	2011-01-24 18:11:35 +00:00
Xianyi Zhang	342bbc3871	Import GotoBLAS2 1.13 BSD version codes.	2011-01-24 14:54:24 +00:00

29 Commits