OpenBLAS/kernel/x86_64
Wangyang Guo 7a2d1601ec sbgemm: cooperlake: unroll core loop by 2 2021-09-07 21:30:46 +08:00
KERNEL Remove premature entry for DOMATCOPY_RT 2021-03-18 21:53:50 +01:00
KERNEL.ATOM Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
KERNEL.BARCELONA Bugfix for ztrmv 2016-03-07 09:39:34 +01:00
KERNEL.BOBCAT Fixed #395. Enable optimized cgemm for Sandybridge. Added optimized sdot kernel. 2014-06-29 10:34:51 +08:00
KERNEL.BULLDOZER Add trivially optimized dsdot based on sdot 2017-11-24 20:02:28 +01:00
KERNEL.COOPERLAKE sbgemm: cooperlake: change kernel size to 16x4 2021-09-07 21:30:45 +08:00
KERNEL.CORE2 Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
KERNEL.DUNNINGTON Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
KERNEL.EXCAVATOR Add trivially optimized dsdot based on sdot 2017-11-24 20:03:40 +01:00
KERNEL.HASWELL Improve the performance of rot by using AVX512 and AVX2 intrinsic 2020-11-05 15:12:36 +08:00
KERNEL.NANO Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
KERNEL.NEHALEM Add trivially optimized dsdot based on sdot 2017-11-24 19:59:28 +01:00
KERNEL.OPTERON Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
KERNEL.OPTERON_SSE3 Fixed #395. Enable optimized cgemm for Sandybridge. Added optimized sdot kernel. 2014-06-29 10:34:51 +08:00
KERNEL.PENRYN Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
KERNEL.PILEDRIVER Add trivially optimized dsdot based on sdot 2017-11-24 20:04:29 +01:00
KERNEL.PRESCOTT fallback to zgemm_kernel_4x2_sse.S 2014-07-06 11:05:28 +02:00
KERNEL.SANDYBRIDGE Add trivially optimized dsdot based on sdot 2017-11-24 20:00:23 +01:00
KERNEL.SKYLAKEX Small Matrix: skylakex: remove unnecessary b0 source files 2021-08-13 03:28:44 +00:00
KERNEL.STEAMROLLER Add trivially optimized dsdot based on sdot 2017-11-24 20:01:42 +01:00
KERNEL.ZEN Enable optimized srot/drot kernels from Haswell 2021-02-11 09:23:05 +01:00
KERNEL.generic Add ?sum definitions for generic kernel 2019-03-31 13:55:49 +02:00
Makefile Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
amax.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
amax_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
amax_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
amax_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
asum.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
asum_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
asum_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
asum_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
axpy.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
axpy_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
axpy_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
axpy_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
bf16_common_macros.h Add all SBGEMM kernels for IA AVX512-BF16 based platforms 2021-08-05 11:11:29 +08:00
bf16to.c Add bfloat16 based dot and conversion with single/double 2020-09-04 02:31:25 +08:00
builtin_stinit.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
cabs.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
casum.c fix error declare function blas_level1_thread_with_return_value 2020-12-02 09:51:52 +08:00
casum_microk_skylakex-2.c Improve the performance of zasum and casum with AVX512 intrinsic 2020-12-01 16:49:26 +08:00
caxpy.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
caxpy_microk_bulldozer-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
caxpy_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
caxpy_microk_sandy-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
caxpy_microk_steamroller-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
cdot.c Drop redundant inclusion of complex.h 2021-05-14 15:06:44 +02:00
cdot_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
cdot_microk_haswell-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
cdot_microk_sandy-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
cdot_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
cgemm3m_kernel_8x4_haswell.c Update cgemm3m_kernel_8x4_haswell.c 2019-12-27 18:23:29 +08:00
cgemm_kernel_4x2_bulldozer.S bugfix for bulldozer cgemm-, zgemm- and zgemv-kernel 2014-06-28 12:16:20 +02:00
cgemm_kernel_4x2_piledriver.S bugfix for piledriver cgemm-, zgemm- and zgemv-kernel 2014-06-28 11:46:58 +02:00
cgemm_kernel_4x8_sandy.S Update organization info. 2014-11-25 15:28:58 +08:00
cgemm_kernel_8x2_haswell.S modification for clang compiler 2014-08-27 09:00:20 +02:00
cgemm_kernel_8x2_haswell.c Update cgemm_kernel_8x2_haswell.c 2020-02-27 22:26:15 +08:00
cgemm_kernel_8x2_sandy.S optimization of sandybridge cgemm-kernel 2014-07-29 19:07:21 +02:00
cgemm_kernel_8x2_skylakex.c AVX512 CGEMM & ZGEMM kernels 2019-11-11 20:04:52 +08:00
cgemv_n.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
cgemv_n_4.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
cgemv_n_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 2017-12-31 18:03:36 +01:00
cgemv_n_microk_haswell-4.c Tag %1 and %2 as both input and output 2017-12-29 23:56:41 +01:00
cgemv_t.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
cgemv_t_4.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
cgemv_t_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 2017-12-31 18:03:36 +01:00
cgemv_t_microk_haswell-4.c Tag %1 and %2 as both input and output 2017-12-29 23:56:41 +01:00
copy.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
copy_sse.S Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
copy_sse2.S Convert aligned moves to unaligned 2020-04-13 14:58:52 +02:00
cscal.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
cscal_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 2019-01-18 08:11:07 +01:00
cscal_microk_haswell-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 2019-01-18 08:11:07 +01:00
cscal_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 2019-01-18 08:11:07 +01:00
ctrsm_kernel_LN_bulldozer.c added optimized trsm_kernels 2016-01-05 13:05:05 +01:00
ctrsm_kernel_LT_bulldozer.c added optimized trsm_kernels 2016-01-05 13:05:05 +01:00
ctrsm_kernel_RN_bulldozer.c added optimized trsm_kernels 2016-01-05 13:05:05 +01:00
ctrsm_kernel_RT_bulldozer.c added optimized trsm_kernels 2016-01-05 13:05:05 +01:00
dasum.c Use Haswell optimizations for Zen as well 2021-02-11 09:26:15 +01:00
dasum_microk_haswell-2.c align to 64, using SSE when input size is small 2020-09-03 14:25:54 +08:00
dasum_microk_skylakex-2.c align to 64, using SSE when input size is small 2020-09-03 14:25:54 +08:00
daxpy.c Add double precision universal intrinsics for X86/ARM 2020-10-15 10:29:42 +08:00
daxpy_bulldozer.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
daxpy_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
daxpy_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
daxpy_microk_nehalem-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
daxpy_microk_piledriver-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
daxpy_microk_sandy-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
daxpy_microk_skylakex-2.c Add a AVX512 enabled SAXPY/DAXPY functions 2018-08-10 02:58:32 +00:00
daxpy_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
dcopy_bulldozer.S added dcopy_bulldozer.S 2013-06-21 16:06:51 +02:00
ddot.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
ddot_bulldozer.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ddot_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
ddot_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
ddot_microk_nehalem-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
ddot_microk_piledriver-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
ddot_microk_sandy-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
ddot_microk_skylakex-2.c Add an AVX512 enabled DDOT function 2018-08-09 03:55:52 +00:00
ddot_microk_steamroller-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
dgemm_beta_skylakex.c Fix thinko in skylake beta handling 2018-12-24 18:49:50 +00:00
dgemm_kernel_4x4_haswell.S small optimization on dgemm_kernel for N=1 2014-12-18 20:35:51 +01:00
dgemm_kernel_4x8_haswell.S Add files via upload 2019-07-28 07:39:09 +08:00
dgemm_kernel_4x8_sandy.S Change file comments to work around clang 3.9 assembler bug 2016-10-13 16:51:08 +02:00
dgemm_kernel_4x8_skylakex.c Use p2align instead of align for OSX compatibility 2018-12-03 13:06:43 +01:00
dgemm_kernel_4x8_skylakex_2.c Update dgemm_kernel_4x8_skylakex_2.c 2019-11-28 19:56:35 +08:00
dgemm_kernel_6x4_piledriver.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemm_kernel_8x2_bulldozer.S Ref #380: lowered stack usage for piledriver and bulldozer kernels 2014-06-19 14:02:14 +02:00
dgemm_kernel_8x2_piledriver.S Ref #380: lowered stack usage for piledriver and bulldozer kernels 2014-06-19 14:02:14 +02:00
dgemm_kernel_8x8_skylakex.c Update dgemm_kernel_8x8_skylakex.c 2019-10-18 15:00:17 +08:00
dgemm_kernel_16x2_haswell.S Refs #330. Fixed the compatible issue with clang on Mac OSX. 2013-12-16 20:31:17 +08:00
dgemm_kernel_16x2_skylakex.S Use AVX512 also for DGEMM 2018-06-03 22:17:27 +00:00
dgemm_kernel_16x2_skylakex.c GEMM: skylake: improve the performance when m is small 2021-04-28 13:56:06 +00:00
dgemm_ncopy_2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemm_ncopy_4.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemm_ncopy_8.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemm_ncopy_8_bulldozer.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemm_ncopy_8_skylakex.c Add vector optimizations for ncopy as well for dgemm/skylakex 2018-10-06 21:18:12 +00:00
dgemm_small_kernel_nn_skylakex.c Small Matrix: skylakex: fix build error in old compiler 2021-08-05 04:43:47 +00:00
dgemm_small_kernel_nt_skylakex.c Small Matrix: skylakex: fix build error in old compiler 2021-08-05 04:43:47 +00:00
dgemm_small_kernel_permit_skylakex.c Small Matrix: skylakex: add DGEMM_SMALL_M_PERMIT and tune for TN kernel 2021-08-02 07:06:54 +00:00
dgemm_small_kernel_tn_skylakex.c Small Matrix: skylakex: fix build error in old compiler 2021-08-05 04:43:47 +00:00
dgemm_small_kernel_tt_skylakex.c Small Matrix: skylakex: fix build error in old compiler 2021-08-05 04:43:47 +00:00
dgemm_tcopy_2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemm_tcopy_4.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemm_tcopy_8.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemm_tcopy_8_bulldozer.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemm_tcopy_8_skylakex.c Add optimized *copy versions for skylakex 2018-10-06 13:51:44 +00:00
dgemm_tcopy_16_skylakex.c Fix build with -Werror=return-type 2020-10-21 08:43:39 +02:00
dgemv_n.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemv_n_4.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
dgemv_n_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemv_n_bulldozer.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemv_n_microk_haswell-4.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
dgemv_n_microk_nehalem-4.c Replace .align with .p2align in the Nehalem microkernels 2018-02-26 20:58:33 +01:00
dgemv_n_microk_piledriver-4.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
dgemv_n_microk_skylakex-4.c Add an AVX512 enabled DGEMV (n) function 2018-08-11 17:38:12 +00:00
dgemv_t.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemv_t_4.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
dgemv_t_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemv_t_bulldozer.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dgemv_t_microk_haswell-4.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
dger.c optimized dger kernel for sandybridge 2015-04-28 16:58:11 +02:00
dger_microk_sandy-2.c Fix declaration of input arguments in the Sandybridge GER microkernels (#1967) 2019-01-18 08:11:39 +01:00
dot.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
dot_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dot_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dot_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
drot.c Use Haswell optimizations for Zen as well 2021-02-11 09:24:16 +01:00
drot_microk_haswell-2.c replace spurious avx512 requirement with fma check 2021-04-26 21:55:30 +02:00
drot_microk_skylakex-2.c Improve the performance of rot by using AVX512 and AVX2 intrinsic 2020-11-05 15:12:36 +08:00
dscal.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
dscal_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 2019-01-18 08:11:07 +01:00
dscal_microk_haswell-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 2019-01-18 08:11:07 +01:00
dscal_microk_sandy-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 2019-01-18 08:11:07 +01:00
dscal_microk_skylakex-2.c Add an AVX512 enabled DSCAL function 2018-08-11 17:14:57 +00:00
dsymv_L.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
dsymv_L_microk_bulldozer-2.c Fix declaration of arguments in inline assembly 2019-02-12 16:14:02 +01:00
dsymv_L_microk_haswell-2.c Fix declaration of arguments in inline assembly 2019-02-12 16:14:02 +01:00
dsymv_L_microk_nehalem-2.c Fix declaration of arguments in inline assembly 2019-02-12 16:14:02 +01:00
dsymv_L_microk_sandy-2.c Fix declaration of arguments in inline assembly 2019-02-12 16:14:02 +01:00
dsymv_L_microk_skylakex-2.c Duplicate earlier Clang 9.0.0 workaround for corresponding Apple Clang version 2020-05-05 10:44:50 +02:00
dsymv_U.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
dsymv_U_microk_bulldozer-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 2019-02-12 16:00:18 +01:00
dsymv_U_microk_haswell-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 2019-02-12 16:00:18 +01:00
dsymv_U_microk_nehalem-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 2019-02-12 16:00:18 +01:00
dsymv_U_microk_sandy-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 2019-02-12 16:00:18 +01:00
dtobf16_microk_cooperlake.c Add bfloat16 based dot and conversion with single/double 2020-09-04 02:31:25 +08:00
dtrmm_kernel_4x8_haswell.c Replace vpermpd with vpermilpd in the Haswell DTRMM kernel 2019-07-28 23:17:28 +02:00
dtrsm_kernel_LN_bulldozer.c Remove unused variables from Haswell dtrmm and Bulldozer dtrsm 2017-11-14 23:35:10 +01:00
dtrsm_kernel_LT_8x2_bulldozer.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dtrsm_kernel_RN_8x2_bulldozer.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
dtrsm_kernel_RN_haswell.c Replace most vpermpd calls in the Haswell DTRSM_RN kernel 2019-08-03 12:40:13 +02:00
dtrsm_kernel_RT_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 2019-02-16 20:06:48 +01:00
gemm_beta.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_kernel_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_kernel_4x2_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_kernel_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_kernel_4x4_core2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_kernel_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_kernel_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_kernel_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_kernel_4x8_nano.S Fix crash in sgemm SSE/nano kernel on x86_64 2019-03-07 16:55:13 +01:00
gemm_kernel_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_kernel_8x4_barcelona.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_kernel_8x4_core2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_kernel_8x4_penryn.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_kernel_8x4_sse.S Fix crash in sgemm SSE/nano kernel on x86_64 2019-03-07 16:55:13 +01:00
gemm_kernel_8x4_sse3.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_ncopy_2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_ncopy_2_bulldozer.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_ncopy_4.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_ncopy_4_opteron.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_tcopy_2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_tcopy_2_bulldozer.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_tcopy_4.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
gemm_tcopy_4_opteron.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
iamax.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
iamax_sse.S Silence a redefinition warning 2020-10-15 19:08:12 +02:00
iamax_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
izamax.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
izamax_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
izamax_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
lsame.S Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
mcount.S Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
nrm2.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
nrm2_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
omatcopy_rt.c Update omatcopy_rt.c 2021-02-24 09:34:14 +01:00
qconjg.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
qdot.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
qgemm_kernel_2x2.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
qgemv_n.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
qgemv_t.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
qtrsm_kernel_LN_2x2.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
qtrsm_kernel_LT_2x2.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
qtrsm_kernel_RT_2x2.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
rot.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
rot_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
rot_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
sasum.c Use Haswell optimizations for Zen as well 2021-02-11 09:25:36 +01:00
sasum_microk_haswell-2.c align to 64, using SSE when input size is small 2020-09-03 14:25:54 +08:00
sasum_microk_skylakex-2.c align to 64, using SSE when input size is small 2020-09-03 14:25:54 +08:00
saxpy.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
saxpy_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
saxpy_microk_nehalem-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
saxpy_microk_piledriver-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
saxpy_microk_sandy-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
saxpy_microk_skylakex-2.c Add a AVX512 enabled SAXPY/DAXPY functions 2018-08-10 02:58:32 +00:00
sbdot.c Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:42:07 +02:00
sbdot_microk_cooperlake.c Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:42:07 +02:00
sbgemm_block_microk_cooperlake.c sbgemm: cooperlake: enable SBGEMM by small matrix path 2021-08-30 17:40:30 +08:00
sbgemm_kernel_16x4_cooperlake.c sbgemm: cooperlake: unroll core loop by 2 2021-09-07 21:30:46 +08:00
sbgemm_microk_cooperlake_template.c sbgemm: cooperlake: enable SBGEMM by small matrix path 2021-08-30 17:40:30 +08:00
sbgemm_ncopy_4_cooperlake.c sbgemm: cooperlake: kernel works for NN 2021-09-07 21:30:45 +08:00
sbgemm_ncopy_16_cooperlake.c sbgemm: cooperlake: change kernel size to 16x4 2021-09-07 21:30:45 +08:00
sbgemm_small_kernel_nn_cooperlake.c sbgemm: cooperlake: enable SBGEMM by small matrix path 2021-08-30 17:40:30 +08:00
sbgemm_small_kernel_nt_cooperlake.c sbgemm: cooperlake: enable SBGEMM by small matrix path 2021-08-30 17:40:30 +08:00
sbgemm_small_kernel_permit_cooperlake.c sbgemm: cooperlake: enable SBGEMM by small matrix path 2021-08-30 17:40:30 +08:00
sbgemm_small_kernel_template_cooperlake.c sbgemm: cooperlake: make sure hot buffer aligned to 64 2021-08-30 17:40:30 +08:00
sbgemm_small_kernel_tn_cooperlake.c sbgemm: cooperlake: enable SBGEMM by small matrix path 2021-08-30 17:40:30 +08:00
sbgemm_small_kernel_tt_cooperlake.c sbgemm: cooperlake: enable SBGEMM by small matrix path 2021-08-30 17:40:30 +08:00
sbgemm_tcopy_4_cooperlake.c sbgemm: cooperlake: change kernel size to 16x4 2021-09-07 21:30:45 +08:00
sbgemm_tcopy_16_cooperlake.c sbgemm: cooperlake: kernel works for NN 2021-09-07 21:30:45 +08:00
sbgemv_n.c Implementation of BF16 based gemv 2020-10-29 02:08:23 +08:00
sbgemv_n_microk_cooperlake.c Implementation of BF16 based gemv 2020-10-29 02:08:23 +08:00
sbgemv_n_microk_cooperlake_template.c Implementation of BF16 based gemv 2020-10-29 02:08:23 +08:00
sbgemv_t.c Implementation of BF16 based gemv 2020-10-29 02:08:23 +08:00
sbgemv_t_microk_cooperlake.c Implementation of BF16 based gemv 2020-10-29 02:08:23 +08:00
sbgemv_t_microk_cooperlake_template.c Implementation of BF16 based gemv 2020-10-29 02:08:23 +08:00
scal.S Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
scal_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
scal_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
scal_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
sdot.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
sdot_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
sdot_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
sdot_microk_nehalem-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
sdot_microk_sandy-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
sdot_microk_skylakex-2.c Fix typo in sdot function 2018-08-11 17:16:45 +00:00
sdot_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
sgemm_beta_skylakex.c sbgemm: cooperlake: add dummy source files 2021-09-07 21:30:45 +08:00
sgemm_direct_performant.c [WIP] Refactor the driver code for direct SGEMM (#2782) 2020-08-19 14:51:09 +02:00
sgemm_direct_skylakex.c Move common.h back to the top of the file so that SKYLAKEX (from config.h) is defined in time 2021-03-18 21:28:19 +01:00
sgemm_kernel_8x4_bulldozer.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
sgemm_kernel_8x4_haswell.c Update sgemm_kernel_8x4_haswell.c 2020-02-06 01:47:46 +00:00
sgemm_kernel_8x4_haswell_2.c Strip UTF8 byte order marker from source 2020-06-26 09:00:43 +02:00
sgemm_kernel_8x8_sandy.S Update organization info. 2014-11-25 15:28:58 +08:00
sgemm_kernel_16x2_bulldozer.S Ref #380: lowered stack usage for piledriver and bulldozer kernels 2014-06-19 14:02:14 +02:00
sgemm_kernel_16x2_piledriver.S Ref #380: lowered stack usage for piledriver and bulldozer kernels 2014-06-19 14:02:14 +02:00
sgemm_kernel_16x4_haswell.S modification for clang compiler 2014-08-27 09:00:20 +02:00
sgemm_kernel_16x4_sandy.S Refs #535. Fix the wrong vector instruction in sgemm sandy bridge kernel. 2015-04-08 03:55:49 +08:00
sgemm_kernel_16x4_skylakex.S Use AVX512 also for DGEMM 2018-06-03 22:17:27 +00:00
sgemm_kernel_16x4_skylakex.c make skylakex sgemm code more friendly for readers 2020-01-13 16:28:41 +08:00
sgemm_kernel_16x4_skylakex_2.c AVX512 STRMM kernel 2020-02-16 22:58:00 +08:00
sgemm_kernel_16x4_skylakex_3.c Use "old" compute(24) function with clang due to register limitations 2021-04-06 19:58:32 +02:00
sgemm_ncopy_4_skylakex.c Use sgemm_ncopy_4_skylakex.c also for Haswell 2018-12-15 13:49:19 +00:00
sgemm_small_kernel_nn_skylakex.c Small Matrix: skylakex: sgemm nn: fix n6 conflicts with n4 2021-08-02 07:06:54 +00:00
sgemm_small_kernel_nt_skylakex.c Small Matrix: skylakex: fix build error in old compiler 2021-08-05 04:43:47 +00:00
sgemm_small_kernel_permit_skylakex.c Small Matrix: skylakex: add sgemm tt kernel 2021-08-02 07:06:54 +00:00
sgemm_small_kernel_tn_skylakex.c Small Matrix: skylakex: add sgemm tn kernel 2021-08-02 07:06:54 +00:00
sgemm_small_kernel_tt_skylakex.c Small Matrix: skylakex: fix build error in old compiler 2021-08-05 04:43:47 +00:00
sgemm_tcopy_16_skylakex.c Add a C+intrinsics version of the SGEMM/skylakex kernel 2018-10-10 01:49:22 +00:00
sgemv_n.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
sgemv_n.c removed obsolete gemv kernel files 2014-09-14 11:00:53 +02:00
sgemv_n_4.c sgemv: skylakex: fix build warning 2021-08-25 07:13:00 +00:00
sgemv_n_microk_bulldozer-4.c Fix inline assembly constraints 2019-02-16 18:46:17 +01:00
sgemv_n_microk_haswell-4.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
sgemv_n_microk_nehalem-4.c Fix inline assembly constraints 2019-02-16 18:24:11 +01:00
sgemv_n_microk_sandy-4.c Fix inline assembly constraints 2019-02-16 18:36:39 +01:00
sgemv_n_microk_skylakex-8.c optimize on sgemv_n for small n 2021-04-30 12:14:58 -04:00
sgemv_t.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
sgemv_t.c removed obsolete gemv kernel files 2014-09-14 11:00:53 +02:00
sgemv_t_4.c sgemv: skylakex: bug fix for sgemv_t kernel in corner case 2021-08-25 07:07:27 +00:00
sgemv_t_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 2017-12-31 18:03:36 +01:00
sgemv_t_microk_haswell-4.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
sgemv_t_microk_nehalem-4.c Replace .align with .p2align in the Nehalem microkernels 2018-02-26 20:58:33 +01:00
sgemv_t_microk_sandy-4.c Use .p2align instead of .align for compatibility on Sandybridge as well 2018-02-24 19:43:15 +01:00
sgemv_t_microk_skylakex.c Optimized sgemv_t for small N based on AVX512 2021-06-08 15:08:28 -04:00
sgemv_t_microk_skylakex_template.c sgemv: skylakex: fix build warning 2021-08-25 07:13:00 +00:00
sger.c added optimized sger kernel for sandybridge 2015-04-28 15:33:38 +02:00
sger_microk_sandy-2.c Fix declaration of input arguments in the Sandybridge GER microkernels (#1967) 2019-01-18 08:11:39 +01:00
srot.c Use Haswell optimizations for Zen as well 2021-02-11 09:24:51 +01:00
srot_microk_haswell-2.c Remove spurious AVX512 requirement and add AVX2/FMA3 guard 2021-03-06 14:35:49 +01:00
srot_microk_skylakex-2.c Improve the performance of rot by using AVX512 and AVX2 intrinsic 2020-11-05 15:12:36 +08:00
ssymv_L.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
ssymv_L_microk_bulldozer-2.c Fix declaration of arguments in inline assembly 2019-02-12 16:14:02 +01:00
ssymv_L_microk_haswell-2.c Fix declaration of arguments in inline assembly 2019-02-12 16:14:02 +01:00
ssymv_L_microk_nehalem-2.c Fix declaration of arguments in inline assembly 2019-02-12 16:14:02 +01:00
ssymv_L_microk_sandy-2.c Fix declaration of arguments in inline assembly 2019-02-12 16:14:02 +01:00
ssymv_U.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
ssymv_U_microk_bulldozer-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 2019-02-12 16:00:18 +01:00
ssymv_U_microk_haswell-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 2019-02-12 16:00:18 +01:00
ssymv_U_microk_nehalem-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 2019-02-12 16:00:18 +01:00
ssymv_U_microk_sandy-2.c Fix declaration of assembly arguments in SSYMV and DSYMV microkernels 2019-02-12 16:00:18 +01:00
staticbuffer.S Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
stobf16_microk_cooperlake.c Add bfloat16 based dot and conversion with single/double 2020-09-04 02:31:25 +08:00
strsm_kernel_8x4_haswell_LN.c Strip UTF8 byte order marker from source 2020-06-26 09:00:43 +02:00
strsm_kernel_8x4_haswell_LT.c AVX2 STRSM kernel 2020-03-17 00:34:08 +08:00
strsm_kernel_8x4_haswell_L_common.h Strip UTF8 byte order marker from source 2020-06-26 09:00:43 +02:00
strsm_kernel_8x4_haswell_RN.c AVX2 STRSM kernel 2020-03-17 00:34:08 +08:00
strsm_kernel_8x4_haswell_RT.c AVX2 STRSM kernel 2020-03-17 00:34:08 +08:00
strsm_kernel_8x4_haswell_R_common.h AVX2 STRSM kernel 2020-03-17 00:34:08 +08:00
strsm_kernel_LN_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 2019-02-16 20:06:48 +01:00
strsm_kernel_LT_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 2019-02-16 20:06:48 +01:00
strsm_kernel_RN_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 2019-02-16 20:06:48 +01:00
strsm_kernel_RT_bulldozer.c Fix inline assembly constraints in Bulldozer TRSM kernels 2019-02-16 20:06:48 +01:00
sum.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
swap.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
swap_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
swap_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
symv_L_sse.S Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
symv_L_sse2.S Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
symv_U_sse.S Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
symv_U_sse2.S Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
tobf16.c Add bfloat16 based dot and conversion with single/double 2020-09-04 02:31:25 +08:00
trsm_kernel_LN_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LN_4x2_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LN_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LN_4x4_core2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LN_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LN_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LN_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LN_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LN_8x4_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LT_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LT_4x2_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LT_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LT_4x4_core2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LT_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LT_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LT_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LT_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_LT_8x4_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_RT_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_RT_4x2_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_RT_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_RT_4x4_core2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_RT_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_RT_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_RT_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_RT_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
trsm_kernel_RT_8x4_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
xdot.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
xgemm3m_kernel_2x2.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
xgemm_kernel_1x1.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
xgemv_n.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
xgemv_t.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
xtrsm_kernel_LT_1x1.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
zamax.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
zamax_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zamax_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zamax_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zasum.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
zasum.c fix error declare function blas_level1_thread_with_return_value 2020-12-02 09:51:52 +08:00
zasum_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zasum_microk_skylakex-2.c Improve the performance of zasum and casum with AVX512 intrinsic 2020-12-01 16:49:26 +08:00
zasum_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zasum_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zaxpy.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zaxpy.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
zaxpy_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zaxpy_microk_bulldozer-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
zaxpy_microk_haswell-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
zaxpy_microk_sandy-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
zaxpy_microk_steamroller-2.c x86_64: clobber all xmm registers after vzeroupper 2020-10-20 02:16:47 +00:00
zaxpy_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zaxpy_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zcopy.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zcopy_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zcopy_sse2.S Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
zdot.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
zdot.c Fix mssing dummy parameter (imag part of alpha) of zdot_thread_function 2020-08-23 15:08:16 +02:00
zdot_atom.S Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
zdot_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
zdot_microk_haswell-2.c Replace vpermpd with vpermilpd 2019-07-22 08:28:16 +02:00
zdot_microk_sandy-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
zdot_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 microkernels for DOT and AXPY (#1965) 2019-01-17 23:20:32 +01:00
zdot_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zdot_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm3m_kernel_2x8_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm3m_kernel_4x2_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm3m_kernel_4x4_barcelona.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm3m_kernel_4x4_core2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm3m_kernel_4x4_haswell.c Update zgemm3m_kernel_4x4_haswell.c 2019-12-30 17:33:42 +08:00
zgemm3m_kernel_4x4_penryn.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm3m_kernel_4x4_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm3m_kernel_4x4_sse3.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm3m_kernel_4x8_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm3m_kernel_8x4_barcelona.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm3m_kernel_8x4_core2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm3m_kernel_8x4_penryn.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm3m_kernel_8x4_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm3m_kernel_8x4_sse3.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_beta.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_kernel_1x4_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_kernel_2x1_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_kernel_2x2_barcelona.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_kernel_2x2_bulldozer.S bugfix for bulldozer cgemm-, zgemm- and zgemv-kernel 2014-06-28 12:16:20 +02:00
zgemm_kernel_2x2_core2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_kernel_2x2_penryn.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_kernel_2x2_piledriver.S bugfix for piledriver cgemm-, zgemm- and zgemv-kernel 2014-06-28 11:46:58 +02:00
zgemm_kernel_2x2_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_kernel_2x2_sse3.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_kernel_2x4_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_kernel_4x2_barcelona.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_kernel_4x2_core2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_kernel_4x2_haswell.S modification for clang compiler 2014-08-27 09:00:20 +02:00
zgemm_kernel_4x2_haswell.c Update zgemm_kernel_4x2_haswell.c 2020-02-27 22:25:19 +08:00
zgemm_kernel_4x2_penryn.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_kernel_4x2_skylakex.c AVX512 CGEMM & ZGEMM kernels 2019-11-11 20:04:52 +08:00
zgemm_kernel_4x2_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_kernel_4x2_sse3.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_kernel_4x4_sandy.S Update organization info. 2014-11-25 15:28:58 +08:00
zgemm_ncopy_1.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_ncopy_2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_tcopy_1.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemm_tcopy_2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemv_n.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemv_n_4.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
zgemv_n_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemv_n_dup.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemv_n_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 2017-12-31 18:03:36 +01:00
zgemv_n_microk_haswell-4.c Tag %1 and %2 as both input and output 2017-12-29 23:56:41 +01:00
zgemv_n_microk_sandy-4.c Use .p2align instead of .align for compatibility on Sandybridge as well 2018-02-24 19:43:15 +01:00
zgemv_t.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemv_t_4.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
zgemv_t_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemv_t_dup.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zgemv_t_microk_bulldozer-4.c Tag %1 and %2 as both input and output operands 2017-12-31 18:03:36 +01:00
zgemv_t_microk_haswell-4.c Tag %1 and %2 as both input and output 2017-12-29 23:56:41 +01:00
znrm2.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
znrm2_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zrot.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zrot_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zrot_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zscal.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
zscal.c Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
zscal_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zscal_microk_bulldozer-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 2019-01-18 08:11:07 +01:00
zscal_microk_haswell-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 2019-01-18 08:11:07 +01:00
zscal_microk_steamroller-2.c Fix declaration of input arguments in the x86_64 SCAL microkernels (#1966) 2019-01-18 08:11:07 +01:00
zscal_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zscal_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zsum.S use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
zswap.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zswap_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
zswap_sse2.S Import GotoBLAS2 1.13 BSD version codes. 2011-01-24 14:54:24 +00:00
zsymv_L_sse.S Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
zsymv_L_sse2.S Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
zsymv_U_sse.S Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
zsymv_U_sse2.S Enable COOPERLAKE build target 2020-08-13 06:18:00 +08:00
ztrsm_kernel_LN_2x1_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LN_2x2_core2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LN_2x2_penryn.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LN_2x2_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LN_2x2_sse3.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LN_2x4_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LN_4x2_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LN_bulldozer.c added optimized trsm_kernels 2016-01-05 13:05:05 +01:00
ztrsm_kernel_LT_1x4_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LT_2x1_atom.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LT_2x2_core2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LT_2x2_penryn.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LT_2x2_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LT_2x2_sse3.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LT_2x4_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LT_4x2_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_LT_bulldozer.c added optimized trsm_kernels 2016-01-05 13:05:05 +01:00
ztrsm_kernel_RN_bulldozer.c added optimized trsm_kernels 2016-01-05 13:05:05 +01:00
ztrsm_kernel_RT_1x4_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_RT_2x2_core2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_RT_2x2_penryn.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_RT_2x2_sse2.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_RT_2x2_sse3.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_RT_2x4_nehalem.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_RT_4x2_sse.S Remove all trailing whitespace except lapack-netlib 2014-06-27 12:05:18 -07:00
ztrsm_kernel_RT_bulldozer.c added optimized trsm_kernels 2016-01-05 13:05:05 +01:00