Commit Graph

5857 Commits

Author SHA1 Message Date
Ali Saidi 97ce6bbce2 Fix barriers in level3_thread 2020-02-29 17:45:17 +00:00
Martin Kroeker a9aeb6745c
Merge pull request #2465 from AGSaidi/neoverse-n1
Add Neoverse-N1 core
2020-02-29 13:24:44 +01:00
Ali Saidi 0af9991cc9 Use wait-for-event to not spin in the blas_lock 2020-02-29 04:23:48 +00:00
Ali Saidi 19f3a4091c Make rpcc() on arm64 get closer to what x86 returns
The Arm implementation of rpcc() uses the architected timer
which is defined by the SBSA to be between 10-400MHz. These numbers
are much smaller than the cycle counter frequency used by x86. Make
the numbers closer by shifting the cycle counter up by the number of
leading zeros in the cntfrq_el0 register which gets us closer to a
noraml cpu clock cycle range.
2020-02-29 04:23:22 +00:00
Ali Saidi c623a965f9 Add Neoverse-N1 core
The implementation is a hybird of the ARMV8 one with some of the
improved TX2 rountines along with specifying -march=v8.2-a
2020-02-29 03:22:04 +00:00
j00520245 e1062400c4 New add syr benchmark 2020-02-28 16:36:53 +08:00
Martin Kroeker a66f4d80c8
Apply MinGW AVX512 compilation fix to fortran options as well
original issue was #1708, I see now that the same problem affects gfortran compilation. The underlying issue is said to be fixed (but not yet released) on all branches of gcc as of a few days ago but it will certainly take time to reach mingw/msys.
2020-02-27 23:09:40 +01:00
wjc404 dd22eb7621
Update cgemm_kernel_8x2_haswell.c 2020-02-27 22:26:15 +08:00
wjc404 2352331e60
Update zgemm_kernel_4x2_haswell.c 2020-02-27 22:25:19 +08:00
Martin Kroeker 430ee31e66
Merge pull request #2447 from martin-frbg/issue2446
Always select ARMV8 parameters for big servers when cpu is TSV110 or EMAG8180
2020-02-27 15:07:02 +01:00
Xianyi Zhang 265ab484c8 Change default RISC-V 64-bit corename to RISCV64_GENERIC
e.g. make CC=riscv64-unknown-linux-gnu-gcc FC=riscv64-unknown-linux-gnu-gfortran TARGET=RISCV64_GENERIC HOSTCC=gcc
2020-02-27 14:46:15 +08:00
Xianyi Zhang 44020a42a4 Fixed compile bug for RV64. 2020-02-27 14:29:42 +08:00
Xianyi Zhang 4aa2d89217 Merge branch 'develop' into risc-v 2020-02-27 13:53:49 +08:00
Martin Kroeker 8164fd1328
Always assume server-class cpu count for TSV110 and EMAG8180 2020-02-26 22:19:57 +01:00
Martin Kroeker 531c6b96d6
Merge pull request #34 from xianyi/develop
rebase
2020-02-26 22:16:28 +01:00
wjc404 1b980001dd
Update zgemm_kernel_4x2_haswell.c 2020-02-26 18:38:12 +08:00
wjc404 2515e1152f
Update cgemm_kernel_8x2_haswell.c 2020-02-26 18:36:54 +08:00
Martin Kroeker ddcbed6690
Merge pull request #2437 from martin-frbg/issue2434
[WIP] Add support for Ampere EMAG8180 ARMV8 cpu
2020-02-25 18:42:52 +01:00
Martin Kroeker f8ec538c82
Add Ampere EMAG8180 2020-02-25 14:30:00 +01:00
Martin Kroeker ca4f7dceff
Add parameters for EMAG8180 DYNAMIC_ARCH support with cmake 2020-02-24 20:23:18 +01:00
Martin Kroeker 1ddf9f1067
Add EMAG8180 to arm64 DYNAMIC_ARCH list for cmake 2020-02-24 20:16:18 +01:00
Martin Kroeker 4c5fac5a2b
Typo fix 2020-02-24 20:15:04 +01:00
Martin Kroeker 320e2648cd
Add EMAG8180 to DYNAMIC_CORE list for ARM64 2020-02-24 19:23:46 +01:00
Martin Kroeker 9b732696c6
Add DYNAMIC_ARCH support for ARMV8 EMAG8180 2020-02-24 19:20:00 +01:00
Martin Kroeker c9dcb3d4a4
Merge pull request #2443 from aaawuanjun/develop
[OpenBlas]:benchmark/copy.c has time,x,y data loop problems
2020-02-24 13:14:51 +01:00
Martin Kroeker 3bb7f0138e
Merge pull request #2442 from martin-frbg/lapackpr390
Apply fix from Reference-LAPACK PR 390
2020-02-24 12:27:01 +01:00
wuanjun 00447568 c93ae92579 [OpenBlas]:benchmark/copy.c has time,x,y data loop problems 2020-02-24 11:23:39 +08:00
Martin Kroeker 87ac1ceb0b
Apply fix from Reference-LAPACK PR390, NaN not propagating 2020-02-23 22:40:40 +01:00
Martin Kroeker 9e40c080f2
Apply fix from Reference-LAPACK PR390, NaN not propagating 2020-02-23 22:39:01 +01:00
wjc404 903854c168
Add files via upload 2020-02-22 23:40:02 +08:00
wjc404 a2ff577a30
Update KERNEL.ZEN 2020-02-22 23:39:43 +08:00
wjc404 97a32cb0a5
Update KERNEL.HASWELL 2020-02-22 23:39:20 +08:00
wjc404 f1746e7284
Delete sgemm_kernel_8x4_haswell_2.c 2020-02-22 23:38:48 +08:00
wjc404 f6fcbd7906
Fix performance bug when LDC is a multiple of 1024 2020-02-22 23:37:45 +08:00
Martin Kroeker 1e8410f18c
Merge pull request #2441 from martin-frbg/ismin2
Add proper defaults for the IxMIN/IxMAX kernels on mips64 and power
2020-02-22 11:21:03 +01:00
Martin Kroeker 07454bf4d5
Add proper defaults for IxMIN/IxMAX kernels
the fallbacks from Makefile.L1 assume a combined source for absolute value and non-absolute (with ifdef USE_ABS) but here we have separate implementations
2020-02-21 11:58:15 +01:00
Martin Kroeker 4046985913
Add proper defaults for IxMIN/IxMAX kernels
the fallbacks from Makefile.L1 assume a combined source for absolute value and non-absolute (with ifdef USE_ABS) but here we have separate implementations
2020-02-21 11:55:52 +01:00
Martin Kroeker 75577f95a7
Merge pull request #33 from xianyi/develop
rebase
2020-02-21 09:56:05 +01:00
Martin Kroeker 33d92c7a37
Merge pull request #2435 from martin-frbg/issue2433
Fix handling of ppc endianness
2020-02-21 00:01:58 +01:00
Martin Kroeker e57b11acca
Add preliminary support for EMAG8180 2020-02-19 19:00:28 +01:00
Martin Kroeker 71e5669c3e
Add preliminary support for EMAG8180 ARMV8 processor 2020-02-19 18:57:26 +01:00
Martin Kroeker e8d82c01d4
Recognize Ampere EMAG8180 2020-02-19 18:49:13 +01:00
Martin Kroeker 0b39cf95b0
Fix endianness conditionals 2020-02-19 18:09:54 +01:00
Martin Kroeker 76b2cec6ce
Get endianness into Makefile variable 2020-02-19 18:08:20 +01:00
Martin Kroeker 276c1791ea
Merge pull request #32 from xianyi/develop
rebase
2020-02-19 18:06:39 +01:00
Martin Kroeker c5bbfd8fee
Merge pull request #2432 from isuruf/install_name
Fix install name on osx again
2020-02-19 08:14:28 +01:00
Isuru Fernando 130c1741e5 Fix install name on osx again 2020-02-18 10:22:49 -08:00
Martin Kroeker 8f782f0673
Merge pull request #2426 from zbeekman/nightly-homebrew-check
Nightly homebrew check
2020-02-18 12:09:15 +01:00
Martin Kroeker 6a517dcb6a
Merge pull request #2427 from martin-frbg/powermin
Fix ISMIN and ISMAX kernel choices for POWER8
2020-02-18 08:15:02 +01:00
Martin Kroeker 9f39f0a2c3
Specify ismin/ismax assembly kernels for POWER8 directly
to fix utest failure in new ismin test -  Makefile.L1 defaults look wrong
2020-02-17 19:55:39 +01:00