Commit Graph

3830 Commits

Author SHA1 Message Date
Martin Kroeker 5f36f18148
Update with 0.3.7 changes 2019-08-11 23:23:27 +02:00
Martin Kroeker d47fe78b0e
Set version to 0.3.7 2019-08-11 23:16:45 +02:00
Martin Kroeker ebe2f47a0f
Set version to 0.3.7 2019-08-11 23:16:11 +02:00
Martin Kroeker 20d417762f
Merge pull request #2213 from xianyi/develop
Update from develop in preparation of the 0.3.7 release
2019-08-11 23:14:49 +02:00
Martin Kroeker 321288597c
Merge pull request #2212 from martin-frbg/nofort-nolib
Avoid spurious dependency on the fortran runtime despite NOFORTRAN=1
2019-08-11 20:26:34 +02:00
Martin Kroeker be147a9f28
Avoid adding a spurious dependency on the fortran runtime despite NOFORTRAN=1
for cases where a fortran compiler is present but not wanted (e.g. not fully functional)
2019-08-11 16:24:39 +02:00
Martin Kroeker c275290ea6
Merge pull request #2211 from martin-frbg/arm64_gcc_trivial
Silence two nuisance warnings from gcc
2019-08-11 16:08:05 +02:00
Martin Kroeker b7bbb02447
Silence two nuisance warnings from gcc 2019-08-11 12:46:05 +02:00
Martin Kroeker bf1430f7d7
Merge pull request #2208 from martin-frbg/munmap-debug
Provide more information on mmap/munmap failure
2019-08-09 07:55:35 +02:00
Martin Kroeker dccff2e785
Merge pull request #2206 from martin-frbg/zen-dtrmm
Replace vpermpd with vpermilpd in the Haswell DTRMM kernel
2019-08-09 07:55:20 +02:00
Martin Kroeker 5c3458a6e7
Merge pull request #2199 from martin-frbg/zen-dtrsm
Replace most vpermpd calls in the Haswell DTRSM_RN kernel
2019-08-09 07:55:02 +02:00
Martin Kroeker 1776ad82c0
Add files via upload 2019-08-09 00:08:11 +02:00
Martin Kroeker 4e2f81cfa1
Provide more information on mmap/munmap failure
for #2207
2019-08-08 23:15:35 +02:00
Martin Kroeker acf6002ab2
Replace most vpermpd calls in the Haswell DTRSM_RN kernel 2019-08-03 12:40:13 +02:00
Martin Kroeker 96a794e9fd
Merge pull request #2198 from martin-frbg/icelake
Update CPUID recognition for Intel Ice Lake
2019-08-02 08:36:14 +02:00
Martin Kroeker 3d36c45116
Add CPUID identification of Intel Ice Lake 2019-08-01 22:52:35 +02:00
Martin Kroeker 648491e1aa
Autodetect Intel Ice Lake (as SKYLAKEX target) 2019-08-01 22:51:09 +02:00
Martin Kroeker 2dfb804cb9
Replace vpermpd with vpermilpd in the Haswell DTRMM kernel
to improve performance on AMD Zen (#2180) applying wjc404's improvement of the DGEMM kernel from #2186
2019-07-28 23:17:28 +02:00
Martin Kroeker 4c153ec9da
Merge pull request #2196 from wjc404/develop
Add vbroadcastsd kernel to dgemm_kernel_4x8_haswell.S
2019-07-28 23:11:40 +02:00
wjc404 7eecd8e39c
Add files via upload 2019-07-28 07:39:09 +08:00
Martin Kroeker f0406a7708
Merge pull request #2112 from ffontaine/develop
Makefile.arm: remove -march flags
2019-07-27 13:00:13 +02:00
Martin Kroeker 561f3fd995
Merge pull request #2193 from martin-frbg/makeutest
Override special make variables
2019-07-24 20:19:21 +02:00
Martin Kroeker 30efed14d1
Unset special make variables in ctest Makefile as well 2019-07-24 15:26:09 +02:00
Martin Kroeker af2e7f28fc
Override special make variables
as seen in https://github.com/xianyi/OpenBLAS/issues/1912#issuecomment-514183900 , any external setting of TARGET_ARCH (which could result from building OpenBLAS as part of a larger project that actually uses this variable) would cause the utest build to fail. 
(Other subtargets appear to be unaffected as they do not use implicit make rules)
2019-07-23 16:56:40 +02:00
Martin Kroeker 4250e6ed64
Merge pull request #2191 from tylerjereddy/conditional_updates
MAINT: remove legacy CMake endif()
2019-07-23 16:20:39 +02:00
Martin Kroeker 7b0b7c11d2
Merge pull request #2190 from martin-frbg/zdot-zen
Replace vpermpd with vpermilpd in the Haswell/Zen zdot microkernel
2019-07-23 16:15:08 +02:00
Martin Kroeker d14cf1ccf4
Merge pull request #2189 from wjc404/develop
Update dgemm_kernel_4x8_haswell.S for reducing cache misses
2019-07-23 08:32:56 +02:00
Tyler Reddy 3f6ab1582a MAINT: remove legacy CMake endif()
* clean up a case where CMake endif()
contained the conditional used in the
if(), which is no longer needed /
discouraged since our minimum required
CMake version supports the modern syntax
2019-07-22 21:24:57 -06:00
Martin Kroeker 28e96458e5
Replace vpermpd with vpermilpd
to improve performance on Zen/Zen2 (as demonstrated by wjc404 in #2180)
2019-07-22 08:28:16 +02:00
wjc404 95fb98f556
Update dgemm_kernel_4x8_haswell.S 2019-07-21 01:10:32 +08:00
wjc404 4801c6d36b
Update dgemm_kernel_4x8_haswell.S 2019-07-21 00:47:45 +08:00
wjc404 9440fa607d
Add files via upload 2019-07-20 22:08:22 +08:00
wjc404 94db259e5b
Add files via upload 2019-07-20 22:04:41 +08:00
wjc404 f49f8047ac
Add files via upload 2019-07-20 14:33:37 +08:00
wjc404 825777faab
Update dgemm_kernel_4x8_haswell.S 2019-07-19 23:58:24 +08:00
wjc404 9c89757562
Add files via upload 2019-07-19 23:47:58 +08:00
Martin Kroeker b0b7600bef
Merge pull request #2186 from wjc404/develop
Update "dgemm_kernel_4x8_haswell.S" for improving performance on zen2 chips
2019-07-18 16:04:44 +02:00
wjc404 9b04baeaee
Update dgemm_kernel_4x8_haswell.S 2019-07-17 23:50:03 +08:00
wjc404 8a074b3965
Update dgemm_kernel_4x8_haswell.S 2019-07-17 23:47:30 +08:00
wjc404 211ab03b14
Update dgemm_kernel_4x8_haswell.S 2019-07-17 22:39:15 +08:00
wjc404 1733f927e6
Update dgemm_kernel_4x8_haswell.S 2019-07-17 21:27:41 +08:00
wjc404 182b06d6ad
Update dgemm_kernel_4x8_haswell.S 2019-07-17 17:02:35 +08:00
wjc404 7a9050d681
Update dgemm_kernel_4x8_haswell.S 2019-07-17 00:55:06 +08:00
wjc404 0ba29fd262
Update dgemm_kernel_4x8_haswell.S for zen2
replaced a bunch of vpermpd instructions with vpermilpd and vperm2f128
2019-07-17 00:46:51 +08:00
Martin Kroeker bafa021ed6
Merge pull request #2181 from isuruf/install_name
Change install_name on osx to match linux
2019-07-09 20:08:52 +02:00
Isuru Fernando b89d9762a2 Change install_name on osx to match linux 2019-07-08 17:14:35 -05:00
Martin Kroeker 08dedf4c5e
Merge pull request #2177 from martin-frbg/noaff
Fix surprising behaviour of NO_AFFINITY=0
2019-07-07 18:28:21 +02:00
Martin Kroeker b89c781637
Fix surprising behaviour of NO_AFFINITY=0 2019-07-07 16:04:45 +02:00
Martin Kroeker dd7ff77f4b
Merge pull request #2175 from martin-frbg/cmake-mingw-fixes
Fix CMAKE compilation with MinGW32 and add it to Appveyor
2019-07-06 18:07:19 +02:00
Martin Kroeker 8fb76134bc
Mingw32 needs leading underscore on object names
(also copy BUNDERSCORE settings for FORTRAN from the corresponding Makefile)
2019-07-06 15:07:15 +02:00