Martin Kroeker
7dc8a76f60
Merge pull request #2293 from martin-frbg/pr2288
...
Add support for NetBSD by adding it to the existing xBSD conditionals
2019-10-25 23:46:39 +02:00
Martin Kroeker
df857551c0
Remove special parameter set for obsolete IOS/ARMV8 workaround
2019-10-25 23:07:00 +02:00
Martin Kroeker
85ccdce8c4
Remove the IOS fallbacks to generic C kernels
2019-10-25 23:02:37 +02:00
Martin Kroeker
aeabe0a83f
Fix regex to parse -R options with and without whitespace
...
Both forms are seen on NetBSD (#2288)
2019-10-25 22:52:30 +02:00
Martin Kroeker
1b90989662
Add NetBSD to the xBSD conditionals
2019-10-25 12:52:49 +02:00
Martin Kroeker
e3e8b5cdca
Add NetBSD
2019-10-25 12:51:06 +02:00
Martin Kroeker
69b16a894d
Merge pull request #2292 from martin-frbg/g95fixes
...
Improve support for g95 and non-GNU ld
2019-10-25 10:35:17 +02:00
Martin Kroeker
6782e5767d
Merge pull request #2291 from martin-frbg/gensymbol
...
Fix netlib 3.7/3.8 function enumeration for linktest
2019-10-25 10:34:50 +02:00
Martin Kroeker
48f5a89f92
Merge pull request #2282 from martin-frbg/issue2281
...
Optimize RPCC function on ARM64
2019-10-25 09:56:30 +02:00
Martin Kroeker
4ae1610f37
Merge pull request #2290 from martin-frbg/cpuidfixes
...
Fixup x86 cpuid changes from #2283
2019-10-24 22:52:15 +02:00
Martin Kroeker
911c3e2f4b
Improve support for g95 and non-GNU ld
...
Auto-add the "-fno-second-underscore" option to make LAPACKE compile (it calls LAPACK functions that may otherwise have gotten a second underscore added). Also support -R for rpath when parsing compiler directives in f_check.
2019-10-24 22:43:27 +02:00
Martin Kroeker
fab49e49e5
Move most lapack 3.7/3.8 additions to the embedded_underscores list
...
to allow linktest to pass with a compiler that adds a second underscore to such names
2019-10-24 21:26:20 +02:00
Martin Kroeker
b687fba5bc
Disable direct clock register access on IOS and Android
...
as I find conflicting information on accessibility from non-privileged processes
2019-10-24 21:18:17 +02:00
luzpaz
46a8c2519a
Remove prototype of unused, unimplemented function (#2274)
...
* Fix source typo
Found via `codespell -q 3 -L amin,als,ba,dum,mone,nd,nto,orign -S Changelog.txt,./lapack*`
* Remove beta-thread function per request
2019-10-24 18:56:53 +02:00
Martin Kroeker
e9437eebd2
Restore Goldmont ID and improve QEMU support
...
#2283 had inadvertently removed Goldmont+, and cpuid was reporting a mix of Core2 and Pentium2 for some QEMU configurations
2019-10-24 18:45:27 +02:00
Martin Kroeker
3a39062cfc
Merge pull request #12 from xianyi/develop
...
resync with upstream
2019-10-24 18:40:13 +02:00
Martin Kroeker
eaa0be1313
Merge pull request #2286 from wjc404/develop
...
AVX512 DGEMM kernel
2019-10-20 12:44:19 +02:00
wjc404
6ff013bae0
native support for icopy_4
...
90% MKL 1-thread performance.
2019-10-19 03:54:44 +08:00
wjc404
0d669e04bb
Update dgemm_kernel_8x8_skylakex.c
2019-10-18 15:00:17 +08:00
wjc404
17cdd9f9e1
some correction
2019-10-18 14:58:07 +08:00
wjc404
6bcb06fcb1
make further changes to icopy_8 easier
2019-10-18 10:47:31 +08:00
wjc404
b7315f8401
Add files via upload
2019-10-16 19:23:36 +08:00
wjc404
9b19e9e1b0
Update dgemm_kernel_8x8_skylakex.c
2019-10-16 10:14:51 +08:00
wjc404
6bd67ddbab
Update dgemm_kernel_8x8_skylakex.c
2019-10-16 03:20:08 +08:00
wjc404
5da9484d93
Add files via upload
2019-10-16 02:01:13 +08:00
wjc404
844629af57
Add files via upload
2019-10-16 02:00:34 +08:00
Martin Kroeker
2beaa82c05
Merge pull request #2283 from martin-frbg/issue2176
...
Support QEMU virtual cpu in 64bit mode as CORE2 or BARCELONA
2019-10-09 22:06:09 +02:00
Martin Kroeker
e8a2aed2b9
Support QEMU cpu calling itself 64bit AMD Athlon as well
...
Some QEMU instances pretend to be "AuthenticAMD" with the same family 6/model 6 even when running on an Intel host
(could be related to qemu or libvirt version and/or kvm availability). Also fix the define to depend on __x86_64__ set by the
compiler; the defines using __64BIT__ will only work for getarch_2nd.
2019-10-09 18:24:13 +02:00
Martin Kroeker
f262031685
Support QEMU virtual cpu as CORE2
...
qemu itself claims it is a 64bit P6, which does not exist in the wild.
2019-10-08 22:30:02 +02:00
Martin Kroeker
5f6206fa2d
Simplify OSX/IOS cross-compilation and add a CI test for it (#2279)
...
* Add automatic fixups for OSX/IOS cross-compilation
* Add OSX/IOS cross-compilation test to Travis CI
* Handle platforms that lack hwcap.h by falling back to ARMV8
* Fix PROLOGUE for OSX/IOS
2019-10-08 20:13:14 +02:00
Martin Kroeker
f2cde2ccfb
Update common_arm64.h
2019-10-08 20:12:08 +02:00
Martin Kroeker
ba7838d2e1
Merge pull request #2280 from martin-frbg/iosfix
...
Add overlooked part of IOS compilation fix
2019-10-08 10:25:25 +02:00
Martin Kroeker
a448884a63
Remove automatic label postfixes from macro included only once
2019-10-08 08:37:50 +02:00
Martin Kroeker
17609f88f1
Merge pull request #11 from xianyi/develop
...
sync with upstream
2019-10-08 08:32:52 +02:00
Martin Kroeker
3a2df19db6
Fix accidental duplication of jump instruction
2019-10-08 08:09:26 +02:00
Martin Kroeker
d2093a40d3
Merge pull request #2277 from martin-frbg/issue2275
...
Rewrite ARMV8 code to allow cross-compilation for IOS
2019-10-06 23:01:54 +02:00
Martin Kroeker
aa04b0925e
Merge pull request #2276 from xianyi/revert-2272-thread-sqrt-of-negative
...
Revert "Avoid taking root of negative number in symv_thread.c"
2019-10-06 11:12:44 +02:00
Martin Kroeker
258ac56e0a
Move 32bit OSX build back to xcode 8.3 but switch to gcc8
2019-10-05 10:52:47 +02:00
Martin Kroeker
56837e9d92
Make local labels in macro compatible with the xcode assembler
...
... which does not perform the automatic numbering on instantiation that the _@ suffix signifies
2019-10-04 14:53:23 +02:00
Martin Kroeker
bb5413863f
Rewrite ARM64 PROLOGUE to make it compatible with xcode/ios
2019-10-04 14:50:03 +02:00
Martin Kroeker
32f5907fef
Update 32bit macOS again to xcode 9.3
...
macOS version 10.13 "High Sierra" appears to be the oldest release for which Homebrew now provides a gcc package.
With anything older, the Travis job runs out of time building gcc from source.
2019-10-03 01:09:02 +02:00
Martin Kroeker
ac10236cc8
Update the OSX BINARY=32 test to xcode9.2
...
in response to Homebrew updates
2019-10-02 22:35:34 +02:00
Martin Kroeker
8617d75548
Revert "Avoid taking root of negative number in symv_thread.c"
2019-10-01 23:50:41 +02:00
Martin Kroeker
c07d78b9e9
Merge pull request #2272 from seberg/thread-sqrt-of-negative
...
Avoid taking root of negative number in symv_thread.c
2019-09-30 11:27:29 +02:00
Sebastian Berg
6355c25dde
Avoid taking root of negative number in symv_thread.c
...
This is similar to fixes in gh-1929, but there was one remaining
occurrence of this type of pattern in the driver/level2/*_thread.c
files.
2019-09-29 22:03:12 -07:00
Martin Kroeker
5e244d80f2
Merge pull request #2271 from quickwritereader/strmm_fix
...
Fixed bug in power9 strmm. BLAS-TESTER passes
2019-09-29 13:53:45 +02:00
AbdelRauf
ede5efebab
trmm fix
2019-09-29 02:28:34 +00:00
Martin Kroeker
84908d60d2
Merge pull request #2269 from martin-frbg/ppc-fixes
...
Ppc fixes
2019-09-27 09:52:19 +02:00
Martin Kroeker
596a22325a
Fix prologue of power9 assembly cdot(c) kernel to provide cdotc
2019-09-27 00:47:18 +02:00
Martin Kroeker
7f58f3ad0e
Fix mis-edits in the gcc-derived power8 caxpy kernel
2019-09-27 00:44:26 +02:00