Commit Graph

7452 Commits

Author SHA1 Message Date
Ashwin Sekhar T K 19ba133383 THUNDERX2T99: Add Optimized ZGEMM Implementation 2017-02-28 05:31:41 +00:00
Martin Kroeker f09a9afa03 Merge pull request #1107 from quickwritereader/develop
ztrmm(zgemm) complex double precision kernel for ibm z13
2017-02-26 09:49:01 +01:00
Abdurrauf 0d96b0e2a7 Merge branch 'z13' into develop 2017-02-26 06:17:33 +04:00
Abdurrauf 848cb27b1e ztrmm kernel. 2017-02-26 06:14:12 +04:00
Martin Kroeker dc34a0da96 Merge pull request #915 from mdong/small_fix_for_icc
remove input from clobbered list
2017-02-23 20:00:22 +01:00
Ashwin Sekhar T K a3935f0dfb THUNDERX2T99: Add Optimized D/Z NRM2 Implementation 2017-02-23 10:02:15 -08:00
Martin Kroeker 47e9fe0bb4 Merge pull request #1105 from martin-frbg/testing-eig-typos
TESTING/EIG: fix spurious EXTERNAL references to nonexistent functions
2017-02-22 22:42:52 +01:00
Martin Kroeker c7bc0ee823 Remove spurious names from EXTERNAL list
Remove unused (and nonexistent) functions ZHETRD_SY2SB and ZHETRD_SB2ST from comment and EXTERNAL declaration
2017-02-22 21:48:35 +01:00
Martin Kroeker 6bdee6d50a Remove spurious names from EXTERNAL list
Remove unused (and nonexistent) ZHETRD_SY2SB and ZHETRD_SB2ST
2017-02-22 21:45:27 +01:00
Martin Kroeker 009c0d2e5a Fix typo in EXTERNAL declaration
ZHBTRD_HB2ST should be ZHETRD_HB2ST
2017-02-22 21:41:07 +01:00
Martin Kroeker 4d88e1a4ad Merge pull request #1104 from martin-frbg/lapack-comma
LAPACK: fix missing comma on continued lines
2017-02-22 10:31:39 +01:00
Martin Kroeker 0958b49811 Fix missing comma on continued line
EXTERNAL declaration of subroutines missed a comma before the continuation line,
causing a strange run-together name to appear in the object when compiled with ifort.
2017-02-22 08:40:39 +01:00
Martin Kroeker 09b240f1ef Fix missing comma on continued line
EXTERNAL declaration of subroutines missed a comma before the continuation line,
causing a strange run-together name to appear in the object when compiled with ifort.
2017-02-22 08:39:06 +01:00
Martin Kroeker 69f4e8b86c Fix missing comma on continued line
EXTERNAL declaration of subroutines missed a comma before the continuation line,
causing a strange run-together name to appear in the object when compiled with ifort.
2017-02-22 08:34:20 +01:00
Martin Kroeker e072e68aa0 Fix missing comma in continued line
EXTERNAL declaration of subroutines missed a comma before the continuation line,
causing a strange run-together name to appear in the object when compiled with ifort.
2017-02-22 08:32:20 +01:00
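
The four "missing comma" commits above describe a name list that runs together across a continuation line. A minimal sketch of why that can happen, assuming fixed-form Fortran rules (blanks inside a statement are not significant, and a continuation line is appended to the previous one): the helper `external_names` and the routine names `FOO`, `BAR`, `BAZ`, `QUX` are hypothetical and only model the name list, not a real Fortran parser or the actual LAPACK declarations.

```c
#include <ctype.h>
#include <string.h>

/* Append src to dst while dropping blanks, mimicking fixed-form Fortran,
   where blanks inside a statement carry no meaning. */
static void append_without_blanks(char *dst, size_t cap, const char *src) {
    size_t n = strlen(dst);
    for (; *src != '\0' && n + 1 < cap; ++src) {
        if (!isspace((unsigned char)*src)) {
            dst[n++] = *src;
        }
    }
    dst[n] = '\0';
}

/* Join the name list of an EXTERNAL statement with the name list on its
   continuation line, then split the result on commas.
   Returns the number of names written to names[]. */
static int external_names(const char *first, const char *cont,
                          char names[][32], int max_names) {
    char joined[256] = "";
    int count = 0;

    append_without_blanks(joined, sizeof joined, first);
    append_without_blanks(joined, sizeof joined, cont);

    for (char *tok = strtok(joined, ",");
         tok != NULL && count < max_names;
         tok = strtok(NULL, ",")) {
        strncpy(names[count], tok, 31);
        names[count][31] = '\0';
        ++count;
    }
    return count;
}
```

In this model, `external_names("FOO, BAR", "BAZ, QUX", ...)` yields three names with `BARBAZ` in the middle, while adding the trailing comma (`"FOO, BAR,"`) yields the intended four.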
Ashwin Sekhar T K 738628e9a8 ARM64: Remove unused code 2017-02-21 21:42:32 -08:00
Martin Kroeker e527dbffaa Merge pull request #1103 from vladimir-ch/fix-lapacke-ormbr
LAPACKE: fix wrong matrix size in ?ormbr
2017-02-21 22:58:30 +01:00
Vladimir Chalupecky eeaee46e86 LAPACKE: fix wrong matrix size in ?ormbr
Changes made upstream in Reference LAPACK in
https://github.com/Reference-LAPACK/lapack/pull/128
2017-02-21 21:57:18 +01:00
Martin Kroeker 040672ecf6 Merge pull request #1098 from martin-frbg/amodra-power8
Power8 inline assembly fixes
2017-02-21 15:26:14 +01:00
Martin Kroeker c8ce9e4377 Merge pull request #1101 from martin-frbg/martin-frbg-patch-1
LAPACKE: fix wrong number of columns in ?ormlq
2017-02-21 15:19:56 +01:00
Ashwin Sekhar T K ab3ffab96a THUNDERX2T99: Add Optimized C/Z DOT Implementation 2017-02-21 03:40:59 -08:00
Ashwin Sekhar T K f036be9ce2 THUNDERX2T99: Add Optimized SDOT Implementation 2017-02-21 03:24:32 -08:00
Martin Kroeker 39eecfd20c Merge pull request #1102 from brada4/develop
Correct Apollo Lake CPUID identification in dynamic_arch builds
2017-02-21 08:26:39 +01:00
Andrew 5088523786 detect apollo lake for real 2017-02-20 23:54:59 +01:00
Martin Kroeker 3f7720ec4b LAPACKE: fix wrong number of columns in ?ormlq
Copied from lapack https://github.com/Reference-LAPACK/lapack/pull/127 by vladimir-ch (with earlier changes from echeresh's
PR 115 "lapacke_*ormlq_work: move declarations under if" there as they touched some of the same files)
2017-02-20 16:20:43 +01:00
Ashwin Sekhar T K faba876fda THUNDERX2T99: Bug fix in C/Z IAMAX 2017-02-19 23:11:50 -08:00
Ashwin Sekhar T K 172a62d73e THUNDERX2T99: Add Optimized C/Z IAMAX Implementation 2017-02-17 03:06:32 -08:00
Martin Kroeker e545a66a5b Merge pull request #1091 from staticfloat/sf/corei5_7600k
CPUID mappings for Core i5-7600K (Kaby Lake)
2017-02-17 10:30:09 +01:00
Ashwin Sekhar T K 228c75a69c THUNDERX2T99: Add parallel SCNRM2 Implementation 2017-02-14 04:10:06 -08:00
Martin Kroeker 9e2f316ede Power8 inline assembly fixes
Quoting patch author amodra from #1078
Lots of issues here.
- The vsx regs weren't listed as clobbered.
- Poor choice of vsx regs, which along with the lack of clobbers led to
  trashing v0..v21 and fr14..fr23.  Ideally you'd let gcc choose all
  temp vsx regs, but asms currently have a limit of 30 i/o parms.
- Other regs were clobbered unnecessarily, seemingly in an attempt to
  clobber inputs, with gcc-7 complaining about the clobber of r2.
  (Changed inputs should be also listed as outputs or as an i/o.)
- "r" constraint used instead of "b" for gprs used in insns where the
  r0 encoding means zero rather than r0.
- There were unused asm inputs too.
- All memory was clobbered rather than hooking up memory outputs with
  proper memory constraints, and that and the lack of proper memory
  input constraints meant the asms needed to be volatile and their
  containing function noinline.
- Some parameters were being passed unnecessarily via memory.
- When a copy of a
2017-02-13 23:38:50 +01:00
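
Several points from the quoted list (the "b" constraint for base registers, memory input constraints instead of a blanket memory clobber, no need for `volatile`) can be illustrated with a small GCC extended-asm sketch. This is not code from the patch: the function `load_double_at` is hypothetical, the PowerPC branch assumes a GCC-compatible compiler targeting 64-bit POWER, and other targets fall back to plain C.

```c
/* Load the double stored `off` bytes past `base`. */
static double load_double_at(const double *base, long off) {
    const double *p = (const double *)((const char *)base + off);
    double result;
#if defined(__powerpc64__) && defined(__GNUC__)
    /* lfdx FRT,RA,RB: an RA encoding of r0 means the constant zero, so the
       base address uses "b" (any GPR except r0) rather than "r".
       The "m"(*p) input tells the compiler exactly which memory is read,
       so neither a "memory" clobber nor `volatile` is required, and no
       register is clobbered behind the compiler's back. */
    __asm__("lfdx %0,%1,%2"
            : "=d"(result)
            : "b"(base), "r"(off), "m"(*p));
#else
    /* Portable fallback for targets without the PowerPC instruction. */
    result = *p;
#endif
    return result;
}
```

Because every register and memory dependency is declared as an operand, the compiler remains free to inline the containing function and to schedule or eliminate the asm like any other pure computation.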
Martin Kroeker e2489c9a92 Merge pull request #1096 from martin-frbg/pkg-config
Build only openblas.pc for pkg-config and install it from cmake as well
2017-02-12 17:00:17 +01:00
Martin Kroeker c4ea9eea67 Add cmake template for openblas.pc 2017-02-12 14:38:32 +01:00
Martin Kroeker cd8f80634f Create and install openblas.pc in cmake builds 2017-02-12 14:37:33 +01:00
Martin Kroeker faf06f0d8b Create and install only a single openblas.pc file 2017-02-12 14:35:48 +01:00
Martin Kroeker c6fa4aef0c Rename blas.pc.in to openblas.pc.in 2017-02-12 14:34:03 +01:00
Martin Kroeker 1029dcd60d Merge pull request #1095 from martin-frbg/lapack370-cmake
Update cmakefiles for netlib 3.7.0
2017-02-12 14:30:29 +01:00
Martin Kroeker d12c8bbcbb Add zlasyf_aa to lapack.cmake 2017-02-12 13:49:49 +01:00
Martin Kroeker 15f0d65010 Add another bunch of lapack 3.7 functions to cmake list 2017-02-12 01:59:30 +01:00
Martin Kroeker 7d831af1ba Add LAPACK 3.7 files not mentioned in announcement 2017-02-12 01:37:35 +01:00
Martin Kroeker ee3e87cf46 Update cmake file list for lapacke 3.7.0 2017-02-12 00:40:16 +01:00
Martin Kroeker 8772c00bb0 Update cmake file list for lapack 3.7.0 2017-02-11 23:11:26 +01:00
Martin Kroeker 0a4a7e18f6 Merge pull request #1094 from martin-frbg/cmake-1
Update cmakefiles with changes from netlib 3.6.1
2017-02-11 20:48:41 +01:00
Martin Kroeker 357ef3cd8c Reflect name change of lapacke_mangling.h template 2017-02-11 19:56:02 +01:00
Martin Kroeker 002e646476 Add new functions from LAPACK 3.6.1 2017-02-11 19:54:02 +01:00
Martin Kroeker 3dad87bbb5 Merge pull request #1093 from martin-frbg/restore-cmakeinstall
Restore cmake install target
2017-02-11 17:41:39 +01:00
Martin Kroeker bdd51cdabc Add cmake install target
Add CMAKE install target (based on patch provided by PrimarchOfTheSpaceWolves in #957)
This was originally merged as 988 but accidentally reverted by my subsequent PR the following day
2017-02-11 16:43:46 +01:00
Elliot Saba 1d8ab99e09 Add `exfamily == 9` case (Kaby Lake) to dynamic arch detection 2017-02-10 15:23:55 -08:00
Elliot Saba 04b2b06665 CPUID mappings for Core i5-7600K (Kaby Lake) 2017-02-10 14:53:15 -08:00
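
The two Kaby Lake commits above concern matching fields of the x86 CPUID signature. As a rough sketch of how such a signature splits into fields, assuming the standard layout of EAX returned by CPUID leaf 1: the struct, its field names, and `decode_cpuid_eax` are hypothetical and need not match the identifiers used in the commits, and the sample value `0x000906E9` is an assumed signature for a Kaby Lake desktop part, used only for illustration.

```c
#include <stdint.h>

/* Fields of the processor signature returned in EAX by CPUID leaf 1. */
typedef struct {
    unsigned stepping;   /* bits  3..0  */
    unsigned model;      /* bits  7..4  */
    unsigned family;     /* bits 11..8  */
    unsigned ext_model;  /* bits 19..16 */
    unsigned ext_family; /* bits 27..20 */
} cpu_signature;

/* Split a raw EAX value into its signature fields. */
static cpu_signature decode_cpuid_eax(uint32_t eax) {
    cpu_signature s;
    s.stepping   =  eax        & 0x0fu;
    s.model      = (eax >> 4)  & 0x0fu;
    s.family     = (eax >> 8)  & 0x0fu;
    s.ext_model  = (eax >> 16) & 0x0fu;
    s.ext_family = (eax >> 20) & 0xffu;
    return s;
}
```

For the assumed value `0x000906E9` this decodes to family 6, model 0xE, and the value 9 in the field this sketch calls `ext_model`. A dispatch table would then switch on these fields to pick a kernel set.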
Martin Kroeker 8a83daf4bf Merge pull request #1084 from isuruf/develop
Install pkg-config files
2017-02-08 01:01:18 +01:00
Martin Kroeker 39abb079fb Merge pull request #1087 from grisuthedragon/enable-a12
Enable EXCAVATOR kernels for A12-9800
2017-02-08 01:00:32 +01:00