Martin Kroeker
475bd2452b
Suffix BUFFERSIZEs as UL to prevent int overflow in computations
2024-07-11 20:13:57 +02:00
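A minimal sketch of the overflow this commit guards against, assuming a buffer size of 256 << 20 bytes. The macro and function names below are hypothetical and the value is illustrative only; the actual BUFFERSIZE definitions in the header may differ. Without a suffix the constant has type int, so multiplying it by a modest count can exceed INT_MAX. With the UL suffix the arithmetic is carried out in unsigned long.

```c
#include <limits.h>

/* Hypothetical values for illustration; not the header's real definitions. */
#define EXAMPLE_BUFFERSIZE_INT (256 << 20)   /* type int: 2^28 */
#define EXAMPLE_BUFFERSIZE_UL  (256UL << 20) /* type unsigned long: 2^28 */

/* Total bytes for n buffers, computed in unsigned long arithmetic. */
static unsigned long total_bytes_ul(unsigned long n) {
    return n * EXAMPLE_BUFFERSIZE_UL;
}

/* Returns 1 if n * EXAMPLE_BUFFERSIZE_INT would overflow a signed int.
   The check divides instead of multiplying, so it never overflows itself. */
static int int_product_would_overflow(int n) {
    return n > 0 && n > INT_MAX / EXAMPLE_BUFFERSIZE_INT;
}
```

With a 32-bit int, eight such buffers already exceed INT_MAX in the unsuffixed form, while the UL form yields the correct byte count.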
TGY
b5ba95a6c0
Modernize obsolete inline order
2023-08-16 00:48:40 +02:00
Martin Kroeker
0256294921
Fix syntax mixup
2020-11-22 17:41:44 +01:00
Martin Kroeker
60e1fddca7
Ensure that the same (large) BUFFERSIZE is used for all cpus in DYNAMIC_ARCH builds
2020-11-22 16:48:22 +01:00
Martin Kroeker
1d4c96fa0c
Increase BUFFERSIZE further
2020-10-23 00:12:06 +02:00
Martin Kroeker
ee90f30384
Increase BUFFERSIZE for POWER8-10 and use same value for POWER6
to fix overflow warning for PWR8 ZGEMM and PWR9 C/ZGEMM and avoid size mismatches in DYNAMIC_ARCH
2020-10-22 18:47:07 +02:00
Martin Kroeker
c9d32674ea
Add memory barrier to the blas_lock implementation for Linux
as recommended by cparrott73 in #2760
2020-08-09 19:17:04 +02:00
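The role of the barrier can be sketched with C11 atomics. This is not the blas_lock implementation from the header, which is platform-specific; the type and function names below are hypothetical. The sketch only shows why a lock needs acquire ordering when it is taken and release ordering when it is dropped, so that accesses inside the critical section cannot be reordered outside it.

```c
#include <stdatomic.h>

/* Hypothetical spinlock type for illustration only. */
typedef struct { atomic_flag held; } example_lock_t;

/* Spin until the flag is acquired. The acquire ordering keeps later
   loads and stores from moving ahead of the lock acquisition. */
static void example_lock(example_lock_t *l) {
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        /* busy-wait */
    }
}

/* Try once; returns 1 if the lock was taken, 0 if it was already held. */
static int example_trylock(example_lock_t *l) {
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Release the lock. The release ordering makes earlier writes visible
   to the next thread that acquires the lock. */
static void example_unlock(example_lock_t *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```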
Rajalakshmi Srinivasaraghavan
9fe930f205
powerpc: Add support for future processor
This is the initial patch to support build infrastructure
for POWER10 architecture.
2020-06-11 15:47:20 -05:00
Martin Kroeker
3d4db4d002
Add read barrier definition
2020-04-13 12:16:44 +02:00
Martin Kroeker
1a6ea8ee6d
Merge pull request #2338 from kavanabhat/aix_mod
Changes to build on AIX in POWER8 mode
2019-12-09 17:54:49 +01:00
Kavana Bhat
6baa9b07d7
AIX changes for Power8
2019-12-06 04:33:32 -06:00
Martin Kroeker
6fa89b06a1
Use the two-operand form of DCBT on all PPC970 regardless of OS
There seems to be no advantage to the three-operand form used in the earliest GotoBLAS kernels, and it also causes compilation problems on platforms other than those previously special-cased.
2019-11-03 22:55:31 +01:00
Kavana Bhat
3dc6b26eff
AIX changes for Power8
2019-08-20 06:51:35 -05:00
pkubaj
5a4f1a2118
Fix build for PPC970 on FreeBSD pt. 1
FreeBSD needs DCBT_ARG=0 as well.
2019-06-28 10:29:44 +00:00
Piotr Kubaj
eebfeba768
Fix build on FreeBSD/powerpc64.
Signed-off-by: Piotr Kubaj <pkubaj@anongoth.pl>
2019-06-25 10:58:56 +02:00
Martin Kroeker
7c51cc8527
Merge branch 'develop' into develop
2019-03-29 19:36:29 +01:00
AbdelRauf
853a18bc17
POWER9 makefile. DGEMM is based on the POWER8 kernel with the following changes: a 32x unrolled 16x4 kernel and an 8x4 kernel using lxv, stxv, and a butterfly rank-1 update. Performance improves from 17 to 22-23 GFLOPS. The DTRMM cases were added into DGEMM itself.
2019-03-29 15:49:40 +00:00
Ayappan P
b043a5962e
AIX asm syntax changes needed for shared object creation
2019-03-25 18:53:25 +05:30
ken-cunningham-webuse
f7a06463d9
common_power.h: force DCBT_ARG 0 on PPC970 Darwin
Without this, the build fails with
../kernel/power/gemv_n.S:427:Parameter syntax error
and many similar errors, all relating to this assembly instruction:
dcbt 8, r24, r18
This change sets DCBT_ARG to 0, after which OpenBLAS builds to completion on a PowerMac 970.
Tests pass.
2019-03-07 12:03:45 -08:00
Werner Saar
8310d4d3f7
optimized dgemm for 20 threads
2016-05-16 14:14:25 +02:00
Werner Saar
9276c9012f
Optimized sgemm and dgemm and tested again.
2016-04-21 11:37:57 +02:00
Werner Saar
9c42f0374a
Updated cgemm- and sgemm-kernel for POWER8 SMP
2016-04-07 15:08:15 +02:00
Werner Saar
cc26d888b8
BUGFIX: increased BUFFER_SIZE for POWER8
2016-03-04 10:26:53 +01:00
Werner Saar
b752858d6c
added dgemm-, dtrmm-, zgemm- and ztrmm-kernel for power8
2016-03-01 07:33:56 +01:00
Grazvydas Ignotas
6b92204a7c
add fallback blas_lock implementation
to be used on armv5 and new platforms
2015-08-16 18:59:17 +02:00
Grazvydas Ignotas
e12cf1123e
add fallback rpcc implementation
- use on arm, arm64 and any new platform
- use faster integer math instead of double
- use similar scale as rdtsc so that timeouts work
2015-08-16 18:59:16 +02:00
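The points above can be sketched as a portable cycle-counter substitute. This is an assumption-laden illustration, not the committed code: the function name is hypothetical, and it assumes a POSIX system that provides clock_gettime with CLOCK_MONOTONIC. It returns nanoseconds as a 64-bit integer, which avoids floating-point math and gives values on roughly the same scale as a GHz-class cycle counter, so timeouts tuned for rdtsc stay in a comparable range.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* request clock_gettime on strict compilers */
#endif
#include <time.h>

/* Hypothetical fallback tick counter: monotonic time in nanoseconds,
   computed with integer arithmetic only. */
static unsigned long long example_rpcc(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (unsigned long long)ts.tv_sec * 1000000000ULL
         + (unsigned long long)ts.tv_nsec;
}
```

Because the clock is monotonic, the difference between two readings is a non-negative elapsed time in nanoseconds.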
Matthew Brandyberry
7ba4fe5afb
ppc64le platform support (ELF ABI v2)
2015-07-21 22:20:19 -05:00
Timothy Gu
6c2ead30f0
Remove all trailing whitespace except lapack-netlib
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
2014-06-27 12:05:18 -07:00
Xianyi Zhang
342bbc3871
Import GotoBLAS2 1.13 BSD version codes.
2011-01-24 14:54:24 +00:00