Rajalakshmi Srinivasaraghavan
|
b5d30b390d
|
Fix build issues with bfloat16
This patch fixes compilation errors caused by the recent renaming from SH to SB
when building with BUILD_BFLOAT16.
|
2020-10-13 11:00:22 -05:00 |
Martin Kroeker
|
d85b968424
|
Merge pull request #2891 from martin-frbg/fix-2886
Fix several bugs and omissions from the BFLOAT16 rename
|
2020-10-13 13:46:17 +02:00 |
Martin Kroeker
|
9dca578c79
|
Cleanup
|
2020-10-13 10:14:08 +02:00 |
Martin Kroeker
|
1e7eb7b7a9
|
Fix typos in currently unused sections
|
2020-10-13 09:17:15 +02:00 |
Martin Kroeker
|
84949754a0
|
Fix bfloat16 conditional
|
2020-10-13 09:11:36 +02:00 |
Martin Kroeker
|
2ae8785603
|
Add a POWER9 build with BFLOAT16 enabled
|
2020-10-13 09:07:50 +02:00 |
Martin Kroeker
|
e05af6575e
|
Fix some overlooked "SHBLAS" entries
|
2020-10-13 09:05:04 +02:00 |
Martin Kroeker
|
c1643006ae
|
Merge pull request #97 from xianyi/develop
rebase
|
2020-10-13 09:01:49 +02:00 |
Martin Kroeker
|
08929430cd
|
Merge pull request #2886 from martin-frbg/issue_2767
Rename "HALF" precision functions (sh prefix) to "BFLOAT16" with "sb" prefix
|
2020-10-13 00:04:35 +02:00 |
Martin Kroeker
|
0c84ffe05f
|
Merge pull request #2881 from mattip/fninit
add fninit to reset fpu registers before assembler routines
|
2020-10-12 23:50:41 +02:00 |
Martin Kroeker
|
cb4274e3ad
|
Merge pull request #2888 from Qiyu8/usimd-sum
Optimize the performance of sum by using universal intrinsics
|
2020-10-12 23:22:08 +02:00 |
Matti Picus
|
403eb513a0
|
use emms instead, add WIN guards
|
2020-10-12 18:15:01 +03:00 |
Martin Kroeker
|
cb839575ed
|
Convert the prototypes of the unimplemented BFLOAT16 functions to the new naming scheme
|
2020-10-12 14:44:33 +02:00 |
Qiyu8
|
0ed1f07660
|
Optimize the performance of sum by using universal intrinsics
|
2020-10-12 19:48:53 +08:00 |
Martin Kroeker
|
bb74dd29db
|
Restore -msse3
|
2020-10-12 00:42:05 +02:00 |
Martin Kroeker
|
629c497b6c
|
common_sh.h renamed to common_sb.h
|
2020-10-12 00:27:11 +02:00 |
Martin Kroeker
|
2c552f1074
|
Change "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-12 00:11:31 +02:00 |
Martin Kroeker
|
7ae9e8960e
|
Change "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-12 00:08:29 +02:00 |
Martin Kroeker
|
e3a29f6b58
|
Change "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-12 00:07:37 +02:00 |
Martin Kroeker
|
006c7f6671
|
Change "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-12 00:06:06 +02:00 |
Martin Kroeker
|
85154c2e18
|
Change "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-12 00:05:05 +02:00 |
Martin Kroeker
|
ae1ab5bfdf
|
Change "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-12 00:03:21 +02:00 |
Martin Kroeker
|
052f31bc3c
|
Change "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-12 00:02:16 +02:00 |
Martin Kroeker
|
3aecafad80
|
Change "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-12 00:00:55 +02:00 |
Martin Kroeker
|
756062afa5
|
Rename "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-11 23:56:17 +02:00 |
Martin Kroeker
|
2061f7fdff
|
Rename "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-11 23:54:53 +02:00 |
Martin Kroeker
|
dc8a1afa63
|
Rename "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-11 23:53:50 +02:00 |
Martin Kroeker
|
32733ded04
|
Rename "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-11 23:52:45 +02:00 |
Martin Kroeker
|
3bc8e8c334
|
Rename "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-11 23:51:34 +02:00 |
Martin Kroeker
|
573508f0ee
|
Rename common_sh.h to common_sb.h
|
2020-10-11 23:50:54 +02:00 |
Martin Kroeker
|
ca31c32693
|
Rename "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-11 23:49:22 +02:00 |
Martin Kroeker
|
5800758b43
|
Rename "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-11 23:44:38 +02:00 |
Martin Kroeker
|
924fd806d0
|
Rename "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-11 23:43:36 +02:00 |
Martin Kroeker
|
4db09c6cec
|
Rename compare_sgemm_shgemm.c to compare_sgemm_sbgemm.c
|
2020-10-11 23:42:45 +02:00 |
Martin Kroeker
|
fd94236042
|
Rename "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-11 23:42:07 +02:00 |
Martin Kroeker
|
68ce719fac
|
Rename shdot_microk_cooperlake.c to sbdot_microk_cooperlake.c
|
2020-10-11 23:41:13 +02:00 |
Martin Kroeker
|
d7dd9b396c
|
Rename shdot.c to sbdot.c
|
2020-10-11 23:40:43 +02:00 |
Martin Kroeker
|
9ae80490e0
|
rename "HALF" and "sh" to "BFLOAT16" and "sb"
|
2020-10-11 23:39:42 +02:00 |
Martin Kroeker
|
d314d1f49f
|
Rename shgemm_kernel_power10.c to sbgemm_kernel_power10.c
|
2020-10-11 23:37:38 +02:00 |
Martin Kroeker
|
f0883740e4
|
Merge pull request #96 from xianyi/develop
rebase
|
2020-10-11 23:34:36 +02:00 |
Martin Kroeker
|
1c0b03efb4
|
Merge branch 'develop' into develop
|
2020-10-11 23:34:14 +02:00 |
Martin Kroeker
|
c589c3e2a1
|
Merge pull request #2882 from martin-frbg/issue2709
Use generic C for (D/Z)NRM2 on Windows x86_64
|
2020-10-11 22:22:30 +02:00 |
Martin Kroeker
|
ec638a82bf
|
Merge pull request #2852 from martin-frbg/issue2588-cmake
Support building only a subset of variable types
|
2020-10-11 22:21:33 +02:00 |
Martin Kroeker
|
caa0d757ca
|
repair TABs
|
2020-10-11 18:29:34 +02:00 |
Martin Kroeker
|
6154f72d6d
|
Copy BUILD_ settings to the LAPACK make.inc
|
2020-10-11 18:25:16 +02:00 |
Martin Kroeker
|
ae8b0d257a
|
Set BUILD_ options to 1 instead of just defining them
|
2020-10-11 18:08:21 +02:00 |
Martin Kroeker
|
1da32cc1fc
|
Add cblas_xerbla interface
|
2020-10-11 17:45:41 +02:00 |
Martin Kroeker
|
8c5e08076e
|
If none of the BUILD_ options is set, enable them all
|
2020-10-11 17:33:51 +02:00 |
Martin Kroeker
|
5f23bdf437
|
remove debug output
|
2020-10-11 17:23:08 +02:00 |
Martin Kroeker
|
b593e6b650
|
Merge pull request #2885 from martin-frbg/ifexists
Improve CMAKE check for conflicting config_kernel.h
|
2020-10-11 15:45:24 +02:00 |