Commit Graph

5072 Commits

Author SHA1 Message Date
Rajalakshmi Srinivasaraghavan
b5d30b390d Fix build issues with bfloat16
This patch fixes compilation errors due to recent renaming from SH to SB
with BUILD_BFLOAT16.
2020-10-13 11:00:22 -05:00
Martin Kroeker
d85b968424 Merge pull request #2891 from martin-frbg/fix-2886
Fix several bugs and omissions from the BFLOAT16 rename
2020-10-13 13:46:17 +02:00
Martin Kroeker
9dca578c79 Cleanup 2020-10-13 10:14:08 +02:00
Martin Kroeker
1e7eb7b7a9 Fix typos in currently unused sections 2020-10-13 09:17:15 +02:00
Martin Kroeker
84949754a0 Fix bfloat16 conditional 2020-10-13 09:11:36 +02:00
Martin Kroeker
2ae8785603 Add a POWER9 build with BFLOAT16 enabled 2020-10-13 09:07:50 +02:00
Martin Kroeker
e05af6575e Fix some overlooked "SHBLAS" entries 2020-10-13 09:05:04 +02:00
Martin Kroeker
c1643006ae Merge pull request #97 from xianyi/develop
rebase
2020-10-13 09:01:49 +02:00
Martin Kroeker
08929430cd Merge pull request #2886 from martin-frbg/issue_2767
Rename "HALF" precision functions (sh prefix) to "BFLOAT16" with "sb" prefix
2020-10-13 00:04:35 +02:00
Martin Kroeker
0c84ffe05f Merge pull request #2881 from mattip/fninit
add fninit to reset fpu registers before assembler routines
2020-10-12 23:50:41 +02:00
Martin Kroeker
cb4274e3ad Merge pull request #2888 from Qiyu8/usimd-sum
Optimize the performance of sum by using universal intrinsics
2020-10-12 23:22:08 +02:00
Matti Picus
403eb513a0 use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
Martin Kroeker
cb839575ed Convert the prototypes of the unimplemented BFLOAT16 functions to the new naming scheme 2020-10-12 14:44:33 +02:00
Qiyu8
0ed1f07660 Optimize the performance of sum by using universal intrinsics 2020-10-12 19:48:53 +08:00
Martin Kroeker
bb74dd29db Restore -msse3 2020-10-12 00:42:05 +02:00
Martin Kroeker
629c497b6c common_sh.h renamed to common_sb.h 2020-10-12 00:27:11 +02:00
Martin Kroeker
2c552f1074 Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:11:31 +02:00
Martin Kroeker
7ae9e8960e Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:08:29 +02:00
Martin Kroeker
e3a29f6b58 Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:07:37 +02:00
Martin Kroeker
006c7f6671 Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:06:06 +02:00
Martin Kroeker
85154c2e18 Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:05:05 +02:00
Martin Kroeker
ae1ab5bfdf Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:03:21 +02:00
Martin Kroeker
052f31bc3c Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:02:16 +02:00
Martin Kroeker
3aecafad80 Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:00:55 +02:00
Martin Kroeker
756062afa5 Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:56:17 +02:00
Martin Kroeker
2061f7fdff Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:54:53 +02:00
Martin Kroeker
dc8a1afa63 Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:53:50 +02:00
Martin Kroeker
32733ded04 Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:52:45 +02:00
Martin Kroeker
3bc8e8c334 Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:51:34 +02:00
Martin Kroeker
573508f0ee Rename common_sh.h to common_sb.h 2020-10-11 23:50:54 +02:00
Martin Kroeker
ca31c32693 Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:49:22 +02:00
Martin Kroeker
5800758b43 Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:44:38 +02:00
Martin Kroeker
924fd806d0 Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:43:36 +02:00
Martin Kroeker
4db09c6cec Rename compare_sgemm_shgemm.c to compare_sgemm_sbgemm.c 2020-10-11 23:42:45 +02:00
Martin Kroeker
fd94236042 Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:42:07 +02:00
Martin Kroeker
68ce719fac Rename shdot_microk_cooperlake.c to sbdot_microk_cooperlake.c 2020-10-11 23:41:13 +02:00
Martin Kroeker
d7dd9b396c Rename shdot.c to sbdot.c 2020-10-11 23:40:43 +02:00
Martin Kroeker
9ae80490e0 rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:39:42 +02:00
Martin Kroeker
d314d1f49f Rename shgemm_kernel_power10.c to sbgemm_kernel_power10.c 2020-10-11 23:37:38 +02:00
Martin Kroeker
f0883740e4 Merge pull request #96 from xianyi/develop
rebase
2020-10-11 23:34:36 +02:00
Martin Kroeker
1c0b03efb4 Merge branch 'develop' into develop 2020-10-11 23:34:14 +02:00
Martin Kroeker
c589c3e2a1 Merge pull request #2882 from martin-frbg/issue2709
Use generic C for (D/Z)NRM2 on Windows x86_64
2020-10-11 22:22:30 +02:00
Martin Kroeker
ec638a82bf Merge pull request #2852 from martin-frbg/issue2588-cmake
Support building only a subset of variable types
2020-10-11 22:21:33 +02:00
Martin Kroeker
caa0d757ca repair TABs 2020-10-11 18:29:34 +02:00
Martin Kroeker
6154f72d6d Copy BUILD_ settings to the LAPACK make.inc 2020-10-11 18:25:16 +02:00
Martin Kroeker
ae8b0d257a Set BUILD_ options to 1 instead of just defining them 2020-10-11 18:08:21 +02:00
Martin Kroeker
1da32cc1fc Add cblas_xerbla interface 2020-10-11 17:45:41 +02:00
Martin Kroeker
8c5e08076e If none of the BUILD_ options is set, enable them all 2020-10-11 17:33:51 +02:00
Martin Kroeker
5f23bdf437 remove debug output 2020-10-11 17:23:08 +02:00
Martin Kroeker
b593e6b650 Merge pull request #2885 from martin-frbg/ifexists
Improve CMAKE check for conflicting config_kernel.h
2020-10-11 15:45:24 +02:00