Commit Graph

5072 Commits

Author SHA1 Message Date
Rajalakshmi Srinivasaraghavan b5d30b390d Fix build issues with bfloat16
This patch fixes compilation errors due to recent renaming from SH to SB
with BUILD_BFLOAT16.
2020-10-13 11:00:22 -05:00
Martin Kroeker d85b968424
Merge pull request #2891 from martin-frbg/fix-2886
Fix several bugs and omissions from the BFLOAT16 rename
2020-10-13 13:46:17 +02:00
Martin Kroeker 9dca578c79
Cleanup 2020-10-13 10:14:08 +02:00
Martin Kroeker 1e7eb7b7a9
Fix typos in currently unused sections 2020-10-13 09:17:15 +02:00
Martin Kroeker 84949754a0
Fix bfloat16 conditional 2020-10-13 09:11:36 +02:00
Martin Kroeker 2ae8785603
Add a POWER9 build with BFLOAT16 enabled 2020-10-13 09:07:50 +02:00
Martin Kroeker e05af6575e
Fix some overlooked "SHBLAS" entries 2020-10-13 09:05:04 +02:00
Martin Kroeker c1643006ae
Merge pull request #97 from xianyi/develop
rebase
2020-10-13 09:01:49 +02:00
Martin Kroeker 08929430cd
Merge pull request #2886 from martin-frbg/issue_2767
Rename "HALF" precision functions (sh prefix) to "BFLOAT16" with "sb" prefix
2020-10-13 00:04:35 +02:00
Martin Kroeker 0c84ffe05f
Merge pull request #2881 from mattip/fninit
add fninit to reset fpu registers before assembler routines
2020-10-12 23:50:41 +02:00
Martin Kroeker cb4274e3ad
Merge pull request #2888 from Qiyu8/usimd-sum
Optimize the performance of sum by using universal intrinsics
2020-10-12 23:22:08 +02:00
Matti Picus 403eb513a0 use emms instead, add WIN guards 2020-10-12 18:15:01 +03:00
Martin Kroeker cb839575ed
Convert the prototypes of the unimplemented BFLOAT16 functions to the new naming scheme 2020-10-12 14:44:33 +02:00
Qiyu8 0ed1f07660 Optimize the performance of sum by using universal intrinsics 2020-10-12 19:48:53 +08:00
Martin Kroeker bb74dd29db
Restore -msse3 2020-10-12 00:42:05 +02:00
Martin Kroeker 629c497b6c
common_sh.h renamed to common_sb.h 2020-10-12 00:27:11 +02:00
Martin Kroeker 2c552f1074
Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:11:31 +02:00
Martin Kroeker 7ae9e8960e
Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:08:29 +02:00
Martin Kroeker e3a29f6b58
Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:07:37 +02:00
Martin Kroeker 006c7f6671
Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:06:06 +02:00
Martin Kroeker 85154c2e18
Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:05:05 +02:00
Martin Kroeker ae1ab5bfdf
Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:03:21 +02:00
Martin Kroeker 052f31bc3c
Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:02:16 +02:00
Martin Kroeker 3aecafad80
Change "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-12 00:00:55 +02:00
Martin Kroeker 756062afa5
Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:56:17 +02:00
Martin Kroeker 2061f7fdff
Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:54:53 +02:00
Martin Kroeker dc8a1afa63
Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:53:50 +02:00
Martin Kroeker 32733ded04
Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:52:45 +02:00
Martin Kroeker 3bc8e8c334
Rename "HALF" and "sh" to "BFLOAT16"and "sb" 2020-10-11 23:51:34 +02:00
Martin Kroeker 573508f0ee
Rename common_sh.h to common_sb.h 2020-10-11 23:50:54 +02:00
Martin Kroeker ca31c32693
Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:49:22 +02:00
Martin Kroeker 5800758b43
Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:44:38 +02:00
Martin Kroeker 924fd806d0
Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:43:36 +02:00
Martin Kroeker 4db09c6cec
Rename compare_sgemm_shgemm.c to compare_sgemm_sbgemm.c 2020-10-11 23:42:45 +02:00
Martin Kroeker fd94236042
Rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:42:07 +02:00
Martin Kroeker 68ce719fac
Rename shdot_microk_cooperlake.c to sbdot_microk_cooperlake.c 2020-10-11 23:41:13 +02:00
Martin Kroeker d7dd9b396c
Rename shdot.c to sbdot.c 2020-10-11 23:40:43 +02:00
Martin Kroeker 9ae80490e0
rename "HALF" and "sh" to "BFLOAT16" and "sb" 2020-10-11 23:39:42 +02:00
Martin Kroeker d314d1f49f
Rename shgemm_kernel_power10.c to sbgemm_kernel_power10.c 2020-10-11 23:37:38 +02:00
Martin Kroeker f0883740e4
Merge pull request #96 from xianyi/develop
rebase
2020-10-11 23:34:36 +02:00
Martin Kroeker 1c0b03efb4
Merge branch 'develop' into develop 2020-10-11 23:34:14 +02:00
Martin Kroeker c589c3e2a1
Merge pull request #2882 from martin-frbg/issue2709
Use generic C for (D/Z)NRM2 on Windows x86_64
2020-10-11 22:22:30 +02:00
Martin Kroeker ec638a82bf
Merge pull request #2852 from martin-frbg/issue2588-cmake
Support building only a subset of variable types
2020-10-11 22:21:33 +02:00
Martin Kroeker caa0d757ca
repair TABs 2020-10-11 18:29:34 +02:00
Martin Kroeker 6154f72d6d
Copy BUILD_ settings to the LAPACK make.inc 2020-10-11 18:25:16 +02:00
Martin Kroeker ae8b0d257a
Set BUILD_ options to 1 instead of just defining them 2020-10-11 18:08:21 +02:00
Martin Kroeker 1da32cc1fc
Add cblas_xerbla interface 2020-10-11 17:45:41 +02:00
Martin Kroeker 8c5e08076e
If none of the BUILD_ options is set, enable them all 2020-10-11 17:33:51 +02:00
Martin Kroeker 5f23bdf437
remove debug output 2020-10-11 17:23:08 +02:00
Martin Kroeker b593e6b650
Merge pull request #2885 from martin-frbg/ifexists
Improve CMAKE check for conflicting config_kernel.h
2020-10-11 15:45:24 +02:00