Commit Graph

  • 7d75177446 Move temp to x21 to leave x18 unused (reserved on OSX) Martin Kroeker 2021-09-17 09:24:11 +02:00
  • 0a4ac4b585 Use x21 for I to leave x18 unused (reserved on OSX) Martin Kroeker 2021-09-17 09:19:51 +02:00
  • 7d4a221579 Remove unused TEMP2 and reshuffle to leave x18 unused (reserved on OSX) Martin Kroeker 2021-09-17 09:18:25 +02:00
  • d3a9c7ef7f Merge pull request #3382 from rafaelcfsousa/rafael/cwarnings Martin Kroeker 2021-09-17 09:15:16 +02:00
  • 72c26f4f7f Merge pull request #3381 from martin-frbg/issue3371 Martin Kroeker 2021-09-16 07:14:49 +02:00
  • 0e8b4adf22 Remove unused commented code (#if directive) Rafael Cardoso Fernandes Sousa 2021-09-15 22:18:48 +00:00
  • 8dfa61a61c Initialize abs_mask1 with itself to silence a gcc warning Martin Kroeker 2021-09-15 22:11:35 +02:00
  • 99aa10b3ff Initialize abs_mask1 with itself to silence a gcc warning Martin Kroeker 2021-09-15 22:10:43 +02:00
  • b751edf624 Fix unused variable warnings on Power Rafael Cardoso Fernandes Sousa 2021-09-15 13:36:07 -05:00
  • fa8bf57768 Merge pull request #3380 from martin-frbg/structwarn Martin Kroeker 2021-09-15 07:19:09 +02:00
  • 80346b8813 Merge pull request #3379 from martin-frbg/issue3369-2 Martin Kroeker 2021-09-15 07:18:57 +02:00
  • 13182b2801 Merge pull request #3378 from martin-frbg/issue3368-2 Martin Kroeker 2021-09-15 07:18:38 +02:00
  • dd09f0173e Remove extraneous qualifiers from struct definition Martin Kroeker 2021-09-14 21:52:26 +02:00
  • ce036a2fc0 Add casts Martin Kroeker 2021-09-14 21:41:53 +02:00
  • ddf106f769 Add dedicated entries for BFLOAT16 kernels Martin Kroeker 2021-09-14 16:17:18 +02:00
  • c35739db5e Add separate entries for BFLOAT16 functions and fix missing cblas_xerbla Martin Kroeker 2021-09-14 16:15:57 +02:00
  • 2f8220d757 Add sbgemm Martin Kroeker 2021-09-14 16:14:43 +02:00
  • 5f6a609253 Add sbgemv Martin Kroeker 2021-09-14 16:13:57 +02:00
  • e02df9fc55 Propagate BUILD_BFLOAT16 to CFLAGS Martin Kroeker 2021-09-14 16:12:27 +02:00
  • 1c0a8a714a Add defaults for SBGEMV kernels Martin Kroeker 2021-09-14 16:10:58 +02:00
  • 5e4f1e3677 Remove BFLOAT16 from the task list of GenerateNamedObject Martin Kroeker 2021-09-14 16:09:46 +02:00
  • af8843875a Merge pull request #3376 from martin-frbg/issue3370 Martin Kroeker 2021-09-12 00:01:31 +02:00
  • d1ee2e9c7d Merge pull request #3375 from martin-frbg/issue3369 Martin Kroeker 2021-09-12 00:01:20 +02:00
  • 0925dfe2c9 One instance of kernel_4x1 is used even on SKX Martin Kroeker 2021-09-11 15:30:19 +02:00
  • 1085775bc6 really remove the unused variable Martin Kroeker 2021-09-11 15:05:55 +02:00
  • 7d873a329f Add ifdefs around conditionally used functions Martin Kroeker 2021-09-11 14:38:47 +02:00
  • ef24712030 Move a conditionally used variable Martin Kroeker 2021-09-11 14:37:44 +02:00
  • 20581bf303 Remove unused variable Martin Kroeker 2021-09-11 14:36:27 +02:00
  • d17238599b Add casts Martin Kroeker 2021-09-11 13:38:28 +02:00
  • 3e8c448696 Merge pull request #3367 from RajalakshmiSR/makesyntax Martin Kroeker 2021-09-08 20:19:39 +02:00
  • 7f4aa106f2 Fixing syntax error in makefile Rajalakshmi Srinivasaraghavan 2021-09-08 07:04:13 -05:00
  • a6ed4f0d37 Merge pull request #3366 from martin-frbg/azure-ubuntu Martin Kroeker 2021-09-08 13:57:35 +02:00
  • b858e65476 migrate from deprecated ubuntu-16.04 vmImage Martin Kroeker 2021-09-08 10:51:59 +02:00
  • d3d6601727 Merge pull request #3365 from martin-frbg/travis-lx Martin Kroeker 2021-09-07 16:24:33 +02:00
  • da5bd8b5e3 Merge pull request #3364 from guowangy/bf16-cooperlake Martin Kroeker 2021-09-07 13:57:40 +02:00
  • 045ed5c91d sbgemm: fix build error in BFLOAT16 disabled Wangyang Guo 2021-09-07 23:37:08 +08:00
  • 4289cf048d sbgemm: avoid falling into SGEMM_KERNEL_DIRECT Wangyang Guo 2021-09-07 18:34:26 +08:00
  • 59a1114d03 sbgemm: cooperlake: tuning for small matrix Wangyang Guo 2021-09-07 18:12:40 +08:00
  • 682d66555d sbgemm: cooperlake: implement ncopy_16 Wangyang Guo 2021-08-20 22:01:00 +08:00
  • beccb83b16 sbgemm: cooperlake: add n24 kernel for tcopy_4 Wangyang Guo 2021-08-19 19:46:08 +08:00
  • 5fcacad32b sbgemm: cooperlake: implement tcopy_4 Wangyang Guo 2021-08-19 00:08:06 +08:00
  • bb1c4fa5bd sbgemm: cooperlake: prefetch A & B Wangyang Guo 2021-08-18 21:17:08 +08:00
  • 7a2d1601ec sbgemm: cooperlake: unroll core loop by 2 Wangyang Guo 2021-08-17 23:21:19 +08:00
  • 45fdf951b6 sbgemm: cooperlake: reorder ptr increase for performance Wangyang Guo 2021-08-17 22:08:24 +08:00
  • cece3541ab sbgemm: cooperlake: fix bug in m64n12 Wangyang Guo 2021-08-17 21:13:29 +08:00
  • 8356a604f0 sbgemm: cooperlake: tuning for block params Wangyang Guo 2021-08-17 19:35:40 +08:00
  • 9df0953cde sbgemm: cooperlake: kernel works for NN Wangyang Guo 2021-08-16 19:39:24 +08:00
  • 2ec9f3a8aa sbgemm: cooperlake: change kernel size to 16x4 Wangyang Guo 2021-08-12 01:46:49 +00:00
  • ef8f5fecc8 sbgemm: cooperlake: implement sbgemm_tcopy_32 Wangyang Guo 2021-08-10 06:14:45 +00:00
  • 4c294336e6 sbgemm: cooperlake: add dummy source files Wangyang Guo 2021-08-10 03:23:45 +00:00
  • 8c68b6f26d Update .travis.yml Martin Kroeker 2021-09-07 11:40:40 +02:00
  • 349fb4910b Disable the remaining x86_64 job on Travis Martin Kroeker 2021-09-07 11:19:51 +02:00
  • 7c72c45be6 Merge pull request #3363 from martin-frbg/fixpr3360 Martin Kroeker 2021-09-07 08:02:53 +02:00
  • 32fee86033 Correct misplaced ifdef lines Martin Kroeker 2021-09-06 23:44:20 +02:00
  • 72f3ce5f08 Add NO_AVX=1 fallbacks to newer generation x86_64 for completeness (#3360) Martin Kroeker 2021-09-05 20:35:48 +02:00
  • af19cda65a Add "recursive" option for IBM xlf compiler (#3359) Martin Kroeker 2021-09-04 18:26:59 +02:00
  • a3e80069fb Merge pull request #3355 from martin-frbg/smallgemmcr Martin Kroeker 2021-09-02 00:27:23 +02:00
  • f1e3305974 Add workaround for Windows10 macro name clash Martin Kroeker 2021-09-01 21:36:50 +02:00
  • 3cdfe33610 Merge pull request #3352 from martin-frbg/3321-2n Martin Kroeker 2021-09-01 13:52:40 +02:00
  • 47171e4b93 Merge pull request #3354 from nsait-linaro/fix_gmemm_align_win_arm Martin Kroeker 2021-08-31 21:47:21 +02:00
  • 7cddbf99b1 Make explicit conversion condition on _WIN64 flag Niyas Sait 2021-08-31 14:36:44 +01:00
  • d1ed72fa87 [win/arm64]: Explicit casting for GMEMM_DEFAULT_ALIGN to create 64-bit value Niyas Sait 2021-08-24 06:09:29 +01:00
  • 806221440b Merge pull request #3353 from guowangy/bf16-small-matrix-cooperlake Martin Kroeker 2021-08-30 20:39:51 +02:00
  • cd10d1c03b Fix typo Martin Kroeker 2021-08-30 14:38:28 +02:00
  • 2db1a99aca Clean up debug messages Martin Kroeker 2021-08-30 14:21:25 +02:00
  • 619588fbab sbgemm: remove unnecessary b0 files Wangyang Guo 2021-08-30 17:48:11 +08:00
  • f39301935c sbgemm: cooperlake: make sure hot buffer aligned to 64 Wangyang Guo 2021-08-13 18:43:41 +08:00
  • 2e44ca0136 sbgemm: add missing cblas_sbgemm definition Wangyang Guo 2021-08-13 00:51:24 +08:00
  • 7d27b182fc sbgemm: cooperlake: enable SBGEMM by small matrix path Wangyang Guo 2021-08-12 06:10:51 +00:00
  • 1d83ca4bca Small Matrix: support BFLOAT16 data type Wangyang Guo 2021-08-12 03:14:18 +00:00
  • bec9d9f63d Merge pull request #3335 from guowangy/small-matrix-latest Martin Kroeker 2021-08-29 22:33:33 +02:00
  • 89fc5b8f4f Fix unmap logic Martin Kroeker 2021-08-29 19:50:24 +02:00
  • 7fd12a5e69 Add likely() hints for gcc Martin Kroeker 2021-08-29 13:54:51 +02:00
  • 2ba9a567aa Fix typo Martin Kroeker 2021-08-28 17:14:59 +02:00
  • b4b952eece Add auxiliary tracking space for thread buffer frees too Martin Kroeker 2021-08-28 17:03:53 +02:00
  • 7d1becc575 Allocate an auxiliary struct when running out of preconfigured threads Martin Kroeker 2021-08-28 14:18:36 +02:00
  • 6bb1805ed6 Merge pull request #3348 from guowangy/skylakex-sgemv_t-fix Martin Kroeker 2021-08-25 22:43:45 +02:00
  • 0f0a0be95d Merge pull request #3345 from nsait-linaro/windows_on_arm64 Martin Kroeker 2021-08-25 15:49:55 +02:00
  • dbbb39199f sgemv: skylakex: fix build warning Wangyang Guo 2021-08-25 07:13:00 +00:00
  • e9acb46431 sgemv: skylakex: bug fix for sgemv_t kernel in corner case Wangyang Guo 2021-08-25 07:07:27 +00:00
  • c6c2a71fb7 Fix ctest.h to build using clang on windows Niyas Sait 2021-08-16 11:25:07 +01:00
  • cdb5d2737e add support for building on windows/arm64 target Niyas Sait 2021-08-16 11:22:51 +01:00
  • 13d411677f Add more OSX build jobs to Azure CI (#3338) Martin Kroeker 2021-08-15 00:17:23 +02:00
  • f9dba63c28 Small Matrix: skylakex: remove unnecessary b0 source files Wangyang Guo 2021-08-13 03:28:44 +00:00
  • 989e6bbdd3 Small Matrix: reduce generic kernel source files Wangyang Guo 2021-08-13 03:17:38 +00:00
  • 04255be948 Merge pull request #3344 from gxw-loongson/develop Martin Kroeker 2021-08-12 15:16:46 +02:00
  • a7bc8ec1f1 Delete the macro instruction "li" and use "li.d" instead gxw 2021-08-10 16:42:57 +08:00
  • 8cd2b32fef Merge pull request #3343 from cianciosa/develop Martin Kroeker 2021-08-12 01:28:18 +02:00
  • 4c766cd11f Fix a small syntax error. A ( was accidentally deleted. cianciosa 2021-08-11 12:08:34 -04:00
  • c28560129f Check the total number of arguments passed instead of if the ARGV# is defined. This fixes a problem when compiling openblas as a subproject of another code. cianciosa 2021-08-11 12:00:07 -04:00
  • b9e4fb206d Merge pull request #3341 from RajalakshmiSR/dasump10 Martin Kroeker 2021-08-11 09:39:10 +02:00
  • b06880c2cd POWER10: Improving dasum performance Rajalakshmi Srinivasaraghavan 2021-08-10 22:06:04 -05:00
  • cbc583eb54 Merge pull request #3336 from martin-frbg/traviscom Zhang Xianyi 2021-08-05 19:13:19 +08:00
  • e5ba7c3235 Disable all x86 jobs Martin Kroeker 2021-08-05 11:08:18 +02:00
  • 435d84a7ce Merge pull request #3332 from martin-frbg/travisbadge Martin Kroeker 2021-08-05 09:36:59 +02:00
  • 139f632ca4 Merge pull request #3334 from Guobing-Chen/BF16_gemm_full_kernel Martin Kroeker 2021-08-05 08:01:13 +02:00
  • c17d6dacb2 Small Matrix: skip compile in unimplemented data type Wangyang Guo 2021-08-05 05:46:13 +00:00
  • 44d0032f3b Small Matrix: skylakex: fix build error in old compiler Wangyang Guo 2021-08-05 04:43:47 +00:00
  • 5d86becdae Add all SBGEMM kernels for IA AVX512-BF16 based platforms Chen, Guobing 2021-08-05 11:11:14 +08:00
  • 76ea8db4da Small Matrix: enable by default for x86_64 arch Wangyang Guo 2021-08-05 02:57:58 +00:00