Commit Graph

  • 95dbeff66d
    Merge branch 'release-0.3.0' into develop Martin Kroeker 2020-06-14 22:02:45 +0200
  • 3b673a24b7
    Increment version to 0.3.10.dev Martin Kroeker 2020-06-14 21:57:52 +0200
  • 1eb1979050
    Increment version to 0.3.10.dev Martin Kroeker 2020-06-14 21:57:15 +0200
  • efc53b6e7e
    Merge pull request #2665 from martin-frbg/flang-fixes-2a Martin Kroeker 2020-06-14 21:56:08 +0200
  • 72888497e2
    Update with 0.3.10 changes Martin Kroeker 2020-06-14 21:55:31 +0200
  • 7e3e006af6
    Merge pull request #2666 from martin-frbg/blastest Martin Kroeker 2020-06-14 18:28:37 +0200
  • d906d14402
    Merge pull request #2664 from ACSimon33/exported_symbols Martin Kroeker 2020-06-14 18:27:03 +0200
  • 3785c0e82b
    Merge pull request #2663 from martin-frbg/issue2654 Martin Kroeker 2020-06-14 18:26:43 +0200
  • f2d8879af6
    Merge pull request #2661 from martin-frbg/issue2660 Martin Kroeker 2020-06-14 18:25:37 +0200
  • 6876221cf3
    Remove optimization level limit for flang again and add -fno-unroll-loops for AOCC flang 2.x instead Martin Kroeker 2020-06-14 17:40:24 +0200
  • 79cdcde717
    Re-enable higher optimization levels for flang while disabling loop unrolling for AOCC flang Martin Kroeker 2020-06-14 17:18:16 +0200
  • 18a11137f1
    Update BLAS tests to correspond to Reference-LAPACK 3.9.0 Martin Kroeker 2020-06-14 10:26:25 +0200
  • 1dd712131e
    Fix spelling of flang option -Mrecursive and add -Kieee Martin Kroeker 2020-06-14 00:09:31 +0200
  • 0ed2adf0b2
    Fix spelling of flang option -Mrecursive and add -Kieee Martin Kroeker 2020-06-14 00:01:20 +0200
  • abf670757b
    Respect predefined defaults for AR, AS, LD and RANLIB Martin Kroeker 2020-06-13 23:21:13 +0200
  • 41fc6f3cd2
    Added missing exported symbols. Simon Märtens 2020-06-13 22:37:39 +0200
  • c90c528eeb
    Force flang optimization level to -O0 and correct spelling of -Mrecursive Martin Kroeker 2020-06-13 19:41:49 +0200
  • f132b05de1
    Force flang optimization level to -O0 to work around failures in ctest and lapack-test Martin Kroeker 2020-06-13 19:36:01 +0200
  • f6ccca344d
    Correct flang option to -Mrecursive Martin Kroeker 2020-06-13 19:32:54 +0200
  • 007d9f97d7
    Make gotoblas_corename report the name of the selected TARGET rather than its aliases Martin Kroeker 2020-06-13 19:25:28 +0200
  • 63d26090f5
    Merge pull request #64 from xianyi/develop Martin Kroeker 2020-06-13 19:14:47 +0200
  • 9fe930f205
    powerpc: Add support for future processor Rajalakshmi Srinivasaraghavan 2020-06-11 15:47:20 -0500
  • 3a1b58d54a
    Merge pull request #2653 from craft-zhang/cortex-a53 Martin Kroeker 2020-06-10 12:19:33 +0200
  • f7659be4a0
    Merge pull request #2652 from martin-frbg/flang-fixes Martin Kroeker 2020-06-09 20:31:06 +0200
  • bc6fd20a40
    fix INIT8x4 ZhangDanfeng 2020-06-10 01:01:16 +0800
  • 3ce469a34f
    Limit optimization level to O1 for flang and add -frecursive Martin Kroeker 2020-06-09 16:11:13 +0200
  • ba2c5b404d
    When building with flang, use it also for the final link step to get dependencies right Martin Kroeker 2020-06-09 16:09:34 +0200
  • f07a80354b
    Apply previously AOCC-specific workaround to all versions of flang Martin Kroeker 2020-06-09 16:07:03 +0200
  • fdd1b50263
    Merge pull request #63 from xianyi/develop Martin Kroeker 2020-06-09 15:54:30 +0200
  • b98923f33a
    Test enforce -O1 for flang Leonard Lausen 2020-06-09 06:54:42 +0000
  • 4cb1db0e3b
    Test flang build Leonard Lausen 2020-06-09 06:25:45 +0000
  • 430e8b45fe
    Merge pull request #2648 from martin-frbg/lapack411 Martin Kroeker 2020-06-07 19:45:52 +0200
  • 88fe85f4e0
    Merge pull request #2647 from martin-frbg/aocc-flang Martin Kroeker 2020-06-07 19:45:11 +0200
  • 89091e6b64
    Merge pull request #2645 from martin-frbg/misc_fixes Martin Kroeker 2020-06-07 19:44:50 +0200
  • 522aaf53bf
    Break out of potentially infinite rescaling loop in LAPACK xLARGV/xLARTG/xLARTGP Martin Kroeker 2020-06-07 14:30:20 +0200
  • c3574ffe53
    Merge pull request #2646 from wjc404/develop Martin Kroeker 2020-06-07 13:18:22 +0200
  • 4e28dc6353
    Use only -O1 with AMD AOCC version of flang Martin Kroeker 2020-06-07 00:05:02 +0200
  • 13c28889a2
    Update "cosmetic fixes for non-C99 compilers" Martin Kroeker 2020-06-06 15:22:27 +0200
  • 0e3ac4a06b
    Add files via upload wjc404 2020-06-06 14:56:57 +0800
  • 28915eed72
    Cosmetic fixes for non-C99 compilers Martin Kroeker 2020-06-05 10:05:34 +0200
  • 7f60fb6b91
    Delete spurious copy of common_param.h Martin Kroeker 2020-06-05 10:04:16 +0200
  • 0464e662ad
    make blas_quickdivide unsigned and guard against miscompilation Martin Kroeker 2020-06-05 10:03:36 +0200
  • 0f9a935a5a
    Merge pull request #62 from xianyi/develop Martin Kroeker 2020-06-05 09:51:06 +0200
  • 79cd69fea4
    Merge pull request #2644 from martin-frbg/cmake-maxstack Martin Kroeker 2020-06-05 08:33:48 +0200
  • bb12c2c854
    Limit MAX_STACK_ALLOC availability to non-Windows Martin Kroeker 2020-06-04 19:07:27 +0200
  • 32c1c1e125
    Update azure-pipelines.yml Martin Kroeker 2020-06-04 19:03:46 +0200
  • f1953b8b81
    Update azure-pipelines.yml Martin Kroeker 2020-06-04 17:58:13 +0200
  • 6e97df7b47
    Add CMAKE support for MAX_STACK_ALLOC setting Martin Kroeker 2020-06-04 14:45:31 +0200
  • 729303e5ed
    Merge pull request #2643 from craft-zhang/cortex-a53 Martin Kroeker 2020-06-04 07:58:45 +0200
  • 547965530f
    Merge pull request #2638 from leezu/actions Martin Kroeker 2020-06-04 00:02:37 +0200
  • 9b7877ccf1
    sgemm copy source init ZhangDanfeng 2020-06-04 02:09:38 +0800
  • f82fa802d1
    Insert prefetch ZhangDanfeng 2020-06-04 02:08:48 +0800
  • 3eda3d34c3
    Merge pull request #2641 from martin-frbg/ppcg4 Martin Kroeker 2020-06-03 16:43:46 +0200
  • a8f42ae85c
    set cmake build type to Release Martin Kroeker 2020-06-03 15:28:59 +0200
  • e6e2e531bc
    revert clang pragma Martin Kroeker 2020-06-03 15:16:27 +0200
  • 456dc04441
    Update sgemm_kernel_16x4_skylakex_3.c Martin Kroeker 2020-06-03 15:15:41 +0200
  • 89323458a9
    preset optimization level for apple clang Martin Kroeker 2020-06-03 15:07:25 +0200
  • e153bdeb70
    Update dynamic_arch.yml Martin Kroeker 2020-06-03 13:46:43 +0200
  • c2001f7756
    Make cmake build verbose to see options in use Martin Kroeker 2020-06-03 12:18:15 +0200
  • c2b3f0b3f6
    Revert "keep Apple Clang from optimizing this" Martin Kroeker 2020-06-03 10:22:15 +0200
  • f16e39554d
    Change PPCG4 CGEMM_M to match kernel change Martin Kroeker 2020-06-03 09:15:29 +0200
  • b1ee81228a
    Change complex DOT and ROT to generic kernels and switch CGEMM Martin Kroeker 2020-06-03 09:13:29 +0200
  • 9f7358d7dc
    Keep Apple Clang from optimizing this Martin Kroeker 2020-06-03 08:52:53 +0200
  • 54fa90fb25
    Keep apple clang 11.0.3 from trying to optimize this (and running out of registers) Martin Kroeker 2020-06-02 17:31:45 +0200
  • 5a709b8340
    Print CPU info in output Leonard Lausen 2020-06-01 20:51:11 +0000
  • b31a68b835
    Add Github Actions test for DYNAMIC_ARCH builds Leonard Lausen 2020-05-31 01:17:05 +0000
  • 86552bf4c7
    Update f_check Martin Kroeker 2020-05-31 15:22:12 +0200
  • a349d48d89
    Merge pull request #2636 from martin-frbg/issue2634 Martin Kroeker 2020-05-31 15:16:09 +0200
  • 4db00121dc
    Disable EXPRECISION and add -lm on OSX (same as the BSDs and Linux) Martin Kroeker 2020-05-31 12:39:36 +0200
  • 909897f13b
    Document option USE_LOCKING Martin Kroeker 2020-05-31 12:37:57 +0200
  • e79245acd9
    Merge pull request #2635 from ilayn/patch-1 Martin Kroeker 2020-05-30 14:37:12 +0200
  • 76d2612e0c
    BUG: Fix the loop range in ZHEEQUB.f Ilhan Polat 2020-05-30 14:11:11 +0200
  • ced49466f0
    Use the fortran compiler to link LAPACK-related benchmarks Martin Kroeker 2020-05-29 13:35:51 +0200
  • 6e270f91ec
    add support for RETURN_BY_STACK semantics, e.g. clang Martin Kroeker 2020-05-29 13:29:10 +0200
  • 200296b0f4
    remove libomp from link list only for pgfortran Martin Kroeker 2020-05-29 13:23:51 +0200
  • dd7a650792
    Merge pull request #59 from xianyi/develop Martin Kroeker 2020-05-29 13:06:25 +0200
  • 4a4c50a7ce
    Merge pull request #2627 from pkubaj/patch-1 Martin Kroeker 2020-05-26 08:36:24 +0200
  • d069780e63
    Merge pull request #2626 from docularxu/working-gcc-version-detections Martin Kroeker 2020-05-26 08:35:58 +0200
  • 33c8790603
    Add powerpc (32-bit) pkubaj 2020-05-25 13:14:09 +0200
  • 06387ac0e6
    make GCC version detection OS-independent Guodong Xu 2020-05-25 10:40:12 +0000
  • f1a18d245b
    Merge pull request #2618 from craft-zhang/cortex-A53 Martin Kroeker 2020-05-25 12:14:46 +0200
  • 2a3aa91354
    update CONTRIBUTORS.md, adding myself 张丹枫 2020-05-20 22:35:26 +0800
  • ea5bdc3f72
    split cortex-a53 param to match 8x8 kernel 张丹枫 2020-05-20 22:34:47 +0800
  • 9df79ae9a3
    update sgemm and strmm kernel selecting strategy 张丹枫 2020-05-20 21:57:12 +0800
  • a1fc6041cd
    use general register to speedup 张丹枫 2020-05-20 21:55:32 +0800
  • edb423d772
    align general register using to strmm_kernel_8x8 张丹枫 2020-05-20 21:52:49 +0800
  • 0e6eb8c247
    sgemm kernel use sgemm_kernel_8x8_cortexa53 zhangdanfeng 2020-05-18 16:51:33 +0800
  • d475db29c6
    optimized for cortex-a53 zhangdanfeng 2020-05-18 16:47:33 +0800
  • 729ac6bd4a
    Merge pull request #2623 from mhillenibm/zarch_dgemm_z14 Martin Kroeker 2020-05-20 14:51:04 +0200
  • 89fe17f20e
    s390x: Use new sgemm kernel also for DGEMM and DTRMM on Z14 Marius Hillenbrand 2020-05-19 14:56:34 +0200
  • bdd795ed03
    s390x/GEMM: replace 0-init with peeled first iteration Marius Hillenbrand 2020-05-19 14:30:44 +0200
  • e1038ea836
    Merge pull request #2622 from martin-frbg/issue2619 Martin Kroeker 2020-05-19 23:07:22 +0200
  • 6baa9a778d
    Improve declaration of LAPACKE_get_nancheck Martin Kroeker 2020-05-19 17:59:31 +0200
  • cf46c9f84e
    Merge pull request #2617 from martin-frbg/issue2616 Martin Kroeker 2020-05-18 13:23:58 +0200
  • 55602fce56
    Ignore spurious all-numeric library names derived from mishandled jobserver flags Martin Kroeker 2020-05-17 15:28:14 +0200
  • 3d5e159e7a
    Ignore spurious all-numeric library names derived from mishandled jobserver flags Martin Kroeker 2020-05-17 15:26:57 +0200
  • 2931feb575
    Merge pull request #58 from xianyi/develop Martin Kroeker 2020-05-17 15:23:32 +0200
  • 20245ded5f
    Merge pull request #2615 from mhillenibm/z14_alignment_hints Martin Kroeker 2020-05-14 21:06:34 +0200
  • 2840432e49
    s390x: improvise vector alignment hints for older compilers Marius Hillenbrand 2020-05-13 17:48:50 +0200
  • ea78106c71
    Merge pull request #2614 from mhillenibm/gemm_vec_z14 Martin Kroeker 2020-05-13 15:09:23 +0200