Update Changelog with 0.3.12 changes

This commit is contained in:
Martin Kroeker 2020-10-24 12:50:04 +02:00 committed by GitHub
parent e1c18e4eeb
commit 89db73569b
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 29 additions and 3 deletions

View File

@ -1,9 +1,36 @@
OpenBLAS ChangeLog
====================================================================
Version 0.3.12
24-Oct-2020
common:
* Fixed missibg LAPACK functions (inadvertently dropped during
the build system restructuring)
* Fixed argument conversion macro in LAPACKE_zgesvdq (LAPACK #458)
POWER:
* Added optimized SCOPY/CCOPY kernels for POWER10
* Increased and unified the default size of the GEMM BUFFER
* Fixed building for POWER1ß in DYNAMIC_ARCH mode
* POWER10 compatibility test now checks binutils version as well
* Cleaned up compiler warnings
x86_64:
* corrected compiler version checks for AVX2 compatibility
* added compiler option -mavx2 for building with flang
* fixed direct SGEMM pathway for small matrix sizes (broken by
the code refactoring in 0.3.11)
* fixed unhandled partial register clobbers in several kernels
for AXPY,DOT,GEMV_N and GEMV_T flagged by gcc10 tree-vectorizer
ARMV8:
* improved Apple Vortex support to include cross-compiling
====================================================================
Version 0.3.11
17-Oct-2020
common:
common:
* API change:
the newly added BFLOAT16 functions were renamed to use the
letter "B" instead of "H" to avoid potential confusion with
@ -28,7 +55,7 @@ Version 0.3.11
* Makefile builds no longer misread NO_CBLAS=0 or NO_LAPACK=0 as
enabling these options
* Fixed detection of gfortran when invoked through an mpi wrapper
* Improve thread reinitialization performance with OpenMP xafter a fork
* Improve thread reinitialization performance with OpenMP after a fork
* Added support for building only the subset of the library required
for a particular precision by specifying BUILD_SINGLE, BUILD_DOUBLE
* Optional function name prefixes and suffixes are now correctly
@ -66,7 +93,6 @@ ARMV8:
* Fixed cpu detection on BSD-like systems
* Fixed compilation in -std=C18 mode
IBM Z:
* Added support for compiling with the clang compiler
* Improved GEMM performance on Z14