This patch makes use of new POWER10 vector pair instructions for loads and stores. Also reorganized all variants of copy functions to make use of same kernel.