From ab27f411ebecac17a09ef2aafa02d4dcf5ea53de Mon Sep 17 00:00:00 2001
From: lkcl
Date: Thu, 14 Jul 2022 18:33:45 +0100
Subject: [PATCH]

---
 openpower/sv/remap.mdwn | 26 ++++++++------------------
 1 file changed, 8 insertions(+), 18 deletions(-)

diff --git a/openpower/sv/remap.mdwn b/openpower/sv/remap.mdwn
index 334430cdd..f92d79b9c 100644
--- a/openpower/sv/remap.mdwn
+++ b/openpower/sv/remap.mdwn
@@ -615,16 +615,14 @@ notes from conversations:
 
 Remapping of SUBVL (vec2/3/4) elements is not permitted: the vec2/3/4
 itself must be considered to be the "element". To perform REMAP
-on the elements of a vec2/3/4, use Swizzle, Indexing, or add one
+on the elements of a vec2/3/4, either use Swizzle, or,
+due to the sub-elements themselves being contiguous, treat them as
+such and use Indexing, or add one
 extra dimension to Matrix REMAP, the inner dimension being the size
 of the Subvector (2, 3, or 4).
 
-The reason for allowing SUBVL Remaps is that some regular patterns using
-Swizzle which would otherwise require multiple explicit instructions
-with 12 bit swizzles encoded in them may be efficently encoded with Remap
-instead. Not however that Swizzle is *still permitted to be applied*.
-
-An example where SUBVL Remap is appropriate is the Rijndael MixColumns
+Note that Swizzle on Sub-vectors may be applied on top of REMAP.
+Where this is appropriate is the Rijndael MixColumns
 stage:
 
 
@@ -646,18 +644,15 @@ void gmix_column(unsigned char *r) {
     unsigned char b[4];
     unsigned char c;
     unsigned char h;
-    // no swizzle here but still SUBVL.Remap
-    // can be done as vec4 byte-level
-    // elwidth overrides though.
+    // no swizzle here but vec4 byte-level
+    // elwidth overrides can be done though.
     for (c = 0; c < 4; c++) {
         a[c] = r[c];
         h = (unsigned char)((signed char)r[c] >> 7);
         b[c] = r[c] << 1;
         b[c] ^= 0x1B & h; /* Rijndael's Galois field */
     }
-    // SUBVL.Remap still needed here
-    // bytelevel elwidth overrides and vec4
-    // These may then each be 4x 8bit bit Swizzled
+    // These may then each be 4x 8bit Swizzled
     // r0.vec4 = b.vec4
     // r0.vec4 ^= a.vec4.WXYZ
     // r0.vec4 ^= a.vec4.ZWXY
@@ -669,11 +664,6 @@ void gmix_column(unsigned char *r) {
 }
 ```
 
-With the assumption made by the above code that the column bytes have
-already been turned around (vertical rather than horizontal) SUBVL.REMAP
-may transparently fill that role, in-place, without a complex byte-level
-mv operation.
-
 The application of the swizzles allows the remapped vec4 a, b and r
 variables to perform four straight linear 32 bit XOR operations where a
 scalar processor would be required to perform 16 byte-level individual
-- 
2.30.2