From 7a6be2d94b9d3420330077852d838eb43c2694b0 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Fri, 12 Aug 2022 14:58:47 +0100
Subject: [PATCH]

---
 openpower/sv/svp64_quirks.mdwn | 35 ++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/openpower/sv/svp64_quirks.mdwn b/openpower/sv/svp64_quirks.mdwn
index 1ae1e7604..44b70792d 100644
--- a/openpower/sv/svp64_quirks.mdwn
+++ b/openpower/sv/svp64_quirks.mdwn
@@ -604,3 +604,38 @@ mapping (X to X, Y to Y...)
 
 By applying a straight linear swizzle map, the `RM-2P-1S1D-PU` mode of
 `sv.mv.swizzle` is available.
+
+It has, however, been decided to make one of the many pseudo-op aliases
+a Pack/Unpack variant: `sv.xori RT,RA,0`. This loses half the range
+of access of the Scalar regs (r0..r63) due to using an EXTRA2. It was
+felt to be better to do this to xori rather than ori or addi.
+
+Pack/Unpack has to be deployed across SVP64 sparingly (not so uniformly
+general) because it takes up two RM.EXTRA bits, putting
+pressure on developers by restricting the register range as above.
+
+# LD/ST with zero-immediate
+
+LD/ST operations with a zero immediate effectively mean that on a
+Vector operation the element index to offset the memory location is
+multiplied by zero. Thus, a sequence of LD operations will load from
+the exact same address, and likewise STs to the exact same address.
+
+Ordinarily this would make absolutely no sense whatsoever, except
+that Power ISA has cache-inhibited LD/STs, for accessing memory-mapped
+peripherals and other crucial uses. Thus, *despite not being a mapreduce mode*,
+zero-immediates cause multiple hits on the same element.
+
+Recall from above that mapreduce mode is not actually mapreduce at all: it is
+a relaxation of the normal rule, such that if the destination is a Scalar the
+Vector for-looping is not terminated on first write to the destination.
+Instead, the developer is expected to exploit the strict Program Order,
+make one of the sources the same as that Scalar destination, effectively
+making that Scalar register an "Accumulator", thus creating the *appearance*
+(effect) of Simple-V having a mapreduce capability, when in fact it is
+more of an artefact.
+
+LD/ST zero-immediate has quirky overwriting similar to the "mapreduce"
+mode, but actually requires the registers to be Vectors. It is simply
+a mathematical artefact of multiplying by zero, which happens to be
+useful for cache-inhibited operations.
-- 
2.30.2
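
The contrast the patch draws between zero-immediate LD/ST and "mapreduce" mode can be sketched as a small Python model. This is only an illustration under stated assumptions, not the SVP64 pseudocode: `Fifo`, `effective_addresses`, `vector_load` and `scalar_dest_add` are hypothetical names, the address is assumed to be computed as base plus element index times the immediate (as the added text describes), and the FIFO stands in for a cache-inhibited memory-mapped peripheral.

```python
from collections import deque


class Fifo:
    """Hypothetical memory-mapped peripheral (an assumption for this sketch):
    every read of its single address pops the next queued value."""

    def __init__(self, values):
        self.queue = deque(values)
        self.reads = 0

    def read(self):
        self.reads += 1
        return self.queue.popleft()


def effective_addresses(base, imm, vl):
    """Assumed addressing: element index multiplied by the immediate.
    With imm == 0 every element resolves to the same address."""
    return [base + i * imm for i in range(vl)]


def vector_load(read_at, regs, rt, base, imm, vl):
    """Vector LD: the destination is a Vector (rt, rt+1, ...), so each
    hit on the address lands in a different register."""
    for i, ea in enumerate(effective_addresses(base, imm, vl)):
        regs[rt + i] = read_at(ea)


def scalar_dest_add(regs, rt, ra, rb, vl):
    """'mapreduce' style add: rt and ra are Scalar, rb is a Vector.
    The loop is not stopped at the first write to rt, so with rt == ra
    the Scalar register behaves as an accumulator."""
    for i in range(vl):
        regs[rt] = regs[ra] + regs[rb + i]


FIFO_ADDR = 0x1000  # hypothetical peripheral address
fifo = Fifo([10, 20, 30, 40])


def read_at(ea):
    # Only the FIFO address is modelled in this sketch.
    assert ea == FIFO_ADDR
    return fifo.read()


regs = [0] * 32

# Zero immediate: four reads of the same address fill r8..r11.
vector_load(read_at, regs, rt=8, base=FIFO_ADDR, imm=0, vl=4)

# Scalar destination equal to one source: r3 accumulates r8..r11.
scalar_dest_add(regs, rt=3, ra=3, rb=8, vl=4)
```

In this model the zero immediate produces four reads of `FIFO_ADDR`, each stored to a separate Vector element, whereas the Scalar-destination add overwrites the same register on every iteration. That matches the distinction in the added text: both involve repeated hits, but only the LD/ST case requires the registers to be Vectors.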