From aa3a2370fdfa3062f7dc1cf6af42dfeb50ed1b5f Mon Sep 17 00:00:00 2001
From: lkcl
Date: Mon, 12 Sep 2022 17:21:15 +0100
Subject: [PATCH]

---
 openpower/sv/ldst.mdwn | 36 +-----------------------------------
 1 file changed, 1 insertion(+), 35 deletions(-)

diff --git a/openpower/sv/ldst.mdwn b/openpower/sv/ldst.mdwn
index ab62a9cd6..e47cf6275 100644
--- a/openpower/sv/ldst.mdwn
+++ b/openpower/sv/ldst.mdwn
@@ -106,7 +106,7 @@ The table for [[sv/svp64]] for `immed(RA)` which is `RM.MODE`
 | 0-1 | 2 | 3 4 | description |
 | --- | --- |---------|--------------------------- |
 | 00 | 0 | zz els | simple mode |
-| 00 | 1 | zz els | Structured Pack/Unpack |
+| 00 | 1 | / / | reserved |
 | 01 | inv | CR-bit | Rc=1: ffirst CR sel |
 | 01 | inv | els RC1 | Rc=0: ffirst z/nonz |
 | 10 | N | zz els | sat mode: N=0/1 u/s |
@@ -385,24 +385,6 @@ Therefore, Fail-First LD/ST in Vertical-First is `UNDEFINED`.
 This is very different from Arithmetic (Data-dependent) FFirst
 where Vertical-First Mode is fully deterministic, not speculative.
 
-# LD/ST Pack/Unpack Mode
-
-As described in [[sv/normal]],
-Structured Pack/Unpack is similar to VSX `vpack` and `vunpack` except
-generalised not only to a Schedule to be applied to any operation but
-also extended to vec2/3/4.
-
-Just as in [[sv/normal]] operations,
-setting this mode changes the meaning of bits 4-5 in `RM` from being
-`ELWIDTH` to a pair of Pack/Unpack bits. Thus it is not possible
-to separately override source and destination elwidths at the same
-time as use Pack/Unpack: the `SRC_ELWIDTH` bits (6-7) must be used as
-both source and destination elwidth.
-
-Pack/Unpack only applies to LD/ST-immediate operations.
-See [[sv/svp64/appendix]] for details on how Pack/Unpack
-is implemented.
-
 # LOAD/STORE Elwidths
 
 Loads and Stores are almost unique in that the Power Scalar ISA
@@ -503,8 +485,6 @@ for clarity and simplicity:
     j++;
 
 Note above that the source elwidth is *not used at all* in LD-immediate.
-*(For Pack/Unpack Mode which shares the same source elwidth bits this
-is no great loss)*.
 
 For LD/Indexed, the key is that in the calculation of the Effective Address,
 RA has no elwidth override but RB does. Pseudocode below is simplified
@@ -571,17 +551,3 @@ Thus we do not need to provide specialist LD/ST "Structure Packed" opcodes
 because the generic abstracted concept of "Remapping", when applied to LD/ST, will
 give that same capability, with far more flexibility.
 
-Also LD/ST with immediate has a Pack/Unpack option similar to VSX
-'vpack' and 'vunpack', as well as the VSX Pixel instructions. Enabling
-this mode on SubVectors is straightforward and does not involve
-the setup cost of REMAP. Unlike REMAP, Pack/Unpack on LD/ST does not have
-Saturation (or Fail-first) at the same time.
-
-*Programmer's note: a decision on what is best if combining Saturation
-with Pack/Unpack is required will depend on resources. REMAP will
-require less registers but is more costly to set up. On the other
-hand LDST Pack/Unpack followed by Saturated MV or arithmetic requires
-intermediary registers at full width prior to reduced saturated width.
-A balanced decision is therefore needed*.
-
-
-- 
2.30.2