* normal
* fail-first (where Vector Indexed is banned)
* Signed Effective Address computation (Vector Indexed only)
+* Pack/Unpack (on LD/ST immediate operations only)
More than that however it is necessary to fit the usual Vector ISA
capabilities onto both Power ISA LD/ST with immediate and to
| 0-1 | 2 | 3 4 | description |
| --- | --- |---------|--------------------------- |
| 00 | 0 | zz els | normal mode |
-| 00 | 1 | rsvd | reserved |
+| 00 | 1 | zz els | Structured Pack/Unpack |
| 01 | inv | CR-bit | Rc=1: ffirst CR sel |
| 01 | inv | els RC1 | Rc=0: ffirst z/nonz |
| 10 | N | zz els | sat mode: N=0/1 u/s |
This is very different from Arithmetic (Data-dependent) FFirst
where Vertical-First Mode is fully deterministic, not speculative.
+# LD/ST Pack/Unpack Mode
+
+As described in [[sv/normal]],
+Structured Pack/Unpack is similar to VSX `vpack` and `vunpack`, except
+that it is not only generalised into a Schedule that may be applied to
+any operation but is also extended to vec2/3/4.
+
+Just as in [[sv/normal]] operations,
+setting this mode changes the meaning of bits 4-5 in `RM` from being
+`ELWIDTH` to a pair of Pack/Unpack bits. Thus it is not possible
+to separately override source and destination elwidths at the same
+time as using Pack/Unpack: the `SRC_ELWIDTH` bits (6-7) must be used as
+both source and destination elwidth.
+
+Pack/Unpack only applies to LD/ST-immediate operations.
+See [[sv/svp64/appendix]] for details on how Pack/Unpack
+is implemented.
+
# LOAD/STORE Elwidths <a name="elwidth"></a>
Loads and Stores are almost unique in that the OpenPOWER Scalar ISA
*three* widths involved:
* operation width (lb=8, lh=16, lw=32, ld=64)
-* src elelent width override
-* destination element width override
+* src element width override (8/16/32/default)
+* destination element width override (8/16/32/default)
Some care is therefore needed to express and make clear the transformations,
which are expressly in this order:
augmentation. This is primarily down to quirks surrounding LE/BE and
byte-reversal in OpenPOWER.
-It is unfortunately possible to request an elwidth override on the memory side which
-does not mesh with the operation width: these result in `UNDEFINED`
+It is unfortunately possible to request an elwidth override
+on the memory side which
+does not mesh with the overridden operation width: these result in
+`UNDEFINED`
behaviour. The reason is that the effect of attempting a 64-bit `sv.ld`
operation with a source elwidth override of 8/16/32 would result in
overlapping memory requests, particularly on unit and element strided