From: lkcl
Date: Sun, 12 Jun 2022 12:31:17 +0000 (+0100)
Subject: (no commit message)
X-Git-Tag: opf_rfc_ls005_v1~1834
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=f897d2544eeef8f0dcafe63f51269dd78d74e4cc;p=libreriscv.git

---
diff --git a/openpower/sv/mv.swizzle.mdwn b/openpower/sv/mv.swizzle.mdwn
index 4095f02e5..182e9bea6 100644
--- a/openpower/sv/mv.swizzle.mdwn
+++ b/openpower/sv/mv.swizzle.mdwn
@@ -17,6 +17,47 @@ A compromise is to provide a Swizzle "Move".
 The encoding for this instruction embeds static predication into the
 swizzle as well as constants 1/1.0 and 0/0.0
 
+# Format
+
+| 0.5 |6.10|11.15|16.27|28.31| name |
+|-----|----|-----|-----|-----|--------------|
+|PO | RTp| RAp |imm | 0011| mv.swiz |
+|PO | RTp| RAp |imm | 1011| fmv.swiz |
+
+this gives a 12 bit immediate across bits 16 to 27.
+Each swizzle mnemonic (XYZW), commonly known from 3D GPU programming,
+has an associated index. 3 bits of the immediate are allocated
+to each:
+
+| imm |0.2 |3.5 |6.8|9.11|
+|-------|----|----|---|----|
+|swizzle|X | Y | Z | W |
+|index |0 | 1 | 2 | 3 |
+
+the options for each Swizzle are:
+
+* 0b000 to indicate "skip". this is equivalent to predicate masking
+* 0b001 is not needed (reserved)
+* 0b010 to indicate "constant 0"
+* 0b011 to indicate "constant 1" (or 1.0)
+* 0b1NN index 0 thru 3 to copy from subelement in pos XYZW
+
+Attempts to encode the 12 bit swizzle in fewer bits proved unsuccessful: 7^4 comes out to 2,401, which exceeds the 2,048 values that 11 bits can hold.
+
+Note that 7 options are needed (not 6) because the 7th option allows static
+predicate masking to be encoded within the swizzle immediate.
+For example this allows "W.Y." to specify: "copy W to position X,
+and Y to position Z, leave the other two positions Y and W unaltered"
+
+    0    1    2    3
+    X    Y    Z    W
+         |         |
+         +----+    |
+         |    |    |
+    +--------------+
+    |    |    |    |
+    W    Y    Y    W
+
 **As a Scalar instruction**
 
 Given that XYZW Swizzle can select simultaneously between one *and four*
@@ -36,11 +77,11 @@ registers:
 | Z | RA+1 | RT+1 | lo-half |
 | W | RA+1 | RT+1 | hi-half |
 
-When RA=RT (in-place swizzle) any portion of RT not covered by
+When `RA=RT` (in-place swizzle) any portion of RT not covered by
 the Swizzle is unmodified. For example a Swizzle of "..XY" will copy the
 contents RA+1 into RT but leave RT+1 unmodified.
 
-When RA!=RT any part of RT or RT+1 not set as a destination by
+When `RA!=RT` any part of RT or RT+1 not set as a destination by
 the Swizzle will be set to zero. A Swizzle of "..XY" would copy the
 contents RA+1 into RT, but set RT+1 to zero.
 
@@ -74,50 +115,10 @@ Horizontal-First Mode:
 *Implementor's note: the cost of Vertical-First Mode in an Embedded design
 of storing four 64-bit in-flight elements may be too high.
 If this is the
-case it is acceptable to throw an Illegal Instruction Trap.
+case it is acceptable to throw an Illegal Instruction Trap, and emulate
+the instruction in software. Performance will obviously be adversely affected.
 See [[sv/compliancy_levels]]*
 
-# Format
-
-| 0.5 |6.10|11.15|16.27|28.31| name |
-|-----|----|-----|-----|-----|--------------|
-|PO | RTp| RAp |imm | 0011| mv.swiz |
-|PO | RTp| RAp |imm | 1011| fmv.swiz |
-
-this gives a 12 bit immediate across bits 16 to 27.
-Each swizzle mnemonic (XYZW), commonly known from 3D GPU programming,
-has an associated index. 3 bits of the immediate are allocated
-to each:
-
-| imm |0.2 |3.5 |6.8|9.11|
-|-------|----|----|---|----|
-|swizzle|X | Y | Z | W |
-|index |0 | 1 | 2 | 3 |
-
-the options for each Swizzle are:
-
-* 0b000 to indicate "skip". this is equivalent to predicate masking
-* 0b001 is not needed (reserved)
-* 0b010 to indicate "constant 0"
-* 0b011 to indicate "constant 1" (or 1.0)
-* 0b1NN index 0 thru 3 to copy from subelement in pos XYZW
-
-Attempts to encode the 12 bit swizzle in fewer bits proved unsuccessful: 7^4 comes out to 2,401, which exceeds the 2,048 values that 11 bits can hold.
-
-Note that 7 options are needed (not 6) because the 7th option allows static
-predicate masking to be encoded within the swizzle immediate.
-For example this allows "W.Y." to specify: "copy W to position X,
-and Y to position Z, leave the other two positions Y and W unaltered"
-
-    0    1    2    3
-    X    Y    Z    W
-         |         |
-         +----+    |
-         |    |    |
-    +--------------+
-    |    |    |    |
-    W    Y    Y    W
-
 # RM Mode
 
 Concept: MVRM-2P-2S1D:
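
The per-position options listed in the Format section of this patch can be illustrated with a short Python sketch. It is not taken from the Libre-SOC sources: the function names `encode_swizzle` and `apply_swizzle` are hypothetical, and it assumes MSB0 bit numbering, so that the X field (imm bits 0.2) occupies the most significant three bits of the 12-bit immediate. The textual notation (`.` for skip, `0` and `1` for the constants) is likewise an assumption modelled on the "W.Y." example.

```python
# Illustrative sketch only; names and bit ordering are assumptions, not the spec.
SKIP, RESERVED, CONST0, CONST1 = 0b000, 0b001, 0b010, 0b011


def encode_swizzle(text):
    """Pack a four-character swizzle such as "W.Y." into a 12-bit immediate.

    '.' skips the position, '0'/'1' select the constants, and X/Y/Z/W
    select a source subelement (encoded as 0b1NN, NN = index 0..3).
    """
    if len(text) != 4:
        raise ValueError("a swizzle needs exactly four positions")
    imm = 0
    for ch in text:
        if ch == ".":
            field = SKIP
        elif ch == "0":
            field = CONST0
        elif ch == "1":
            field = CONST1
        elif ch in "XYZW":
            field = 0b100 | "XYZW".index(ch)
        else:
            raise ValueError(f"unknown swizzle character {ch!r}")
        # Assumed MSB0 layout: the first position (X) ends up in the top bits.
        imm = (imm << 3) | field
    return imm


def apply_swizzle(imm, src, dest):
    """Return a copy of dest with the swizzle applied, reading from src."""
    out = list(dest)
    for pos in range(4):
        field = (imm >> (3 * (3 - pos))) & 0b111
        if field == SKIP:
            continue  # static predicate mask: leave this position alone
        if field == RESERVED:
            raise ValueError("0b001 is reserved")
        if field == CONST0:
            out[pos] = 0
        elif field == CONST1:
            out[pos] = 1
        else:
            out[pos] = src[field & 0b11]  # 0b1NN: copy subelement NN
    return out


# In-place "W.Y.": X receives W, Z receives Y, positions Y and W are skipped.
vec = ["x", "y", "z", "w"]
result = apply_swizzle(encode_swizzle("W.Y."), vec, vec)
```

With an in-place vector `["x", "y", "z", "w"]`, the "W.Y." swizzle yields `["w", "y", "y", "w"]`, matching the diagram. Seven live options per position give 7^4 = 2,401 combinations, which is why the sketch keeps the full 3 bits per position rather than attempting an 11-bit packing.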