[[!tag standards]]

# mv.swizzle

Links

* the encoding embeds predication into the swizzle as well as
  constants 1/1.0 and 0/0.0

As a Scalar instruction, mv.swiz and fmv.swiz operate on four 32-bit
quantities, reducing this instruction to 2-in, 2-out pairs of 64-bit
registers:

| swizzle name | source | dest | half    |
|--------------|--------|------|---------|
| X            | RA     | RT   | lo-half |
| Y            | RA     | RT   | hi-half |
| Z            | RA+1   | RT+1 | lo-half |
| W            | RA+1   | RT+1 | hi-half |

When RA=RT (in-place swizzle) any

# Format

| 0.5 |6.10|11.15|16.27|28.31| name         |
|-----|----|-----|-----|-----|--------------|
|PO   | RT | RA  |imm  | 0011| mv.swiz      |
|PO   | RT | RA  |imm  | 1011| fmv.swiz     |

this gives a 12-bit immediate across bits 16 to 27, split into four
3-bit fields, one per destination position:

* 3 bits X
* 3 bits Y
* 3 bits Z
* 3 bits W

the options for each 3-bit field are:

* 0b000 to indicate "skip". this is equivalent to predicate masking
* 0b001 is not needed (reserved)
* 0b010 to indicate "constant 0" (or 0.0)
* 0b011 to indicate "constant 1" (or 1.0)
* 0b1NN index 0 thru 3 to copy from subelement in pos XYZW

Efforts to encode the 12-bit swizzle in fewer bits proved unsuccessful:
7^4 comes out to 2,401, which is larger than 2^11 (2,048) and so does
not fit in 11 bits. Note that 7 options are needed (not 6) because the
7th option allows predicate masking to be encoded within the swizzle
immediate. For example this allows "W..Y" to be specified: "copy W to
position X, and Y to position W, leave the other two positions Y and Z
unaltered".

# RM Mode Concept:

MVRM-2P-2S1D:

| Field Name   | Field bits | Description                          |
|--------------|------------|--------------------------------------|
| Rdest_EXTRA2 | `10:11`    | extends Rdest (R\*\_EXTRA2 Encoding) |
| Rsrc_EXTRA2  | `12:13`    | extends Rsrc (R\*\_EXTRA2 Encoding)  |
| src_SUBVL    | `14:15`    | SUBVL for Source                     |
| MASK_SRC     | `16:18`    | Execution Mask for Source            |

The inclusion of a separate src SUBVL would allow
`sv.mv.swiz RT.vecN RA.vecN` to mean either a contiguous sequential
copy or zip/unzip (pack/unpack).
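
# Illustrative sketch of the swizzle immediate

As an illustration of the 3-bit-per-position immediate described under
Format, here is a minimal Python sketch, not a reference implementation.
The names `encode_swizzle`, `decode_swizzle` and `apply_swizzle` are
hypothetical. Two things are assumptions that the text above does not
fix: the packing order (X selector in the most significant 3 bits of the
12-bit immediate) and the mapping of `0b1NN` indices 0 thru 3 onto
X, Y, Z, W in that order. The sketch works on plain four-element lists
and ignores the lo-half/hi-half register pairing.

```python
# Hypothetical sketch: field order inside the immediate is an assumption.
SKIP, RESERVED, CONST0, CONST1 = 0b000, 0b001, 0b010, 0b011
LANES = "XYZW"  # assumed: index 0..3 of 0b1NN maps to X, Y, Z, W


def encode_swizzle(spec):
    """Pack a 4-character swizzle spec into a 12-bit immediate.

    Characters: '.' skip, '0' constant 0, '1' constant 1,
    'X'/'Y'/'Z'/'W' copy from that source subelement.
    Assumption: the selector for position X is in the top 3 bits.
    """
    if len(spec) != 4:
        raise ValueError("swizzle spec must have exactly 4 characters")
    imm = 0
    for ch in spec:
        if ch == ".":
            sel = SKIP
        elif ch == "0":
            sel = CONST0
        elif ch == "1":
            sel = CONST1
        elif ch in LANES:
            sel = 0b100 | LANES.index(ch)
        else:
            raise ValueError(f"bad swizzle character {ch!r}")
        imm = (imm << 3) | sel
    return imm


def decode_swizzle(imm):
    """Split a 12-bit immediate into four 3-bit selectors (X, Y, Z, W)."""
    return [(imm >> shift) & 0b111 for shift in (9, 6, 3, 0)]


def apply_swizzle(imm, src, dest, one=1, zero=0):
    """Return a new 4-element destination after applying the swizzle.

    src and dest are lists of four subelements in X, Y, Z, W order.
    The result is a fresh copy of dest; neither input list is modified.
    Pass one=1.0, zero=0.0 to model the floating-point constants.
    """
    result = list(dest)
    for pos, sel in enumerate(decode_swizzle(imm)):
        if sel == SKIP:
            continue  # position left unaltered, like a masked-out element
        if sel == RESERVED:
            raise ValueError("selector 0b001 is reserved")
        if sel == CONST0:
            result[pos] = zero
        elif sel == CONST1:
            result[pos] = one
        else:
            result[pos] = src[sel & 0b11]  # 0b1NN: copy subelement NN
    return result


# The "W..Y" example: copy W to position X and Y to position W.
imm = encode_swizzle("W..Y")
print(f"{imm:012b}")                                        # 111000000101
print(apply_swizzle(imm, [10, 20, 30, 40], [1, 2, 3, 4]))   # [40, 2, 3, 20]

# Counting argument: 7 options per position need more than 11 bits.
print(7 ** 4, 2 ** 11)                                      # 2401 2048
```

Under these assumptions, "W..Y" leaves destination positions Y and Z
untouched, which is the predicate-masking behaviour that the seventh
("skip") option provides.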