bit-level packing descriptions for crweird family of instructions

diff --git a/openpower/sv/cr_int_predication.mdwn b/openpower/sv/cr_int_predication.mdwn

index 9feb34d35356c6bb77bfb2b37dd59215f0711968..26958dd4ca2ae300e93f06badcdca4b4b6c5d4e6 100644
--- a/openpower/sv/cr_int_predication.mdwn
+++ b/openpower/sv/cr_int_predication.mdwn
@@ -89,11 +89,14 @@ OPF ISA WG):
  |0-5|6-10 |11|12-15|16-18|19-20|21-25  |26-30  |31|name      |
  |---|---- |--|-----|-----|-----|-----  |-----  |--|----      |
  |19 |RT   |  |mask |BFA  |     |XO[0:4]|XO[5:9]|/ |          |
-|19 |RT   |M |mask |BFA  | 0 0 |XO[0:4]|0 mode |Rc|crrweird  |
-|19 |RA   |M |mask |BF   | 0 1 |XO[0:4]|0 mode |/ |mtcrweird |
-|19 |BT   |M |mask |BFA  | 1 0 |XO[0:4]|0 mode |/ |crweirder |
-|19 |BF //|M |mask |BFA  | 1 1 |XO[0:4]|0 mode |0 |crweird   |
-|19 |BF //|M |mask |BFA  | 1 1 |XO[0:4]|0 mode |1 |mcrfm     |
+|19 |     |  |     |     |     |1 //// |00011  |  |rsvd      |
+|19 |RT   |M |mask |BFA  | 0 0 |1 mode |00011  |Rc|crrweird  |
+|19 |RT   |M |mask |BFA  | 0 1 |1 mode |00011  |Rc|mfcrweird |
+|19 |RA   |M |mask |BF   | 0 0 |0 mode |00011  |1 |mtcrrweird|
+|19 |RA   |M |mask |BF   | 0 1 |0 mode |00011  |0 |mtcrweird |
+|19 |BT   |M |mask |BFA  | 0 1 |0 mode |00011  |1 |crweirder |
+|19 |BF //|M |mask |BFA  | 1 1 |0 mode |00011  |0 |crweird   |
+|19 |BF //|M |mask |BFA  | 1 1 |0 mode |00011  |1 |mcrfm     |
  
  **crrweird**
  
@@ -117,6 +120,92 @@ When used with SVP64 Prefixing this is a [[openpower/sv/normal]]
  SVP64 type operation and as such can use Rc=1 and RC1 Data-dependent
  Mode capability
  
+Also as noted below, the element-width override bits normally used
+on the source are instead used to allow multiple results to be packed
+sequentially into the destination. *Destination elwidth overrides still apply*.
+
+When the destination elwidth is default (0b00) the following packing occurs
+into destination elements:
+
+- SVRM bits 6:7 equal to 0b00 - one result element packed into one bit of each
+  destination element (in the LSB)
+- SVRM bits 6:7 equal to 0b01 - two result elements packed into two bits of
+  destination element (in the bottom two LSBs)
+- SVRM bits 6:7 equal to 0b10 - four result elements packed into four bits of
+  destination element (in the bottom four LSBs)
+- SVRM bits 6:7 equal to 0b11 - eight result elements packed into eight bits of
+  destination element (in the bottom eight LSBs)
+
+When for example the destination elwidth is 8-bit (0b11) then the destination
+element widths are 8-bit, and the result elements (grouped up to 8) still fit
+neatly into each 8-bit destination element.
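
The single-bit packing listed above can be illustrated with a minimal Python sketch. It is not taken from the specification: the function name `pack_single_bit_results` is hypothetical, and the choice of placing the first result of each group in the least-significant bit is an assumption, since the text only says the results are packed sequentially into the bottom LSBs.

```python
def pack_single_bit_results(results, svrm_67):
    """Pack 1-bit results into destination elements (illustrative only).

    results : list of 0/1 values, one per result element
    svrm_67 : value of SVRM bits 6:7 (0..3), selecting 1, 2, 4 or 8
              results per destination element
    """
    per_element = 1 << svrm_67  # 0b00 -> 1, 0b01 -> 2, 0b10 -> 4, 0b11 -> 8
    packed = []
    for start in range(0, len(results), per_element):
        group = results[start:start + per_element]
        value = 0
        for position, bit in enumerate(group):
            # ASSUMPTION: the first result of a group lands in the LSB;
            # the text does not state the ordering within the element.
            value |= (bit & 1) << position
        packed.append(value)
    return packed


if __name__ == "__main__":
    bits = [1, 0, 1, 1, 0, 0, 0, 1]
    for svrm in range(4):
        print(svrm, [bin(v) for v in pack_single_bit_results(bits, svrm)])
```

With SVRM bits 6:7 at 0b11 every group of eight results occupies at most eight bits, which is why it still fits when the destination elwidth is overridden to 8-bit.
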
+
+**mfcrrweird**
+
+mode is encoded in XO and is 4 bits
+
+bit 19=0, bit 20=0
+
+    mfcrrweird: RT, BFA, mask.mode
+
+    creg = CR{BFA}
+    n0 = mask[0] & (mode[0] == creg[0])
+    n1 = mask[1] & (mode[1] == creg[1])
+    n2 = mask[2] & (mode[2] == creg[2])
+    n3 = mask[3] & (mode[3] == creg[3])
+    result = n0||n1||n2||n3
+    RT[60:63] = result # MSB0 numbering, 63 is LSB
+    If Rc:
+        CR0 = analyse(RT)
+
+When used with SVP64 Prefixing this is a [[openpower/sv/normal]]
+SVP64 type operation and as such can use Rc=1 and RC1 Data-dependent
+Mode capability.
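
The `mfcrrweird` pseudocode above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `mfcrrweird_result` is hypothetical, the 4-bit fields are passed as plain integers whose most-significant bit corresponds to index 0 (MSB0 numbering, as in the pseudocode), and the Rc=1 update of CR0 is left out.

```python
def mfcrrweird_result(creg, mask, mode):
    """Return the 4-bit value that the pseudocode writes to RT[60:63].

    creg, mask, mode are 4-bit integers. Index 0 in the pseudocode is
    the most-significant bit of each field (MSB0 numbering).
    """
    result = 0
    for i in range(4):
        shift = 3 - i  # MSB0 index i -> conventional shift amount
        cr_bit = (creg >> shift) & 1
        mask_bit = (mask >> shift) & 1
        mode_bit = (mode >> shift) & 1
        n = mask_bit & int(mode_bit == cr_bit)
        result = (result << 1) | n  # n0 || n1 || n2 || n3
    return result


if __name__ == "__main__":
    print(bin(mfcrrweird_result(0b1010, 0b1100, 0b1001)))
```

Because each bit is handled independently, the loop is equivalent to `mask & ~(mode ^ creg) & 0xF`; the explicit form is kept here to mirror the pseudocode line by line.
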
+
+Also as noted below, the element-width override bits normally used
+on the source are instead used to allow multiple results to be packed
+into the destination.  *Destination elwidth overrides still apply*.
+
+Unlike `crrweird`, however, the results are 4 bits wide, so the packing
+will begin to spill over into other destination elements.  8 results per
+destination at 4 bits each still fit into a destination elwidth of 32-bit,
+but for 16-bit and 8-bit they do not fit, and must be split
+across into the next element.
+
+When for example destination elwidth is 16-bit (0b10) the following packing
+occurs:
+
+- SVRM bits 6:7 equal to 0b00 - one 4-bit result element packed into the
+  first 4-bits of the 16-bit destination element (in the first 4 LSBs)
+- SVRM bits 6:7 equal to 0b01 - two 4-bit result elements packed into the
+  first 8-bits of the 16-bit destination element (in the first 8 LSBs)
+- SVRM bits 6:7 equal to 0b10 - four 4-bit result elements packed into each
+  16-bit destination element
+- SVRM bits 6:7 equal to 0b11 - eight 4-bit result elements, the first four
+  of which are packed into the first 16-bit destination element, the
+  second four of which are packed into the second 16-bit destination element.
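
The spill-over behaviour described above can be sketched in Python. This is a hedged illustration, not the specified behaviour: the function name `pack_4bit_results` is hypothetical, and placing the first result of a group in the lowest nibble is an assumption drawn from the "first 4 LSBs" wording.

```python
def pack_4bit_results(results, svrm_67, dest_width):
    """Pack 4-bit results into destination elements (illustrative only).

    results    : list of 4-bit values, one per result element
    svrm_67    : value of SVRM bits 6:7 (0..3): 1, 2, 4 or 8 results per group
    dest_width : destination element width in bits (8, 16, 32 or 64)
    """
    group_size = 1 << svrm_67
    per_element = dest_width // 4  # how many 4-bit results one element holds
    packed = []
    for start in range(0, len(results), group_size):
        group = results[start:start + group_size]
        # A group larger than one element spills into following elements.
        for sub in range(0, len(group), per_element):
            chunk = group[sub:sub + per_element]
            value = 0
            for position, nibble in enumerate(chunk):
                # ASSUMPTION: the first result goes into the lowest nibble.
                value |= (nibble & 0xF) << (4 * position)
            packed.append(value)
    return packed


if __name__ == "__main__":
    nibbles = [1, 2, 3, 4, 5, 6, 7, 8]
    print([hex(v) for v in pack_4bit_results(nibbles, 3, 16)])
    print([hex(v) for v in pack_4bit_results(nibbles, 3, 32)])
```

Under these assumptions, eight results with a 16-bit destination occupy two consecutive elements, while the same eight results fit in a single 32-bit element.
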
+
+**mtcrrweird**
+
+mode is encoded in XO and is 4 bits
+
+bit 19=0, bit 20=0
+
+    mtcrrweird: BF, RA, M, mask.mode
+
+    n0 = mask[0] & (mode[0] == RA[63])
+    n1 = mask[1] & (mode[1] == RA[62])
+    n2 = mask[2] & (mode[2] == RA[61])
+    n3 = mask[3] & (mode[3] == RA[60])
+    result = n0 || n1 || n2 || n3
+    if M:
+        result |= CR{BF} & ~mask
+    CR{BF} = result
+
+When used with SVP64 Prefixing this is a [[openpower/sv/normal]]
+SVP64 type operation and as such can use RC1 Data-dependent
+Mode capability.
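
The `mtcrrweird` pseudocode above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `mtcrrweird_result` is hypothetical, `mask`, `mode` and the CR field are 4-bit integers with index 0 as the most-significant bit, and `ra` is an ordinary integer whose least-significant bit is `RA[63]` in MSB0 numbering.

```python
def mtcrrweird_result(ra, cr_bf, mask, mode, m):
    """Return the new 4-bit value of CR{BF} according to the pseudocode.

    ra     : integer register value; RA[63] is its least-significant bit
    cr_bf  : current 4-bit value of CR{BF}
    mask   : 4-bit mask, index 0 is the most-significant bit
    mode   : 4-bit mode, index 0 is the most-significant bit
    m      : the M bit; when set, unmasked CR bits are preserved
    """
    result = 0
    for i in range(4):
        shift = 3 - i  # MSB0 index i of the 4-bit mask/mode fields
        ra_bit = (ra >> i) & 1  # RA[63 - i] in MSB0 numbering
        mask_bit = (mask >> shift) & 1
        mode_bit = (mode >> shift) & 1
        n = mask_bit & int(mode_bit == ra_bit)
        result = (result << 1) | n  # n0 || n1 || n2 || n3
    if m:
        result |= cr_bf & ~mask & 0xF  # keep CR bits outside the mask
    return result


if __name__ == "__main__":
    print(bin(mtcrrweird_result(0b0001, 0b0111, 0b1000, 0b1000, True)))
```

Note that `n0`, the most-significant result bit, is derived from `RA[63]`, the least-significant bit of RA, so the low four bits of RA are consumed in reversed order relative to the CR field.
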
+
  **mtcrweird**
  
  bit 19=0, bit 20=1