From 4e2a8a70676d61294105d2948eb101f337cd72af Mon Sep 17 00:00:00 2001
From: lkcl
Date: Thu, 19 May 2022 17:42:18 +0100
Subject: [PATCH]

---
 openpower/sv/cr_int_predication.mdwn | 78 +++++++++++++++-------------
 1 file changed, 41 insertions(+), 37 deletions(-)

diff --git a/openpower/sv/cr_int_predication.mdwn b/openpower/sv/cr_int_predication.mdwn
index 26958dd4c..0a42edb63 100644
--- a/openpower/sv/cr_int_predication.mdwn
+++ b/openpower/sv/cr_int_predication.mdwn
@@ -122,23 +122,7 @@ Mode capability
 
 Also as noted below, element-width override bits normally used
 on the source is instead used to allow multiple results to be packed
-sequentially into the destination. *Destination elwidth overrides still apply*
-
-When the destination elwidth is default (0b00) the following packing occurs
-into destination elements:
-
-- SVRM bits 6:7 equal to 0b00 - one result element packed into one bit of each
-  destination element (in the LSB)
-- SVRM bits 6:7 equal to 0b01 - two result elements packed into two bits of
-  destination element (in the bottom two LSBs)
-- SVRM bits 6:7 equal to 0b10 - four result elements packed into four bits of
-  destination element (in the bottom four LSBs)
-- SVRM bits 6:7 equal to 0b11 - eight result elements packed into four bits of
-  destination element (in the bottom four LSBs)
-
-When for example the destination elwidth is 8-bit (0b11) then the destination
-element widths are 8-bit, and the result elements (grouped up to 8) still fit
-neatly into each 8-bit destination element.
+sequentially into the destination. *Destination elwidth overrides still apply*.
 
 **mfcrrweird**
 
@@ -166,25 +150,6 @@ Also as noted below, element-width override bits normally used
 on the source is instead used to allow multiple results to be packed
 into the destination. *Destination elwidth overrides still apply*
 
-Unlike `crrweird` however, the results are 4-bit wide, so the packing
-will begin to spill over to other destination elements. 8 results per
-destination at 4-bits each still fits into destination elwidth at 32-bit,
-but for 16-bit and 8-bit obviously this does not fit, and must split
-across to the next element
-
-When for example destination elwidth is 16-bit (0b10) the following packing
-occurs:
-
-- SVRM bits 6:7 equal to 0b00 - one 4-bit result element packed into the
-  first 4-bits of the 16-bit destination element (in the first 4 LSBs)
-- SVRM bits 6:7 equal to 0b01 - two 4-bit result elements packed into the
-  first 8-bits of the 16-bit destination element (in the first 8 LSBs)
-- SVRM bits 6:7 equal to 0b10 - four 4-bit result elements packed into each
-  16-bit destination element
-- SVRM bits 6:7 equal to 0b11 - eight 4-bit result elements, the first four
-  of which are packed into the first 16-bit destination element, the
-  second four of which are packed into the second 16-bit destination element.
-
 **mtcrrweird**
 
 mode is encoded in XO and is 4 bits
@@ -331,7 +296,7 @@ bits within the Integer element set to zero) whilst the INT (dest operand)
 elwidth field still sets the Integer element size as usual
 (8/16/32/default)
 
-    crrweird: RT, BB, mask.mode
+**crrweird: RT, BB, mask.mode**
 
     for i in range(VL):
         if BB.isvec:
@@ -376,6 +341,45 @@ Note that:
   of the INT Elements, the packing arrangement depending on both
   elwidth override settings.
 
+**mfcrrweird: RT, BFA, mask.mode**
+
+Unlike `crrweird` the results are 4-bit wide, so the packing
+will begin to spill over to other destination elements. 8 results per
+destination at 4-bits each still fits into destination elwidth at 32-bit,
+but for 16-bit and 8-bit obviously this does not fit, and must split
+across to the next element
+
+When for example destination elwidth is 16-bit (0b10) the following packing
+occurs:
+
+- SVRM bits 6:7 equal to 0b00 - one 4-bit result element packed into the
+  first 4-bits of the 16-bit destination element (in the first 4 LSBs)
+- SVRM bits 6:7 equal to 0b01 - two 4-bit result elements packed into the
+  first 8-bits of the 16-bit destination element (in the first 8 LSBs)
+- SVRM bits 6:7 equal to 0b10 - four 4-bit result elements packed into each
+  16-bit destination element
+- SVRM bits 6:7 equal to 0b11 - eight 4-bit result elements, the first four
+  of which are packed into the first 16-bit destination element, the
+  second four of which are packed into the second 16-bit destination element.
+
+Pseudocode example (dest elwidth overrides not included):
+
+    for i in range(VL):
+        if BB.isvec:
+            creg = CR{BB+i}
+        else:
+            creg = CR{BB}
+        n0 = mask[0] & (mode[0] == creg[0])
+        n1 = mask[1] & (mode[1] == creg[1])
+        n2 = mask[2] & (mode[2] == creg[2])
+        n3 = mask[3] & (mode[3] == creg[3])
+        result = n0||n1||n2||n3 # 4-bit result
+        RT[60:63] = result # MSB0 numbering, 63 is LSB
+        If Rc:
+            CR0 = analyse(RT)
+
+
+
 # v3.1 setbc instructions
 
 There are additional setb conditional instructions in v3.1 (p129)
-- 
2.30.2
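
Reviewer note, not part of the patch: the `mfcrrweird` text added above describes 4-bit results spilling across destination elements. The following is a minimal Python sketch of that arithmetic, written only to sanity-check the wording. The function names (`mfcrrweird_result`, `dest_elements_needed`, `pack_results`) are hypothetical, and the nibble order inside a destination element (first result in the lowest 4 bits) is an assumption that the patch text does not state explicitly.

```python
def mfcrrweird_result(creg, mask, mode):
    """Compute the 4-bit result n0||n1||n2||n3 from the patch's pseudocode.

    creg, mask and mode are 4-bit integers. Bit 0 in MSB0 numbering is
    the most significant of the four bits, as in the pseudocode.
    """
    result = 0
    for k in range(4):
        shift = 3 - k                      # MSB0 bit k -> LSB0 shift amount
        c = (creg >> shift) & 1
        m = (mask >> shift) & 1
        d = (mode >> shift) & 1
        n = m & (1 if d == c else 0)       # n_k = mask[k] & (mode[k] == creg[k])
        result = (result << 1) | n         # concatenate, n0 ends up as the MSB
    return result


def dest_elements_needed(svrm_6_7, dest_elwidth_bits):
    """Number of destination elements one group of results occupies.

    svrm_6_7 is the 2-bit SVRM field (0b00..0b11) selecting 1, 2, 4 or 8
    results per group; each result is 4 bits wide.
    """
    total_bits = 4 * (1 << svrm_6_7)
    return -(-total_bits // dest_elwidth_bits)   # ceiling division


def pack_results(results, dest_elwidth_bits):
    """Pack 4-bit results into destination elements of the given width.

    ASSUMPTION: the first result goes in the lowest nibble of each element.
    """
    per_elem = dest_elwidth_bits // 4
    elems = []
    for start in range(0, len(results), per_elem):
        value = 0
        for pos, r in enumerate(results[start:start + per_elem]):
            value |= (r & 0xF) << (4 * pos)
        elems.append(value)
    return elems
```

Under these assumptions, eight results at 16-bit destination elwidth occupy two elements (`dest_elements_needed(0b11, 16)`), while the same eight results fit in a single 32-bit element, which agrees with the bullet list in the added text.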