From 936f896108abd7be3c1589de0127463692000537 Mon Sep 17 00:00:00 2001
From: lkcl
Date: Tue, 29 Mar 2022 00:51:04 +0100
Subject: [PATCH]

---
 openpower/sv/cr_int_predication.mdwn | 59 ++++------------------------
 1 file changed, 8 insertions(+), 51 deletions(-)

diff --git a/openpower/sv/cr_int_predication.mdwn b/openpower/sv/cr_int_predication.mdwn
index f52622b80..4ba18ea8d 100644
--- a/openpower/sv/cr_int_predication.mdwn
+++ b/openpower/sv/cr_int_predication.mdwn
@@ -51,62 +51,19 @@ this gets particularly powerful if data-dependent predication is also enabled.

 # Bit ordering.

-IBM chose MSB0 for the OpenPOWER v3.0B specification. This makes things slightly hair-raising. Our desire initially is therefore to follow the logical progression from the defined behaviour of `mtcr` and `mfcr` etc.
-In [[isa/sprset]] we see the pseudocode for `mtcrf` for example:
+IBM chose MSB0 for the OpenPOWER v3.0B specification. This makes things slightly hair-raising and the relationship between the CR and the CR Field
+numbers is not clearly defined. To make it clear we define a new
+term, `CR{n}`.
+`CR{n}` refers to `CR0` when `n=0` and consequently, for CR0-7, is defined, in v3.0B pseudocode, as:

-    mtcrf FXM,RS
-
-    do n = 0 to 7
-      if FXM[n] = 1 then
-        CR[4*n+32:4*n+35] <- (RS)[4*n+32:4*n+35]
-This places (according to a mask schedule) `CR0` into MSB0-numbered bits 32-35 of the target Integer register `RS`, these bits of `RS` being the 31st down to the 28th. Unfortunately, even when not Vectorised, this inserts CR numbering inversions on each batch of 8 CRs, massively complicating matters. Predication when using CRs would have to be morphed to this (unacceptably complex) behaviour:
-
-    for i in range(VL):
-        if INTpredmode:
-            predbit = (r3)[63-i] # IBM MSB0 spec sigh
-        else:
-            # completely incomprehensible vertical numbering
-            n = (7-(i%8)) | (i & ~0x7) # total mess
-            CRpredicate = CR{n} # select CR0, CR1, ....
-            predbit = CRpredicate[offs] # select eq..ov bit
-
-Which is nowhere close to matching the straightforward obvious case:
-
-    for i in range(VL):
-        if INTpredmode:
-            predbit = (r3)[63-i] # IBM MSB0 spec sigh
-        else:
-            CRpredicate = CR{i} # start at CR0, work up
-            predbit = CRpredicate[offs]
-
-In other words unless we do something about this, when we transfer bits from an Integer Predicate into a Vector of CRs, our numbering of CRs, when enumerating them in a CR Vector, would be **CR7** CR6 CR5.... CR0 **CR15** CR14 CR13... CR8 **CR23** CR22 etc. **not** the more natural and obvious CR0 CR1 ... CR23.
-
-Therefore the instructions below need to **redefine** the relationship so that CR numbers (CR0, CR1) sequentially match the arithmetically-ordered bits of Integer registers. By `arithmetic` this is deduced from the fact that the instruction `addi r3, r0, 1` will result in the **LSB** (numbered 63 in IBM MSB0 order) of r3 being set to 1 and all other bits set to zero. We therefore refer, below, to this LSB as "Arithmetic bit 0", and it is this bit which is used - defined - as being the first bit used in Integer predication (on element 0).
-
-Below is some pseudocode that, given a CR offset `offs` to represent `CR.eq` thru to `CR.ov` respectively, will copy the INT predicate bits in the correct order into the first 8 CRs:
-
-    do n = 0 to 7
-        CR[4*n+32+offs] <- (RS)[63-n]
-
-Assuming that `offs` is set to `CR.eq` this results in:
-
-* Arithmetic bit 0 (the LSB, numbered 63 in IBM MSB0 terminology)
-  of RS being inserted into CR0.eq
-* Arithmetic bit 1 of RS being inserted into CR1.eq
-* ...
-* Arithmetic bit 7 of RS being inserted into CR7.eq
-
-To clarify, then: all instructions below do **NOT** follow the IBM convention, they follow the natural sequence CR0 CR1 instead, using `CR{fieldnum}` to refer to the individual CR Fields. However it is critically important to note that the offsets **in** a CR field
-(`CR.eq` for example) continue to follow the v3.0B definition and convention.
+    CR{7-n} = CR[32+n*4:35+n*4]

+Also note that for SVP64 the relationship for the sequential
+numbering of elements is to the CR **fields** within
+the CR Register, not to individual bits within the CR register.

 # Instruction form and pseudocode

-Note that `CR{n}` refers to `CR0` when `n=0` and consequently, for CR0-7, is defined, in v3.0B pseudocode, as:
-
-    CR{7-n} = CR[32+n*4:35+n*4]
-
 Instruction format:

 |0-5|6-10 |11|12-15|16-18|19-20|21-25 |26-30 |31|name |
-- 
2.30.2
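
The patch relies on MSB0 slices such as `CR[32+n*4:35+n*4]` and `(r3)[63-i]`. As a rough aid to reading that notation, the sketch below shows how an inclusive MSB0 bit range maps onto ordinary shifts and masks. The helper names `msb0_slice` and `cr_slice` are hypothetical and do not come from the patch or the specification. The sketch assumes a 64-bit numbering in which the 32-bit CR occupies bits 32 to 63. It only extracts the 4-bit groups and does not encode how they are labelled as `CR{n}`, which is left to the patch's own definition.

```python
def msb0_slice(value, start, end, width=64):
    """Return bits start..end (inclusive, MSB0 numbering) of a width-bit value.

    In MSB0 numbering bit 0 is the most significant bit, so bit (width-1)
    is the arithmetic LSB. Hypothetical helper, for illustration only.
    """
    if not 0 <= start <= end < width:
        raise ValueError("bit range out of bounds")
    nbits = end - start + 1
    shift = width - 1 - end  # distance of bit `end` from the arithmetic LSB
    return (value >> shift) & ((1 << nbits) - 1)


def cr_slice(cr, n):
    """Return the 4-bit group CR[32+n*4:35+n*4] of a CR held in the low 32 bits."""
    return msb0_slice(cr, 32 + n * 4, 35 + n * 4)


# Example CR contents chosen so that each 4-bit group is easy to recognise.
cr = 0x12345678
groups = [cr_slice(cr, n) for n in range(8)]
print([hex(g) for g in groups])

# The value 1 (as left in r3 by `addi r3, r0, 1`) has only MSB0 bit 63 set.
print(msb0_slice(1, 63, 63), msb0_slice(1, 0, 62))
```

With this numbering, `n=0` selects the most significant nibble of the 32-bit CR and `n=7` the least significant one.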