From 9296c3546667e31e5cf46acad848d1ddf003402b Mon Sep 17 00:00:00 2001
From: lkcl
Date: Wed, 6 Jan 2021 22:50:48 +0000
Subject: [PATCH]

---
 openpower/sv/cr_int_predication.mdwn | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/openpower/sv/cr_int_predication.mdwn b/openpower/sv/cr_int_predication.mdwn
index 62f907736..18c08e5e8 100644
--- a/openpower/sv/cr_int_predication.mdwn
+++ b/openpower/sv/cr_int_predication.mdwn
@@ -37,7 +37,7 @@ this gets particularly powerful if data-dependent predication is also enabled.
 # Bit ordering.
 
-IBM chose MSB0 for the OpenPOWER v3.0B specification. This makes things slightly hair-raising. Our model initially therefore to follow the logical progression from the defined behaviour of `mtcr` and `mfcr` etc.
+IBM chose MSB0 for the OpenPOWER v3.0B specification. This makes things slightly hair-raising. Our desire initially is therefore to follow the logical progression from the defined behaviour of `mtcr` and `mfcr` etc.
 
 In [[isa/sprset]] we see the pseudocode for `mtcrf` for example:
 
     mtcrf FXM,RS
@@ -46,9 +46,23 @@ In [[isa/sprset]] we see the pseudocode for `mtcrf` for example:
       if FXM[n] = 1 then
         CR[4*n+32:4*n+35] <- (RS)[4*n+32:4*n+35]
 
-This places (according to a mask schedule) `CR0` into MSB0-numbered bits 32-35 of the target Integer register `RS`, these bits of `RS` being the 31st down to the 28th. Unfortunately, even when not Vectorised, this inserts CR numbering inversions on each batch of 8 CRs, massively complicating matters.
+This places (according to a mask schedule) `CR0` into MSB0-numbered bits 32-35 of the target Integer register `RS`, these bits of `RS` being the 31st down to the 28th. Unfortunately, even when not Vectorised, this inserts CR numbering inversions on each batch of 8 CRs, massively complicating matters.
+Predication when using CRs would have to be morphed to this (unacceptably complex) behaviour:
 
-In other words unless we do something about this, when we transger bits from an Integer Predicate into a Vector of CRs, our numbering of CRs, when enumerating them in a CR Vector, would be CR7 CR6 CR5.... CR0 **CR15** CR14 CR13... CR8 **CR23** CR22 etc. **not** CR0 CR1 ... CR23.
+    for i in range(VL):
+        n = (7-(i%8)) | (i & ~0x7) # total mess
+        CRpredicate = CR{n} # select CR0, CR1, ....
+        predbit = CRpredicate[offs] # select eq..ov bit
+
+Which is nowhere close to matching the straightforward obvious case:
+
+    for i in range(VL):
+        if INTpredmode:
+            predbit = (r3)[63-i] # IBM MSB0 spec sigh
+        else:
+            CRpredicate = CR{i} # start at CR0, work up
+            predbit = CRpredicate[offs]
+
+In other words unless we do something about this, when we transfer bits from an Integer Predicate into a Vector of CRs, our numbering of CRs, when enumerating them in a CR Vector, would be **CR7** CR6 CR5.... CR0 **CR15** CR14 CR13... CR8 **CR23** CR22 etc. **not** the more natural and obvious CR0 CR1 ... CR23.
 
 Therefore the instructions below need to **redefine** the relationship so that CR numbers (CR0, CR1) sequentially match the arithmetically-ordered bits of Integer registers. By `arithmetic` this is deduced from the fact that the instruction `addi r3, r0, 1` will result in the **LSB** (numbered 63 in IBM MSB0 order) of r3 being set to 1 and all other bits set to zero. We therefore refer, below, to this LSB as "Arithmetic bit 0", and it is this bit which is used - defined - as being the first bit used in predication (on element 0).
 
-- 
2.30.2
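
The numbering inversion described in the patch can be checked with a small standalone sketch. It is illustrative only and not part of the commit: the helper names (`inverted_cr_index`, `msb0_bit`, `arithmetic_bit`) are hypothetical, and a 64-bit integer register is assumed. It evaluates the `n = (7-(i%8)) | (i & ~0x7)` expression from the patch and shows that "Arithmetic bit i" is the same bit as MSB0 bit `63-i`.

```python
def inverted_cr_index(i: int) -> int:
    """CR number selected for element i under the mtcrf-derived ordering.

    Same expression as in the patch: reverse within each batch of 8,
    keep the batch base (i with its low 3 bits cleared).
    """
    return (7 - (i % 8)) | (i & ~0x7)


def msb0_bit(reg: int, n: int, width: int = 64) -> int:
    """Bit n of reg in IBM MSB0 numbering (bit 0 is the most significant)."""
    return (reg >> (width - 1 - n)) & 1


def arithmetic_bit(reg: int, i: int) -> int:
    """Bit i of reg in arithmetic (LSB0) numbering: bit 0 is the LSB."""
    return (reg >> i) & 1


if __name__ == "__main__":
    # Element order produced by the inverted scheme for VL=24.
    print([inverted_cr_index(i) for i in range(24)])

    # Result of `addi r3, r0, 1`: only the LSB is set.
    r3 = 1
    print(msb0_bit(r3, 63), arithmetic_bit(r3, 0))
```

With the redefinition proposed in the patch, element `i` simply uses `CR{i}` in CR mode and `arithmetic_bit(r3, i)` in integer mode, so neither path needs the inversion expression.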