From 82f5fb9b94e49c2ea1ad685cc9886d07da18dbac Mon Sep 17 00:00:00 2001
From: lkcl <lkcl@web>
Date: Tue, 25 Apr 2023 15:57:29 +0100
Subject: [PATCH]

---
 openpower/sv/rfc/ls015.mdwn | 65 +++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/openpower/sv/rfc/ls015.mdwn b/openpower/sv/rfc/ls015.mdwn
index 1d9029d6b..e85b11702 100644
--- a/openpower/sv/rfc/ls015.mdwn
+++ b/openpower/sv/rfc/ls015.mdwn
@@ -80,6 +80,71 @@ Add the following entries to:
 
 \newpage{}
 
+# Rationale
+
+Condition Registers are conceptually perfect for use as predicate masks,
+the only problem being that typical Vector ISAs have quite comprehensive
+mask-based instructions: set-before-first, popcount and much more.
+In fact many Vector ISAs can use Vectors *as* masks; consequently the
+entire Vector ISA is usually available for use in creating masks (one
+exception being AVX512 which has a dedicated Mask regfile and opcodes).
+Duplication of such operations (popcount etc) is not practical for SV
+given the strategy of leveraging pre-existing Scalar instructions in a
+minimalist way.
+
+With the scalar OpenPOWER v3.0B ISA already having popcnt, cntlz and
+others normally seen in Vector Mask operations, it makes sense to allow
+*both* scalar integers *and* CR-Vectors to be predicate masks.  That in
+turn means that much more comprehensive interaction between CRs and scalar
+Integers is required, because with the CR Predication Modes designating
+CR *Fields* (not CR bits) as Predicate Elements, fast transfers between
+CR *Fields* and the Integer Register File are needed.
+
+The opportunity is therefore taken to augment CR logical arithmetic
+as well, using a mask-based paradigm that takes into consideration
+multiple bits of each CR Field (eq/lt/gt/so).  By contrast v3.0B Scalar
+CR instructions (crand, crxor) only allow a single bit calculation, and
+both mtcr and mfcr are CR-orientated rather than CR *Field* orientated.
+
+Also, strangely, there is no v3.0 instruction for directly moving CR Fields,
+only CR *bits*, so that is corrected here with `mcrfm`. The opportunity
+is taken to allow inversion of CR Field bits as they are copied.
+
+Basic concept:
+
+* CR-based instructions that perform simple AND/OR from any four bits
+  of a CR field to create a single bit value (0/1) in an integer register
+* Inverse of the same, taking a single bit value (0/1) from an integer
+  register to selectively target any four bits of a given CR Field
+* CR-to-CR version of the same, allowing multiple bits to be AND/OR/XORed
+  in one hit
+* Optional Vectorisation of the same when SVP64 is implemented
+
+Purpose:
+
+* To provide a merged version of what is currently a multi-instruction
+  sequence of CR operations (crand, cror, crxor) with mfcr and mtcrf,
+  reducing instruction count
+* To provide a vectorised version of the same, suitable for advanced
+  predication
+
+Useful side-effects:
+
+* `mtcrweird` when RA=0 is a means to set or clear
+  multiple arbitrary CR Field bits simultaneously,
+  using immediates embedded within the instruction.
+* With SVP64 on the weird instructions there is bit-for-bit interaction
+  between GPR predicate masks (r3, r10, r31) and the source
+  or destination GPR, in ways that are not possible with other
+  SVP64 instructions because normal SVP64 is bit-per-element.
+  On these weird instructions the element in effect *is* a bit.
+* `mfcrweird` mitigates the need to add `conflictd`, part of
+  [[sv/vector_ops]], as well as allowing more complex comparisons.
+
+----------
+
+\newpage{}
+
 [[!inline pages="openpower/sv/cr_int_predication" raw=yes ]]
 
 ----------
-- 
2.30.2